BENZILLA ATTACKS (vanbeast) wrote in lj_userdoc,

FAQ #50 and Robot/Spider protection

While investigating a support request [188014], I found that [FAQ #50] is somewhat misleading.

In my opinion, the line "If you check this option, your User Info and journal pages will be updated to tell search engine robots to not index your pages." should be more complete. That is, it should say what specific steps are taken. For example, a before-and-after:

Original
If you check this option, your User Info and journal pages will be updated to tell search engine robots to not index your pages. If you have a paid account, a robots.txt file will also be generated to block your paid URL (such as http://exampleusername.livejournal.com). Not all robots respect the rules, but most of the popular search sites' robots do.

New and Improved!
If you check this option, your User Info and journal pages will be updated to include META tags which tell search engine robots to not index your pages. Additionally, if you have a paid account, a robots.txt file will be added to your paid URL (such as http://exampleusername.livejournal.com). Not all robots respect the rules, but most of the popular search sites' robots do.

A relatively small change, but an important one. The http://www.livejournal.com/robots.txt file does not protect individual journals, yet the text of the FAQ implies either that it does or that every journal gets its own robots.txt file. The only protection for non-paid journals is the META tags embedded in their page headers.
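
To make the two mechanisms concrete, here is a minimal sketch in Python of how one might check a page for each kind of protection. It rests on assumptions: the FAQ only says "META tags", so the tag checked for here is the conventional <meta name="robots" content="noindex, ..."> form, not necessarily what LiveJournal emits, and the function names, sample page and sample robots.txt are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def meta_blocks_indexing(html):
    """True if the page carries a robots META tag with 'noindex'.

    Assumption: the conventional robots META tag; the FAQ does not
    quote the exact tag LiveJournal writes.
    """
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


def robots_txt_blocks(robots_txt, url, agent="*"):
    """True if the given robots.txt text disallows `url` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Hypothetical samples, not copied from any real journal.
SAMPLE_PAGE = (
    "<html><head>"
    '<meta name="robots" content="noindex, nofollow">'
    "</head><body>journal</body></html>"
)
SAMPLE_ROBOTS_TXT = "User-agent: *\nDisallow: /\n"
```

With these samples, meta_blocks_indexing(SAMPLE_PAGE) reports the page as protected, and robots_txt_blocks(SAMPLE_ROBOTS_TXT, "http://exampleusername.livejournal.com/") reports the paid URL as blocked. A non-paid journal would pass only the first check, which is the distinction the revised FAQ text is meant to draw.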