Is this valid robots.txt?


pwc21

3:14 pm on Jul 1, 2002 (gmt 0)



Hi,

My company operates a small robot, and recently we received a complaint from a site owner stating that we are ignoring his robots.txt file.

I can state very clearly that we do not ignore robots.txt; I'm the one who wrote the Java code that processes it. However, we are not doing what this person expects, because we believe his robots.txt file is invalid.

The question is: is this a valid robots.txt file to exclude everyone from everything?

User-agent: *
Disallow: *

In strict terms I say no because it should be:

User-agent: *
Disallow: /

Some robots may obey the "Disallow: *" form anyway, and I am considering changing ours to obey it as well.
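To make the strict reading concrete, here is a minimal Java sketch. It assumes a Disallow value is treated as a plain path prefix, which is how the original exclusion standard describes it. The class and method names are hypothetical and this is not the robot's actual code.

```java
// Hypothetical illustration of strict prefix matching; not production code.
class StrictRobotsMatcher {

    /**
     * Returns true if urlPath is blocked by a single Disallow value,
     * treating the value as a literal path prefix (no wildcards).
     */
    static boolean isDisallowed(String disallowValue, String urlPath) {
        String rule = disallowValue.trim();
        if (rule.isEmpty()) {
            // "Disallow:" with no value means nothing is disallowed.
            return false;
        }
        // Every URL path starts with "/", so "Disallow: /" blocks everything.
        // No URL path starts with "*", so "Disallow: *" blocks nothing.
        return urlPath.startsWith(rule);
    }

    public static void main(String[] args) {
        System.out.println(isDisallowed("/", "/index.html")); // true
        System.out.println(isDisallowed("*", "/index.html")); // false
    }
}
```

Under this reading the site owner's file parses without error but excludes nothing, which would explain the complaint.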

Any opinions?

Thanks,
Paul

korkus2000

3:17 pm on Jul 1, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Welcome to WebmasterWorld pwc21,

The second is correct. There are a few spiders, like slysearch, that only recognize "Disallow: *", even though it should be "Disallow: /".

Here is a good resource for robots.txt:
[searchengineworld.com...]

I would program the bot for both.
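One way to handle both forms, sketched in Java under the assumption that the bot matches Disallow values as plain path prefixes: normalize a bare "*" to "/" before matching. The class and method names here are made up for illustration.

```java
// Hypothetical sketch of a lenient matcher; names are illustrative only.
class LenientRobotsMatcher {

    /** Maps the nonstandard "Disallow: *" onto the standard "Disallow: /". */
    static String normalizeRule(String disallowValue) {
        String rule = disallowValue.trim();
        return rule.equals("*") ? "/" : rule;
    }

    /** Prefix match against the normalized rule; an empty rule blocks nothing. */
    static boolean isDisallowed(String disallowValue, String urlPath) {
        String rule = normalizeRule(disallowValue);
        if (rule.isEmpty()) {
            return false;
        }
        return urlPath.startsWith(rule);
    }

    public static void main(String[] args) {
        System.out.println(isDisallowed("*", "/index.html")); // true
        System.out.println(isDisallowed("/", "/index.html")); // true
        System.out.println(isDisallowed("/cgi-bin/", "/index.html")); // false
    }
}
```

Only a value that is exactly "*" is rewritten, so ordinary path rules behave the same as before.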

Mardi_Gras

3:36 pm on Jul 1, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Paul - as Korkus said, #2 is correct. From robotstxt.org:

To exclude all robots from the entire server
User-agent: *
Disallow: /

In addition to the link to Brett's robots.txt checker, you might take a look at Web Server Administrator's Guide to the Robots Exclusion Protocol at www.robotstxt.org/wc/exclusion-admin.html