Yahoo! Slurp Now Supports Wildcards in robots.txt


lexipixel - 7:12 pm on Nov 7, 2006 (gmt 0)


My question was aimed more about what the differences were between the three main robot rules.

-bouncybunny

From your first post, it appears you want to just allow all 'bots to crawl and index everything on your site:

User-agent: *

Specifying the User-agent: line is only half of it. You also need a Disallow (or Allow) line to say which directories, some or all, the 'bots may or may not go into.


User-agent: *
Disallow:

...as I said before, "Disallow:" with nothing specified to disallow is in effect a double negative: "to not disallow" is the same as "to allow".
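One way to check this for yourself, as a rough sketch only: Python's standard-library urllib.robotparser can be fed both variants of the rule. The 'bot name "ExampleBot" and the URL are placeholders, and a real crawler may read the rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Empty Disallow: nothing is disallowed, so everything is allowed.
EMPTY_DISALLOW = "User-agent: *\nDisallow:\n"
# Disallow with a slash: the whole site is off limits.
DISALLOW_ALL = "User-agent: *\nDisallow: /\n"

URL = "http://www.example.com/some/page.html"  # placeholder URL

print(allowed(EMPTY_DISALLOW, "ExampleBot", URL))  # True
print(allowed(DISALLOW_ALL, "ExampleBot", URL))    # False
```

The only difference between the two files is the single "/" after Disallow:, which is why the empty form is easy to misread.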

Further, a robots.txt file containing only:


User-agent: *
Disallow:

is pretty much the same as having no robots.txt at all (except for all the 404 errors that 'bots will generate by requesting the file if you don't have one).
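To illustrate the "same as having no file" point, here is a small sketch that compares the allow-all file against an empty rule set using Python's urllib.robotparser. The empty string only stands in for a site with no rules, and the 'bot name and paths are made up for the example.

```python
from urllib.robotparser import RobotFileParser


def build(robots_txt: str) -> RobotFileParser:
    """Return a parser loaded with the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


allow_all = build("User-agent: *\nDisallow:\n")
no_rules = build("")  # assumption: stands in for a site with no rules at all

# Hypothetical paths, chosen only for illustration.
PATHS = ["/", "/index.html", "/cgi-bin/search", "/private/notes.txt"]
BASE = "http://www.example.com"

# Both rule sets should give the same answer for every path.
same = all(
    allow_all.can_fetch("ExampleBot", BASE + path)
    == no_rules.can_fetch("ExampleBot", BASE + path)
    for path in PATHS
)
print(same)  # True
```

What the sketch cannot show is the server-side difference: without the file, each request for it still ends up as a 404 in your logs.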

robots.txt is the control file for the Robots Exclusion Standard. The rules were written to keep certain 'bots out of certain areas of a site (to "exclude" them). The default operation of most 'bots is INDEX, FOLLOW: they crawl everything unless told otherwise.
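As a sketch of the "exclude" idea, the rules below keep all 'bots out of two directories and leave the rest of the site open by default. The directory names, 'bot name and URLs are hypothetical, and the check again uses Python's urllib.robotparser rather than any particular search engine's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: shut two directories, say nothing about the rest.
EXCLUDE_SOME = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(EXCLUDE_SOME.splitlines())

# A path under an excluded directory is refused.
blocked = parser.can_fetch("ExampleBot", "http://www.example.com/private/notes.html")
# Anything not mentioned stays crawlable by default.
open_page = parser.can_fetch("ExampleBot", "http://www.example.com/index.html")

print(blocked)    # False
print(open_page)  # True
```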


Thread source: http://www.webmasterworld.com/robots_txt/3144662.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com