Yahoo! Slurp Now Supports Wildcards in robots.txt


Reid - 2:54 am on Nov 20, 2006 (gmt 0)


It appears to me that Yahoo and Google use the wildcard in exactly the same way. I'm not sure about MSN, but they do seem to be following the robots.txt 'standard'. So there are no differences between googlebot and slurp when using the * or the $, but using wildcards in directives under User-Agent: * would confuse all the other bots that don't understand * within a URL. In effect, User-Agent: * becomes User-Agent: (all who understand wildcards within URLs).
So I wouldn't use wildcard URLs under User-Agent: *
So now, if we want wildcards in URLs, we have to write a separate record for each bot that understands them:

User-Agent: googlebot
Disallow: /*thing

User-Agent: slurp
Disallow: /*thing

User-Agent: *
Disallow: /something
Disallow: /anything
Disallow: /anotherthing

Kind of redundant. I agree with Jim: they could do a lot if they got together and made a more complex standard that everyone could agree on.
Also, Allow seems pointless to me since the default is already to allow. It's also part of the 'standard', but pretty useless in my opinion.
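
The reason the wildcard rules need their own records can be sketched in a few lines of Python. This is only an illustration: the function names wildcard_match and literal_prefix_match are made up here, and the matching rules are an assumption based on how * and $ are described above, not code from googlebot, slurp, or any other crawler.

```python
import re


def wildcard_match(pattern: str, path: str) -> bool:
    """Assumed behaviour of a wildcard-aware bot: '*' matches any run of
    characters, and a trailing '$' anchors the rule to the end of the URL."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start, so an unanchored rule acts as a prefix.
    return re.match(regex, path) is not None


def literal_prefix_match(pattern: str, path: str) -> bool:
    """Assumed behaviour of a bot without wildcard support: the rule is a
    plain string compared against the start of the URL path."""
    return path.startswith(pattern)


if __name__ == "__main__":
    for path in ("/something", "/anything", "/anotherthing", "/other"):
        print(path,
              wildcard_match("/*thing", path),
              literal_prefix_match("/*thing", path))
```

Under these assumptions, Disallow: /*thing blocks /something, /anything and /anotherthing for a wildcard-aware bot, while a literal-minded bot looks for a path that really starts with "/*thing" and blocks none of them. That is why the User-Agent: * record has to spell out each path.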

[edited by: Reid at 3:18 am (utc) on Nov. 20, 2006]


Thread source: http://www.webmasterworld.com/robots_txt/3144662.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com