Forum Moderators: goodroi


Allow rather than Disallow

Is it possible?


Marshall

8:34 pm on Jun 7, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Since there are sooooo many "bad" robots, listing them all can make a robots.txt file really long. Is it acceptable to blanket-disallow all robots and then selectively allow specific ones? Example:

User-agent: *
Disallow: /

User-agent: GoodRobot
Disallow:

User-agent: GoodRobot2
Disallow:

And so on....

I admit, I did not research this - I'm just curious. If any of you robots.txt experts know, I'd appreciate an answer.

Thanks in advance,
Marshall

ogletree

9:10 pm on Jun 7, 2007 (gmt 0)




There is an Allow directive you can put in:

User-Agent: Googlebot
Disallow: /folder1/
Allow: /folder1/myfile.html

Not all bots recognize this, since it is not in the original robots.txt standard. Google added support for it [google.com] in their bots.
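For anyone who wants to see how a particular parser treats Allow, here is a rough sketch using Python's standard-library `urllib.robotparser`. It is only one parser among many and says nothing definitive about how Googlebot or any other crawler behaves. The helper name `can_fetch` and the path `/folder1/other.html` are made up for illustration. Note that the Allow line is listed before the Disallow line here: parsers that simply take the first matching rule in file order and parsers that take the most specific rule should then agree on the result.

```python
from urllib.robotparser import RobotFileParser

# Same rules as in the post above, but with Allow listed first
# (an assumption made here so that rule order cannot change the outcome).
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /folder1/myfile.html
Disallow: /folder1/
"""


def can_fetch(agent, path, robots_txt=ROBOTS_TXT):
    """Hypothetical helper: parse the text and ask whether agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/folder1/myfile.html", "/folder1/other.html", "/index.html"):
        print(path, can_fetch("Googlebot", path))
```

A bot that does not understand Allow at all would presumably ignore that line and treat everything under /folder1/ as off limits, which is the caveat above.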

Marshall

10:15 pm on Jun 7, 2007 (gmt 0)




ogletree,

I know about Google's non-standard Allow. What I am curious about is whether the approach in my post would work: first disallow all robots, then allow just a few specific ones. On a related note, would the order matter: as I showed it, or reversed, with the selected robots allowed first and all others disallowed second? Sorry if I seem a pain.

Marshall

jimbeetle

10:23 pm on Jun 7, 2007 (gmt 0)




Robots should obey the first directive applicable to them, then move on. So, if you reverse your entries (and consolidate them a bit)...

User-agent: GoodRobot
User-agent: GoodRobot2
Disallow:

User-agent: *
Disallow: /

...you should be okay.
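If you want to sanity-check both orderings yourself, a minimal sketch with Python's standard-library `urllib.robotparser` might look like the following. It only shows how that one parser reads the two files; real crawlers may differ. The helper `allowed`, the test path `/page.html`, and the agent name `SomeOtherBot` are assumptions for illustration. For what it's worth, the original 1994 robots.txt standard describes the `*` record as the default for any robot that has not matched another record, so a parser following it should give the same answer for either order.

```python
from urllib.robotparser import RobotFileParser

# Ordering from the original question: catch-all record first.
CATCH_ALL_FIRST = """\
User-agent: *
Disallow: /

User-agent: GoodRobot
Disallow:

User-agent: GoodRobot2
Disallow:
"""

# Reversed and grouped ordering: named robots first.
NAMED_FIRST = """\
User-agent: GoodRobot
User-agent: GoodRobot2
Disallow:

User-agent: *
Disallow: /
"""


def allowed(robots_txt, agent, path="/page.html"):
    """Hypothetical helper: True if this parser lets agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for label, text in (("catch-all first", CATCH_ALL_FIRST),
                        ("named first", NAMED_FIRST)):
        for agent in ("GoodRobot", "GoodRobot2", "SomeOtherBot"):
            print(label, agent, allowed(text, agent))
```

Keeping the named records first is still the safer choice, since a sloppy parser that stops at the first record it thinks applies would hit `*` before it ever reached the good robots.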

Marshall

10:30 pm on Jun 7, 2007 (gmt 0)




Is the 'should' in italics a "legal disclaimer?" ;)