
Sitemaps, Meta Data, and robots.txt Forum

Using Allow: / in robots.txt
Google says to use it?

 5:48 pm on May 19, 2003 (gmt 0)

I have never heard of Allow: / being used in robots.txt, but the Google help pages say to use it.


11. How do I block all crawlers except Googlebot from my site?

The following robots.txt file will achieve this for all well-behaved crawlers.

User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /


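As a sketch of how a parser that honors the Allow extension handles that exact file (here using Python's urllib.robotparser, which does implement Allow — the agent names and URL are just examples):

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Googlebot matches its own record, whose Allow: / permits everything.
print(rp.can_fetch("Googlebot", "http://example.com/page.html"))      # True
# Every other agent falls through to the "*" record and is blocked.
print(rp.can_fetch("SomeOtherBot", "http://example.com/page.html"))   # False
```

A crawler that does not understand Allow would see only an empty rule set in the Googlebot record, which is why the file still fails strict validators.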

 5:56 pm on May 19, 2003 (gmt 0)

I'd be somewhat hesitant to use the Allow directive, as I don't think it is supported across the board. Also, the robots.txt validators I've checked return this error when they encounter an Allow directive...

Invalid fieldname. There is no Allow.

There has been discussion of this for quite some time among those who maintain the robots.txt protocol. As far as I know, it has not been added to the standard yet.

The default behavior of robots.txt is to allow everything, unless of course you have a Disallow for that resource.

User-agent: Googlebot
Disallow: /

P.S. Hehehe, I've only had to use that directive once! ;)
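That default-allow behavior is easy to confirm: with only the Googlebot record above, any agent that matches no record is permitted everything (a sketch using Python's urllib.robotparser; the agent names are just examples):

```python
from urllib.robotparser import RobotFileParser

# Only Googlebot is named; there is no catch-all "*" record.
rp = RobotFileParser()
rp.parse([
    "User-agent: Googlebot",
    "Disallow: /",
])

print(rp.can_fetch("Googlebot", "http://example.com/"))     # False: explicitly disallowed
print(rp.can_fetch("SomeOtherBot", "http://example.com/"))  # True: no matching record, default allow
```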


 6:03 pm on May 19, 2003 (gmt 0)

Yeah, I thought it was kind of strange that they had it on there. Everything I have seen about robots.txt says the directive does not exist. Perhaps Google has implemented it on their own?

I have no reason to use Allow myself; I just thought I would point it out.


 6:09 pm on May 19, 2003 (gmt 0)

It's a good catch! Google always seems to be ahead of the pack in implementing search-engine-specific directives. In this case, though, I would not place an Allow directive in my robots.txt file, since the default behavior is already to allow.

Now, when I can get a robots.txt file using the Allow directive to pass validation, I'll consider reformatting mine. Until the authoritative resource on the robots.txt protocol states that Allow is supported, I think it is best to follow the current standard.


 3:28 am on May 20, 2003 (gmt 0)

Google supports several extensions to the Standard for Robots Exclusion. Some of them can be life-savers, making a daunting job trivial in some cases. For example, their support for wildcard filename matching, in addition to the standard's simple prefix matching, can come in very handy under certain circumstances.
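To illustrate the difference, here is a rough sketch of how Googlebot-style wildcard patterns (where * matches any run of characters and a trailing $ anchors the match at the end of the path) compare to the standard's plain prefix matching. The translation function is my own illustration, not Google's code:

```python
import re

def wildcard_match(pattern: str, path: str) -> bool:
    """Interpret a Google-style pattern: '*' matches any characters,
    and a trailing '$' anchors the match at the end of the path."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.match("^" + regex + ("$" if anchored else ""), path) is not None

# Under the standard, '*' is a literal character, so this prefix test fails.
print("/docs/manual.pdf".startswith("/*.pdf"))        # False
# Google's extension matches any .pdf path anywhere under the site.
print(wildcard_match("/*.pdf$", "/docs/manual.pdf"))  # True
print(wildcard_match("/*.pdf$", "/docs/manual.html")) # False
```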

However, I would never use any of these extensions except in an exclusive User-agent: Googlebot record.

There is simply no telling what any other robot might do with those Google-specific extensions!


© Webmaster World 1996-2014 all rights reserved