| 5:56 pm on May 19, 2003 (gmt 0)|
I'd be somewhat hesitant to use the Allow directive, as I don't think it is supported across the board. Also, the robots.txt validators I've checked return this error when the Allow directive is included...
Invalid fieldname. There is no Allow.
There has been discussion of this for quite some time amongst those who govern the robots.txt protocol. As far as I know, it has not been implemented yet.
The default behavior of robots.txt is to allow everything, unless of course you have a Disallow rule for that resource.
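For example, a minimal robots.txt relying on that default-allow behavior might look like this (the path is just a made-up illustration):

```
# Everything not explicitly disallowed is allowed by default.
User-agent: *
Disallow: /private/
# All other paths remain crawlable -- no Allow line is needed.
```

Since anything not matched by a Disallow prefix is crawlable, an explicit Allow adds nothing here.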
P.S. Hehehe, I've only had to use that directive once! ;)
| 6:03 pm on May 19, 2003 (gmt 0)|
Yeah, I thought it was kind of strange that they had it on there. Everything I have seen about robots.txt says the directive does not exist. Perhaps Google has implemented it?
I have no reason to use Allow; I just thought I would point it out.
| 6:09 pm on May 19, 2003 (gmt 0)|
It's a good catch! Google always seems to be ahead of the pack in implementing search-engine-specific directives. In this case, though, I would not place an Allow directive in my robots.txt file, since the default behavior is already to allow.
Now, when I can get a robots.txt file using the Allow directive to validate, I'll consider reformatting. Until the authoritative resource on the robots.txt protocol states that Allow is supported, I think it is best to follow the current standard.
| 3:28 am on May 20, 2003 (gmt 0)|
Google supports several kinds of extensions to the Standard for Robots Exclusion. Some of them can be life-savers, turning a daunting job into a trivial one. For example, their support of wildcard filename-matching, in addition to the standard simple prefix-matching, can come in very handy.
However, I would never use any of these extensions except in an exclusive User-agent: Googlebot record.
There is simply no telling what any other robot might do with those Google-specific extensions!
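A sketch of what that separation might look like (the wildcard pattern is a hypothetical illustration of Google's extension, not part of the standard):

```
# Standard record for all other robots: plain prefix-matching only.
User-agent: *
Disallow: /cgi-bin/

# Google-specific record using Google's wildcard extension.
# Robots that follow the standard pick the record matching their own
# user-agent, so they never see this non-standard pattern.
User-agent: Googlebot
Disallow: /cgi-bin/
Disallow: /*sessionid
```

Because each robot reads only the record addressed to it, the Google-only extensions stay invisible to crawlers that might misinterpret them.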