
Forum Moderators: Robert Charlton & aakk9999 & andy langton & goodroi


No Allow in Robots.txt

     
7:28 am on Jun 18, 2012 (gmt 0)

Full Member

5+ Year Member

joined:Sept 9, 2010
posts: 231
votes: 0


Thanks to Mark Jackson [searchenginewatch.com], who cleared up a big question about robots.txt.

There's no "/allow" command in the robots.txt file, so there's no need to add it to the robots.txt file.


Most of the SEO experts make this mistake of adding Allow in robots.txt
2:43 pm on June 18, 2012 (gmt 0)

Administrator from US 

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 21, 1999
posts:38066
votes: 15


Google supports the nonstandard 'Allow'.

However, using it is risky, as some bots based on older robots.txt libraries will interpret it as a "Disallow".
[tools.seobook.com...]
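
To see why a nonstandard directive is risky, here is a minimal Python sketch of three ways a parser might evaluate the same record. The function names and the sample rules are made up for illustration. `longest_match` follows the precedence Google documents for its own crawler (the most specific rule wins), `first_match` follows file order, and `allow_as_disallow` models the misreading described above. None of them is taken from a real bot's code.

```python
# Illustrative sketch only: these evaluators are hypothetical and do not
# reproduce any real crawler's robots.txt library.

# Rules from one robots.txt record as (directive, path prefix) pairs, in file order.
RULES = [
    ("disallow", "/folder/"),
    ("allow", "/folder/page.html"),
]


def longest_match(rules, path):
    """Most specific (longest) matching prefix wins; Allow wins a tie."""
    matches = [
        (len(prefix), directive == "allow")
        for directive, prefix in rules
        if path.startswith(prefix)
    ]
    # No matching rule means the path may be crawled.
    return max(matches)[1] if matches else True


def first_match(rules, path):
    """The first matching rule in file order wins."""
    for directive, prefix in rules:
        if path.startswith(prefix):
            return directive == "allow"
    return True


def allow_as_disallow(rules, path):
    """Models the misreading described above: every rule acts as a Disallow."""
    return not any(path.startswith(prefix) for _, prefix in rules)


for evaluate in (longest_match, first_match, allow_as_disallow):
    print(evaluate.__name__, evaluate(RULES, "/folder/page.html"))
```

Only the longest-match reading lets /folder/page.html through; the other two block it, so the same file gives different results depending on which library a bot was built on.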
8:59 pm on June 18, 2012 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator robert_charlton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2000
posts:11659
votes: 254


Most of the SEO experts make this mistake of adding Allow in robots.txt

I don't know about most experts elsewhere, but site search on WebmasterWorld suggests that "expert" opinion here cautions against using "allow" unless you're very careful about it.

Here are two threads, one old, one recent, with some comments on the topic that are worth reading....

Using Allow: / in robots.txt
Google says to use it?
May, 2003
http://www.webmasterworld.com/forum93/15.htm [webmasterworld.com]

Google supports several kinds of extensions to the Standard for Robots Exclusion. Some of them may be life-savers under certain circumstances - making a daunting job trivial in some cases. For example, their support of wildcard filename-matching, in addition to simple (standard) prefix-matching might come in very handy under certain circumstances.

However, I would never use any of these extensions except in an exclusive User-agent: Googlebot record.

There is simply no telling what any other robot might do with those Google-specific extensions!


Lost all rankings from Google - due to robots.txt
June, 2012
http://www.webmasterworld.com/google/4463765.htm [webmasterworld.com]

As this is the "Robots Exclusion Protocol" everything hinges on this being a disallow list.

...Even though both Bing and Google say they now support a few extensions to the standard syntax, the actual current standard is explained here: [robotstxt.org...]

...and here is Google's Help page: [support.google.com...] If you start blocking some URLs or URL patterns, the details Google provides can become important for getting the exact results that you intended.
4:41 am on June 19, 2012 (gmt 0)

Full Member

5+ Year Member

joined:Sept 9, 2010
posts: 231
votes: 0


So the post published at SEW gives wrong information. After reading this Google support page, I find we can use Allow in robots.txt:
[support.google.com]
7:16 am on June 19, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10596
votes: 22


the more egregious error in that article is discussing robots.txt as a method of controlling indexing for a site when it is actually used to control crawling, not indexing.

and the fact that it is actually a robots exclusion protocol makes google's "Allow:" extension fundamentally unsound.
11:31 pm on June 20, 2012 (gmt 0)

Administrator from US 

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 25, 2005
posts:14650
votes: 94


and the fact that it is actually a robots exclusion protocol makes google's "Allow:" extension fundamentally unsound.


Allow just makes it easier to punch small holes in the firewall, albeit non-standard ones. Besides, Google added a lot of extra crap to robots.txt that many don't support. It needs to be standardized and, to this day, as best I know, it isn't, so the whole thing is moot really.

Without a script to back up enforcing robots.txt it's quite useless really.
9:56 am on June 26, 2012 (gmt 0)

Full Member

5+ Year Member

joined:Sept 9, 2010
posts: 231
votes: 0


Google itself uses Allow in its own robots.txt [google.com]
12:58 pm on June 26, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Use that only with Googlebot user agent.

When you have a section for Googlebot, Google uses only that section of the robots.txt file.
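
A rough way to check this layout is with Python's standard `urllib.robotparser`, used here only as a stand-in for a crawler; it is not Googlebot's parser, and the paths and the "OtherBot" name are invented for the example. The sketch keeps Allow inside the Googlebot record and shows that rules in the `*` record are not inherited by a bot that has its own record.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Allow appears only in the Googlebot record.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /folder/page.html
Disallow: /folder/

User-agent: *
Disallow: /folder/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot reads only its own record, so the Allow line applies to it...
googlebot_page = parser.can_fetch("Googlebot", "/folder/page.html")
# ...and the /private/ rule from the * record does not.
googlebot_private = parser.can_fetch("Googlebot", "/private/report.html")

# Any other bot falls back to the * record, which contains no Allow line.
other_page = parser.can_fetch("OtherBot", "/folder/page.html")
other_private = parser.can_fetch("OtherBot", "/private/report.html")

print(googlebot_page, googlebot_private, other_page, other_private)
```

The practical consequence is that anything you want kept away from Googlebot has to be repeated inside the Googlebot record; here /private/ is open to Googlebot because only the `*` record disallows it.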