Robots and Redirects

Does disallow block a 301 redirection?

     
10:58 pm on Mar 25, 2015 (gmt 0)


Let's say I have a site where I block a certain directory.

Such as:

User-agent: *
Disallow: /special/

What happens if I have a 301 redirect in place for an old file that points to a new file within that very same blocked directory?

Such as:

redirect permanent oldsite.zom/special/blue-stuff.html [newsite.zom...]

Will Google follow that 301 and start to spider and index that page alone? Or does it now have license to spider and index content within that directory?

Or will it simply be blocked, leaving the old SERP listing in Google?

Indications are that it will be blocked. Is that correct?

If that is true, are there any options? For instance, can I use the Allow directive for Googlebot to allow that single page within a directory that is blocked?

As always - thanks in advance!

12:12 am on Mar 26, 2015 (gmt 0)

phranque (Administrator)


Googlebot and other well-behaved spiders respect the Robots Exclusion Protocol and will not request any URLs that match Disallowed patterns.
Therefore Googlebot will never request a blocked URL, and so will never see the 301 response it would return.
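
To see the rule from the question in action, here is a minimal sketch using Python's standard urllib.robotparser as a stand-in for whatever parser a given crawler really uses. The index.html URL is made up for contrast, and the rules are parsed from a string, so nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from the question, parsed from a string.
rules = RobotFileParser()
rules.parse("""
User-agent: *
Disallow: /special/
""".splitlines())

# A compliant robot asks this question before it makes any request.
blocked = rules.can_fetch("Googlebot", "http://oldsite.zom/special/blue-stuff.html")
# Hypothetical URL outside the blocked directory, for comparison.
outside = rules.can_fetch("Googlebot", "http://oldsite.zom/index.html")

print(blocked, outside)  # False True
```

Because the check happens before the request, the server never gets the chance to answer with a redirect for the blocked URL.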

"Can I use the Allow directive for Googlebot to allow that single page within a directory that is blocked?"

Google, Bing and most other bots support the Allow directive.

Robots.txt Specifications - Webmasters Google Developers:
http://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
How to Create a Robots.txt File - Bing Webmaster Tools:
http://www.bing.com/webmaster/help/how-to-create-a-robots-txt-file-cb7c31ec
Using robots.txt Yandex.Help. Webmaster:
http://help.yandex.com/webmaster/controlling-robot/robots-txt.xml
blekko:
http://blekko.com/about/blekkobot
Baiduspider:
http://help.baidu.com/question?prod_en=master&class=498&id=1000550
Majestic-12 : DSearch : MJ12bot:
http://www.majestic12.co.uk/projects/dsearch/mj12bot.php

However, not all do.
For example, DuckDuckGo uses WWW::RobotRules (which adheres to The Robots Exclusion Protocol [robotstxt.org] and doesn't support the Allow directive):
http://metacpan.org/pod/WWW::RobotRules
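
For a parser that does support Allow, a single-page exception might look like the sketch below. The page path /special/blue-stuff.html is only an assumption for illustration (the real target in the question is truncated), and Python's urllib.robotparser again stands in for a real crawler. Two details are worth hedging: this Python parser applies the first rule that matches, so the Allow line is listed before the Disallow line, whereas Google documents that the most specific (longest) matching rule wins. Also, a bot that finds a group naming it uses that group instead of the * group, so the Disallow is repeated for Googlebot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: open one page to Googlebot, keep the rest blocked.
rules = RobotFileParser()
rules.parse("""
User-agent: Googlebot
Allow: /special/blue-stuff.html
Disallow: /special/

User-agent: *
Disallow: /special/
""".splitlines())

page = "/special/blue-stuff.html"   # assumed path of the single allowed page
other = "/special/other.html"       # hypothetical sibling page

googlebot_page = rules.can_fetch("Googlebot", page)
googlebot_other = rules.can_fetch("Googlebot", other)
otherbot_page = rules.can_fetch("SomeOtherBot", page)

print(googlebot_page, googlebot_other, otherbot_page)  # True False False
```

A parser without Allow support would skip the Allow line and treat the whole directory as blocked, which is the safe direction for it to fail in.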

4:03 am on Mar 26, 2015 (gmt 0)

lucy24 (Senior Member from US)


When you, as a human, meet a redirect, your browser sends you along to the new URL without asking if that's what you want. That's the browser doing its job. But a robot, including a search-engine spider, has a different job. It requests a URL and then makes note of the response. If the response is "I've moved, so go over here" (as in a 301 redirect), the robot then has to make a separate decision about whether to request this new URL. One thing that factors into the decision is whether the robot is, in fact, allowed to request the second URL.

Illustration: A while back, I moved my personal site to a new domain name. Since this meant that a certain roboted-out directory no longer existed at the old site, I removed the robots.txt block. As soon as search engines made this discovery, they went wild with excitement and requested all the pages they knew about in this directory. But all they got was a 301 telling them to go over to the new site ... where the equivalent URL was duly roboted-out. Net result: A lot of redirects, but no follow-up requests.
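
The two-step decision described above can be sketched roughly as follows. This is not how any particular search engine is implemented; the function names, the host names old.example and new.example, and the "done" / "skip" / "fetch" labels are all made up for illustration, and the robots.txt rules are supplied as strings rather than fetched.

```python
from urllib.parse import urljoin, urlsplit
from urllib.robotparser import RobotFileParser

REDIRECT_CODES = {301, 302, 303, 307, 308}


def rules_from_text(text):
    """Build a parser from robots.txt text (nothing is fetched)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def after_response(agent, url, status, location, rules_by_host):
    """Return (decision, url) for one response a robot has received."""
    if status not in REDIRECT_CODES or not location:
        return ("done", url)
    target = urljoin(url, location)  # Location may be relative
    rules = rules_by_host.get(urlsplit(target).hostname)
    # A real crawler would fetch robots.txt for an unknown host first;
    # this sketch simply treats a host with no entry as unrestricted.
    if rules is not None and not rules.can_fetch(agent, target):
        return ("skip", target)   # redirect noted, no follow-up request
    return ("fetch", target)      # target would go into the crawl queue


# Hypothetical hosts: the old site no longer blocks the directory,
# the new site does.
rules_by_host = {
    "old.example": rules_from_text("User-agent: *\nDisallow:\n"),
    "new.example": rules_from_text("User-agent: *\nDisallow: /private/\n"),
}

decision = after_response(
    "Googlebot",
    "http://old.example/private/page.html",
    301,
    "http://new.example/private/page.html",
    rules_by_host,
)
print(decision)  # ('skip', 'http://new.example/private/page.html')
```

The request to the old URL is permitted and returns the 301, but the target is checked against the new host's rules before anything else is requested, which matches the "lot of redirects, but no follow-up requests" result.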

3:11 pm on Mar 26, 2015 (gmt 0)


Thanks Phrank + Lucy!
 
