Forum Moderators: Robert Charlton & goodroi
I run a site with an active message forum. Eight months ago the forum was moved from //www.sitename.com/forum/ to //forum.sitename.com. I took care of proper 301 redirects, etc., and it was working great. Googlebot correctly followed the two robots.txt files: the one at //sitename.com/robots.txt (the same file is served from //www.sitename.com/robots.txt) and the forum's own at //forum.sitename.com/robots.txt.
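To illustrate what I expected to happen: robots.txt applies only to the hostname it was served from, so each host should be judged solely by its own file. Here's a rough sketch with Python's standard urllib.robotparser (the hostnames are the placeholders from this post, and both robots.txt bodies are made-up examples, not my actual files):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served by each host (illustrative only).
main_robots = """\
User-agent: *
Disallow: /forum/
"""
forum_robots = """\
User-agent: *
Disallow: /admin/
"""

def parser_for(text):
    rp = RobotFileParser()
    rp.parse(text.splitlines())
    return rp

main_rp = parser_for(main_robots)
forum_rp = parser_for(forum_robots)

# A forum thread URL checked against the forum host's own file: allowed.
print(forum_rp.can_fetch("Googlebot",
                         "https://forum.sitename.com/viewtopic.php?t=1"))  # True

# The same kind of path checked against the www host's file, which still
# has the old Disallow: /forum/ rule: blocked. If a crawler applied the
# www file to the forum host, forum URLs would wrongly appear blocked.
print(main_rp.can_fetch("Googlebot",
                        "https://www.sitename.com/forum/viewtopic.php?t=1"))  # False
```

That second case is exactly the symptom I'm seeing: rules from the www host's file apparently being applied to URLs on the forum host.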
On November 30th I set the preferred domain for the site to www.sitename.com. Google's webmaster tools indicated that sitename.com and www.sitename.com were the two domains involved. (In contrast, the preferred domain page for forum.sitename.com offers the options forum.sitename.com and www.forum.sitename.com.)
Today I noticed in webmaster tools that thousands of URLs on forum.sitename.com are reported as blocked by robots.txt, so I tested them with the robots.txt analysis tool. According to the analysis tool, none of the supposedly blocked URLs are blocked. I'm quite confused at this point.
So I took a snapshot of the robots.txt file from www.sitename.com/robots.txt and tested it. I was quite surprised to learn that this is the one Googlebot is using when crawling forum.sitename.com. I couldn't find any documentation to explain this behavior.
Is this a bug, or is it designed to work this way? If it's by design, is it documented anywhere?