Googlebot bug? robots.txt and a subdomain

Setting preferred domain results in Googlebot using wrong robots.txt

         

Drew_Black

5:50 pm on Dec 3, 2006 (gmt 0)

10+ Year Member



I think there's a bug in Google's choice of which robots.txt file is used when the preferred domain is set in Webmaster Tools.

I run a site that has an active message forum. The forum was moved eight months ago from //www.sitename.com/forum/ to //forum.sitename.com. I took care of the proper 301 redirects, etc., and it was working great. Googlebot correctly followed the two robots.txt files: the main one at //sitename.com/robots.txt (the same file is served from //www.sitename.com/robots.txt) and the forum's version at //forum.sitename.com/robots.txt.
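For context, a crawler normally looks for robots.txt at the root of whichever host serves the URL it wants to fetch, so a subdomain gets its own file. A minimal sketch of that lookup in Python follows; the helper name `robots_url` and the sample page paths are illustrative assumptions, not anything taken from Googlebot or from this site.

```python
from urllib.parse import urlsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt location expected to govern page_url.

    The file is looked up per scheme and host, so the forum subdomain
    and the www host each resolve to a different robots.txt.
    (robots_url is a hypothetical helper written for illustration.)
    """
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


# Hypothetical page paths; only the host names come from the post.
print(robots_url("http://forum.sitename.com/viewtopic.php?t=1"))
print(robots_url("http://www.sitename.com/index.html"))
```

Under this rule, URLs on forum.sitename.com should be checked against //forum.sitename.com/robots.txt only.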

On November 30th I set the preferred domain for the sitename.com / www.sitename.com pair to www.sitename.com. Google's webmaster tools indicated that sitename.com and www.sitename.com were the two hosts involved. (In contrast, the preferred domain page for forum.sitename.com offers the options forum.sitename.com and www.forum.sitename.com.)

Today I noticed in webmaster tools that thousands of URLs on forum.sitename.com are reported as being blocked by robots.txt. So I decided to test them using the robots.txt analysis tool. According to the analysis tool, none of the supposedly blocked URLs are blocked. I was quite confused at this point.

So I took a snapshot of the robots.txt file from www.sitename.com/robots.txt and tested the same URLs against it. I was quite surprised to learn that this is the one being used by Googlebot when crawling forum.sitename.com. I couldn't find any documentation to explain this behavior.
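The same comparison can be reproduced offline with Python's standard `urllib.robotparser`. This is only a sketch: the two robots.txt bodies and the test URL below are hypothetical stand-ins, since the real files are not shown in this thread, and Python's parser is not guaranteed to match Googlebot's rule handling in every detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical snapshots; the real robots.txt files are not shown here.
WWW_ROBOTS = """\
User-agent: *
Disallow: /members/
"""

FORUM_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Check one URL against the text of a single robots.txt snapshot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical forum URL reported as blocked.
url = "http://forum.sitename.com/members/profile.php?id=1"

print("forum rules allow it:", is_allowed(FORUM_ROBOTS, url))
print("www rules allow it:  ", is_allowed(WWW_ROBOTS, url))
```

If a URL passes the forum's own rules but fails the www rules, and it is still reported as blocked, that points to the www file being applied to the subdomain.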

Is this a bug, or is it designed to work this way? If it is by design, is it documented?

g1smd

12:17 am on Dec 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



That could partly explain several other "cross site" indexing issues that other people have been reporting for months.

Seems like Google has screwed something else up again. Their bug list seems to keep on growing.

Drew_Black

3:29 am on Dec 4, 2006 (gmt 0)

10+ Year Member



I disabled the preferred domain setting. I'll report back in a few days when I can tell if Googlebot switches back to using the proper robots.txt file.