Forum Moderators: Robert Charlton & goodroi
I run a site with an active message forum. Eight months ago the forum was moved from //www.sitename.com/forum/ to //forum.sitename.com. I took care of proper 301 redirects, etc., and it was working great. Googlebot correctly followed the two robots.txt files: the one at //sitename.com/robots.txt (the same file is served from //www.sitename.com/robots.txt) and the forum's own at //forum.sitename.com/robots.txt.
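To illustrate what I expected to happen: robots.txt applies only to the hostname it was served from, so each host should be judged solely by its own file. Here's a rough sketch with Python's standard urllib.robotparser (the hostnames are the placeholders from this post, and both robots.txt bodies are made-up examples, not my actual files):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served by each host (illustrative only).
main_robots = """\
User-agent: *
Disallow: /forum/
"""
forum_robots = """\
User-agent: *
Disallow: /admin/
"""

def parser_for(text):
    rp = RobotFileParser()
    rp.parse(text.splitlines())
    return rp

main_rp = parser_for(main_robots)
forum_rp = parser_for(forum_robots)

# A forum thread URL checked against the forum host's own file: allowed.
print(forum_rp.can_fetch("Googlebot",
                         "https://forum.sitename.com/viewtopic.php?t=1"))  # True

# The same kind of path checked against the www host's file, which still
# has the old Disallow: /forum/ rule: blocked. If a crawler applied the
# www file to the forum host, forum URLs would wrongly appear blocked.
print(main_rp.can_fetch("Googlebot",
                        "https://www.sitename.com/forum/viewtopic.php?t=1"))  # False
```

That second case is exactly the symptom I'm seeing: rules from the www host's file apparently being applied to URLs on the forum host.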
On November 30th I set the preferred domain for the site to www.sitename.com. Google's webmaster tools indicated that sitename.com and www.sitename.com were the two domains involved. (In contrast, the preferred domain page for forum.sitename.com offers the options forum.sitename.com and www.forum.sitename.com.)
Today I noticed in webmaster tools that thousands of URLs on forum.sitename.com are reported as blocked by robots.txt, so I tested them with the robots.txt analysis tool. According to the analysis tool, none of the supposedly blocked URLs are blocked. I'm quite confused at this point.
So I took a snapshot of the robots.txt file from www.sitename.com/robots.txt and tested it. I was quite surprised to learn that this is the one Googlebot is using when crawling forum.sitename.com. I couldn't find any documentation to explain this behavior.
Is this a bug, or is it designed to work this way? If it's by design, is it documented anywhere?