Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

Why does Google WMT say that robots.txt is unreachable?

 11:11 pm on Jul 14, 2008 (gmt 0)

I have had top positions in Google for many keywords on my site for years... until the last week of June, when my site disappeared from the SERPs. Everything on the site is whitehat; I don't know what would have caused such a drastic change, as I've made few changes other than adding/modifying products and content for new products. I created a Google Webmaster Tools account a few days ago. In the unreachable URLs section I see most of my URLs listed, including my home page, with the detail stating "robots.txt file unreachable".

I used the analyze robots.txt tool: the file is error free, although it was last downloaded on June 22. I don't understand why Google would be having a problem downloading it now; it hasn't changed in years. Google's definition of "unreachable" is somewhat vague, and I'm not sure where to go from here.

I've looked at the logs and everything seems OK, with a 200 response for each Googlebot request. But from June 23rd up to the present, Googlebot requests the robots.txt file and then leaves without downloading any other files.
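To double-check what Googlebot is actually being served, it can help to pull every robots.txt fetch out of the raw access log and look at the status codes over time. A minimal sketch, assuming an Apache/NGINX "combined" log format (the regex and field layout are assumptions; adjust them to your server's actual format):

```python
import re

# Matches Apache/NGINX "combined" log lines; adjust if your format differs.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def robots_fetches(lines):
    """Return (time, ip, status) for every robots.txt request by Googlebot."""
    hits = []
    for line in lines:
        m = LOG_PATTERN.match(line)
        if not m:
            continue
        if m.group("path") == "/robots.txt" and "Googlebot" in m.group("agent"):
            hits.append((m.group("time"), m.group("ip"), int(m.group("status"))))
    return hits

# Usage (log path is an example):
# with open("/var/log/apache2/access.log") as f:
#     for hit in robots_fetches(f):
#         print(hit)
```

If every entry really is a 200 around June 23rd, that points away from the web server itself and toward something in between (firewall, proxy, or a Google-side issue).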

I've contacted my host to check whether they're doing any IP blocking; they say they're not, but will look into it further.
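While waiting on the host, it's also worth confirming that the logged "Googlebot" hits are genuine and that a firewall isn't treating Google's real crawler IPs differently. Google's documented verification method is a reverse DNS lookup on the requesting IP, followed by a forward lookup to confirm it resolves back. A sketch (the example IP is illustrative only):

```python
import socket

def valid_googlebot_hostname(hostname):
    """Googlebot hosts resolve under googlebot.com or google.com."""
    return hostname.endswith(".googlebot.com") or hostname.endswith(".google.com")

def is_googlebot_ip(ip):
    """Verify an IP via reverse DNS, then confirm with a forward lookup."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except socket.herror:
        return False
    if not valid_googlebot_hostname(hostname):
        return False
    # Forward-confirm: the hostname must resolve back to the same IP.
    try:
        return ip in socket.gethostbyname_ex(hostname)[2]
    except socket.gaierror:
        return False

# Example (requires network access; IP taken from an access-log entry):
# print(is_googlebot_ip("66.249.66.1"))
```

If the IPs in the log pass this check, the 200 responses really are going to Google, which makes a Google-side or in-transit problem more likely than a server misconfiguration.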

Is this a glitch? A penalty?




 3:09 am on Jul 15, 2008 (gmt 0)

WMT seems to lose things. I don't go there often and was surprised to see some of my sites were unverified. I thought it was because G had renamed the verification file to uppercase while I still had the old lowercase filename. I renamed the file and WMT is happy.


 12:17 pm on Jul 15, 2008 (gmt 0)

My site is still verified. Googlebot must not be reading the robots.txt file, because there are URLs listed that have a path through the cgi-bin folder. My ecommerce software uses JS to start the cart and injects an additional path through the cgi-bin. For example, if a user goes to a product page /123.html and the ecommerce software isn't already in the path, it refreshes the page and adds /cgi-local/softcart.exe/123.html?E+scstore.

What I find in the unreachable URLs section are two paths for the same page: /cgi-local/softcart.exe/123.html?E+scstore and /123.html.
I thought Googlebot didn't execute JS. Years ago I added Disallow: /cgi-local/ to my robots.txt file, which solved that issue.
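That Disallow rule can be sanity-checked offline with Python's standard robots.txt parser. A small sketch using the example paths above (example.com is a placeholder domain):

```python
from urllib.robotparser import RobotFileParser

# The same rule used on the live site, parsed from a string for testing.
rules = [
    "User-agent: *",
    "Disallow: /cgi-local/",
]

rp = RobotFileParser()
rp.parse(rules)

# The softcart path should be blocked, the clean product URL allowed.
blocked = rp.can_fetch(
    "Googlebot", "http://www.example.com/cgi-local/softcart.exe/123.html?E+scstore"
)
allowed = rp.can_fetch("Googlebot", "http://www.example.com/123.html")
```

If the parser agrees the /cgi-local/ URLs are disallowed, then their appearance in WMT suggests Googlebot discovered them without being able to apply the robots.txt rules — consistent with the file being "unreachable" from Google's side.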

I have other ecommerce sites running with the same ecommerce software without a hitch.
Is Googlebot ignoring robots.txt and then considering this duplicate content?
Should I remove the /cgi-local/softcart.exe urls that are listed in WMT?
How can I further test whether Googlebot is really having a problem reading my robots.txt file, or whether it's some other problem?
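One way to reason about it: crawlers generally distinguish between a robots.txt that parses, one that's missing, and one they can't reach at all. A rough model of that decision logic, based on commonly documented crawler behavior rather than any official Google spec:

```python
def robots_outcome(status):
    """Rough model of how a crawler reacts to the robots.txt HTTP status.

    `status` is the HTTP status code, or None for a timeout/refused connection.
    Based on commonly documented crawler behavior; not an official Google spec.
    """
    if status is None:                 # connection refused or timed out
        return "unreachable: crawling deferred"
    if 200 <= status < 300:
        return "fetched: rules parsed and obeyed"
    if status == 404:
        return "missing: crawl everything (no restrictions)"
    if 500 <= status < 600:
        return "unreachable: crawling deferred"
    return "other: treated conservatively"
```

Under this model, a site that serves 200s yet shows "unreachable" in WMT implies the failure (timeout or error) is happening somewhere Google can see but the site's own logs can't, which matches the "requests robots.txt, then leaves" pattern in the logs.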


 8:46 pm on Jul 15, 2008 (gmt 0)

If you're seeing URLs in WMT that are blocked by robots.txt, you might try a URL removal request based on robots.txt, and see what results you get from that.


 9:37 pm on Jul 15, 2008 (gmt 0)

That's a good idea; I'll try it. But that still doesn't explain why Googlebot is ignoring the robots.txt file. If I'm violating some Google guideline I'm unaware of, would WMT still give me the same nebulous response, "robots.txt file unreachable"?


 9:54 pm on Jul 15, 2008 (gmt 0)

Have you tried the robots.txt tool that Google offers within your WMT account? That may give you some clues.

My ecommerce software uses JS to start the cart and injects an additional path through the cgi-bin

There may well be some kind of technical tangle in the JavaScript area. Although they are working on it, Googlebot does not usually execute JavaScript.


 11:52 am on Jul 16, 2008 (gmt 0)

Yes, I have used the robots.txt tool. It shows my current file is valid with no errors. The JavaScript on my pages should be clean, as it hasn't been a problem and hasn't been edited in a long time.

I guess the next step is to figure out whether there are possible problems, unrelated to the robots.txt file, that could trigger this error.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved