Forum Moderators: goodroi
I started a blog in December, and it was indexed by Google regularly once I submitted an XML sitemap. Within the last two weeks, though, I have dropped out of the rankings because Google can't connect to my robots.txt file. The funny thing is that I never had one originally, then added one to help the crawling process.
Google Webmaster Tools now continually reports a "robots.txt" timeout and will not fetch the robots file. I have tested it with header checkers and have visited the robots file myself, and each time I received a 200 response saying it was good. I don't know what to do now, because my web host doesn't seem to believe the problem is on their end.
Any suggestions? I am grateful for your help.
Have you tried the robots.txt analysis tool in Google Webmaster Tools [google.com]?
Yes. I've also tried deleting the robots.txt file, and the strange thing is that I then get redirected to a folder called robots.txt/
A header tester returns 301 with the file deleted and 200 when it is in place. Yet with the robots.txt file in place, Google Webmaster Tools still says it can't connect to the file. Not a 404, but a confused Googlebot.
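For anyone else debugging this kind of thing, the same header check can be scripted. A minimal sketch in Python, where a throwaway local server stands in for the real blog URL (you would point `url` at your own site instead):

```python
import http.server
import os
import tempfile
import threading
import urllib.request
from functools import partial

# Throwaway local site standing in for the real blog: a directory with a robots.txt.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "robots.txt"), "w") as f:
    f.write("User-agent: *\nDisallow:\n")

handler = partial(http.server.SimpleHTTPRequestHandler, directory=tmp)
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The actual header check: fetch robots.txt and report the status code.
url = "http://127.0.0.1:%d/robots.txt" % server.server_address[1]
resp = urllib.request.urlopen(url)
print(resp.status, resp.headers.get("Content-Type"))  # a healthy file gives 200
server.shutdown()
```

Note this only tells you what the server returns to *your* IP; as discussed below, a 200 here doesn't prove Googlebot's requests aren't being blocked somewhere upstream.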
It's for these reasons that I think my web host has blocked Google somehow (IP is 66.249 I think).
Thanks for the help!
Is it possible to test a file from a specific IP address? I.e., can I copy Google's IP and watch the response I get? That would give me a definitive answer.
Yes, but no.
You could fake up a packet with Google's IP address as the source, but then your server would send the response to Google, not to you. And since Google never sent the request, their firewall is likely to reject (ignore) your server's response. Not much use...
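What you *can* do from your end is verify whether an IP that shows up in your logs really belongs to Googlebot: do a reverse DNS lookup on the IP, check the name is under googlebot.com or google.com, then forward-resolve the name to confirm it maps back to the same IP. A sketch (the hostname check runs offline; the DNS calls obviously need network access):

```python
import socket

def is_googlebot_host(hostname):
    """Googlebot reverse-DNS names end in googlebot.com or google.com."""
    return hostname.rstrip(".").endswith((".googlebot.com", ".google.com"))

def verify_googlebot(ip):
    """Reverse lookup, check the domain, then forward-resolve to confirm.
    Needs DNS access; raises socket.herror if the IP has no PTR record."""
    hostname = socket.gethostbyaddr(ip)[0]
    if not is_googlebot_host(hostname):
        return False
    # Forward-confirm: the hostname must resolve back to the original IP.
    return ip in socket.gethostbyname_ex(hostname)[2]

# Offline check of the name test alone:
print(is_googlebot_host("crawl-66-249-66-1.googlebot.com"))  # True
print(is_googlebot_host("fake.example.com"))                  # False
```

That won't reproduce Googlebot's view of your server, but it does tell you whether the 66.249.x.x hits (or their absence) in your logs are the real crawler.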
This sounds like the result of a bug in Google's report of your robots.txt status.
Jim
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /blog/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
</IfModule>
ErrorDocument 404 /index.php
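For what it's worth, the !-f condition in that block is what should let a real robots.txt be served untouched: the rewrite to index.php only fires when the requested path is not an existing file or directory. If robots.txt is still getting redirected, one belt-and-braces workaround (my suggestion, not something confirmed in this thread) is to exclude it explicitly before the catch-all rule:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /blog/
# Serve robots.txt as-is, skipping the catch-all rule below
RewriteRule ^robots\.txt$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
</IfModule>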
Though my hosting provider had a hard time believing it, there WAS in fact an IP block in the firewall that was preventing Google's access. It has been removed, and Google has successfully downloaded the robots file. I'll now be waiting for my sitemap to be fetched, and I'll be back in business.
Thanks for all the responses, everyone, I will definitely be back to this forum should I have another problem.
When I go to his blog in IE7, it connects, then about 3 seconds later I get kicked out to my preferred search engine with the URL
http :// http :// www his site com/robots.txt, as if I had typed that URL into the browser.
It is like Google checks the robots.txt file before loading the site, and on seeing or getting a bad URL from the robots.txt, it kicks me out. Really strange.
I asked him who created the file; he doesn't have a clue and said he didn't do it. It looks like a custom file, but I am not sure whether WordPress already comes with this file when you install the blog.
It doesn't do it in Firefox, only IE on Windows, and he said I am the only one who has ever reported that. I wonder if it is because I have more features enabled in my Google toolbar than the average person, and the toolbar fetches the robots.txt file when I visit a site.
I am just wondering here as I was reading this thread and figured I could ask the question since he has found his issue.
I have never run across this before and am really puzzled by this.