Forum Moderators: Robert Charlton & goodroi
Checking crawl stats in WMT appears to show crawling coming to a halt at the beginning of this month.
The site is up and running, Google Analytics shows normal visitor numbers, and I have not made any changes to the site lately. I don't have access to the raw logs, so I cannot check the responses being sent to the bot.
Any ideas what is happening?
Possible trouble could come from changes you didn't make on the server, so it would be good to check in real time just what is happening. Sometimes a web host takes a step to try to protect you from bad bots, for instance, and ends up creating a problem for good bots. If you actually have no robots.txt file, you should see a 404 status code returned - and possibly your web host is now doing something other than returning a 404.
Did the checks: robots.txt was coming back as 404 and all other pages as 200.
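For anyone else who wants to run the same checks without raw log access, here's a rough sketch in Python. The URL is a placeholder for your own domain, and the status interpretations reflect Google's documented handling of robots.txt fetch results (a 404 means crawl everything; a 5xx can make Google pause crawling).

```python
# Fetch robots.txt the way a crawler would and interpret the status code.
# The domain below is a placeholder -- substitute your own site.
import urllib.request
import urllib.error

def fetch_status(url, user_agent="Googlebot/2.1 (+http://www.google.com/bot.html)"):
    """Return the HTTP status code for url, sending a Googlebot User-Agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code  # 4xx/5xx still give us the code we want

def classify_robots_status(code):
    """Interpret the status code Google sees for /robots.txt."""
    if code == 200:
        return "robots.txt present; directives apply"
    if code == 404:
        return "no robots.txt; Google crawls everything"
    if 500 <= code < 600:
        return "server error; Google may pause crawling entirely"
    return "unexpected status %d; worth investigating" % code

# Example (requires network access):
# print(classify_robots_status(fetch_status("http://example.com/robots.txt")))
```

The key point for this thread: a plain 404 on robots.txt is harmless on its own, so the checks need to confirm that's really what Googlebot's IP range sees, not just what your browser sees.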
Also used the robots.txt analysis tool in WMT, which didn't report that robots.txt was missing; instead it showed the robots.txt content as blank and all pages as allowed. It also showed that robots.txt was last downloaded a few minutes ago.
Decided the easiest thing to do was to create a robots.txt file and hope that solves the problem.
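For reference, a minimal robots.txt that allows all crawling (so it changes nothing except turning the 404 into a 200) looks like this:

```
User-agent: *
Disallow:
```

An empty Disallow line means "disallow nothing", which is equivalent to having no robots.txt at all, but gives the server a real file to serve.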
[paraphrased]
The IP address 66.249.72.162 (googlebot) was blocked at the server firewall because of anomalous behaviour that triggered Apache mod_security. We are now allowing that IP address to crawl the server's websites again.
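This is exactly why Google recommends verifying a claimed Googlebot IP with a reverse-DNS lookup followed by a forward-DNS confirmation before blocking or whitelisting it at the firewall. A sketch of that check in Python (the DNS lookups need network access; the hostname-suffix check is pure logic):

```python
# Verify a "Googlebot" IP: reverse-resolve it, check the hostname ends in
# googlebot.com or google.com, then forward-resolve the hostname and
# confirm it maps back to the same IP.
import socket

GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def has_google_suffix(hostname):
    """True if hostname is under googlebot.com or google.com."""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)

def is_real_googlebot(ip):
    """Reverse-resolve ip, check the hostname, then forward-confirm."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except socket.herror:
        return False  # no reverse DNS record
    if not has_google_suffix(hostname):
        return False  # anyone can fake a User-Agent, not reverse DNS
    try:
        return ip in socket.gethostbyname_ex(hostname)[2]
    except socket.gaierror:
        return False

# Example (requires network access):
# is_real_googlebot("66.249.72.162")  # reverse DNS gives crawl-66-249-72-162.googlebot.com
```

The forward-confirmation step matters because a reverse DNS record alone can be spoofed by whoever controls the IP's PTR zone; the round trip back to the same IP cannot.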
I assume that all Google bots were blocked, as there had been no crawling since the 1st of March.
I also assume that everything was being blocked, but since Google requests robots.txt first, any problem fetching it (even if robots.txt doesn't exist) shows up in WMT as a robots.txt error.
[edited by: tedster at 1:25 am (utc) on Mar. 18, 2009]
[edit reason] paraphrase email quote [/edit]