Today we took on a new client whose site pretty much disappeared from Google on 17th of May, having previously ranked on page 1 for their main keywords. The home page is nowhere to be found and their traffic has dropped 90%.
A quick check of the site showed that all requests for non-existent pages return HTTP 500 instead of 404. As a result, a request for the non-existent robots.txt also returns HTTP 500, which stopped Google from crawling the site. Crawl history shows that Google crawled the site until 7th of May, hence the incorrect status code must have been introduced on 6th or 7th of May.
I have found an old thread on this here [webmasterworld.com
However, what is interesting now is that it took Google only 10 days to drop the home page from its index, whilst the previous thread linked above (5 years old) said that Google still kept the old cached pages in its index 4 months after the robots.txt HTTP 500 error began, and the indexed pages did not disappear.
Posting this just as information for others who may be wondering about an unexplained drop in their traffic.
The best way to check whether robots.txt is the problem is to use "Fetch as Googlebot" in WMT and fetch the home page and the robots.txt file. If you get the message "unreachable robots.txt" then this could be the problem, even if robots.txt does not exist or never existed on the site, in which case go and check your response codes!
Also note that the "Blocked URLs" option in WMT, which "tests" the robots.txt, is not a good way to test this particular case, as it still reports the home page as "Allowed".
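For anyone who wants to check response codes outside WMT, here is a rough Python sketch of the idea. It is not what we ran on the client site, just an illustration using the standard library. The function names (`fetch_status`, `classify_robots_status`, `check_site`) and the verdict labels are made up for this example, and the classification is a simplification of how a crawler might treat each status class, based on what we saw here (5xx on robots.txt stopped crawling).

```python
import urllib.error
import urllib.request
import uuid


def fetch_status(url, timeout=10):
    """Return the final HTTP status code for a GET request to url."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "status-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the status code is on the exception.
        return error.code


def classify_robots_status(status):
    """Rough verdict for a robots.txt status code (labels are illustrative)."""
    if 200 <= status < 300:
        return "ok"           # file exists and can be read
    if 400 <= status < 500:
        return "missing"      # no robots.txt; normally not a crawl blocker
    if 500 <= status < 600:
        return "unreachable"  # server error; the case described in this post
    return "other"


def check_site(base_url):
    """Check robots.txt and a random, almost certainly non-existent URL."""
    base = base_url.rstrip("/")
    robots_status = fetch_status(base + "/robots.txt")
    missing_status = fetch_status(base + "/" + uuid.uuid4().hex)
    return {
        "robots_status": robots_status,
        "robots_verdict": classify_robots_status(robots_status),
        "missing_page_status": missing_status,
        # A healthy site should answer 404 (or 410) for a page that does not exist.
        "missing_page_ok": missing_status in (404, 410),
    }
```

Calling `check_site("http://www.example.com")` (placeholder domain) returns a small dict; a `robots_verdict` of "unreachable" together with `missing_page_ok` being False would match the situation described above. Network failures such as DNS errors are not handled and will raise `urllib.error.URLError`.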