Page is not externally linkable
garyr_h - 12:36 am on Apr 22, 2012 (gmt 0)
There have been problems for a few months now.
1) The added %20 at the end of URLs. It truly does not exist on the links the bot supposedly follows. I ended up 301ing those URLs to the correct page.
2) The obscure search engines having listings like:
Title of Page (linked to actual http://www.example.com/page.html page, correct URI)
Text from page, like a normal search engine.
http://www.example.com/p... (text only, but not displaying the entire URI)
G bot now crawls the text URI and reports it as an error, even though the correct URI is directly above it within the text. It results in hundreds and hundreds of 404 errors from every single obscure search engine out there.
3) G bot crawling their own SERPs incorrectly. Every once in a while, they'll be crawling some RSS feed of their own search results, but instead of the plain URI they try to crawl http://www.example.com/page.html<web:URL_whatever_the_RESULT_BELOW_MINE
Which again, sometimes results in hundreds of 404s.
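For point 1, here is a rough sketch in Python of the kind of 301 rule described there: strip any trailing %20 from the requested path and redirect to what is left. The function name redirect_target is made up for illustration, and this is only one way to do it, not necessarily how the redirect was actually set up on the server.

```python
from typing import Optional


def redirect_target(path: str) -> Optional[str]:
    """Return the path to 301 to, or None if no redirect is needed.

    Hypothetical helper: only handles a trailing encoded space (%20),
    which is the case described in point 1.
    """
    cleaned = path
    # Remove every trailing "%20", in case more than one was appended.
    while cleaned.endswith("%20"):
        cleaned = cleaned[:-3]
    # No change, or nothing left after stripping: do not redirect.
    if cleaned == path or not cleaned:
        return None
    return cleaned


if __name__ == "__main__":
    for requested in ("/page.html%20", "/page.html"):
        target = redirect_target(requested)
        if target is None:
            print(requested, "-> serve as normal")
        else:
            print(requested, "-> 301 to", target)
```

The same idea could be written as a rewrite rule in the web server config instead; the point is just that the bad URL gets a permanent redirect rather than a 404.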