g1smd - 10:07 am on Oct 10, 2011 (gmt 0)
"I'm seeing more crawl errors caused by sites, such as Ask, that truncate the URL"

Yes, I am seeing lots of errors for stuff like www.example.com/If as well as other similarly chopped-off URLs.
I am annoyed that such URLs are being requested from the site in the first place, but I am glad to see them listed as "404 Not Found".
For some CMS and blog platforms, these broken URLs will return a blank page with "200 OK" status. Those sites will therefore look "technically broken". Since the URLs return "200 OK", they will not turn up in a report such as this (nor will they turn up when you spider your own site using Xenu's Link Sleuth or similar).
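A quick way to test your own site for this flaw is to request a URL that definitely should not exist and look at the status code that comes back. A minimal Python sketch, assuming a hypothetical domain and made-up bogus path:

import urllib.request
import urllib.error

# Hypothetical bogus URL on your own site; a healthy setup returns 404.
url = "http://www.example.com/this-page-should-not-exist"

try:
    response = urllib.request.urlopen(url)
    # Reaching here means a 2xx/3xx status came back for a
    # non-existent page: the "soft 404" flaw described above.
    print("FLAW: got", response.getcode(), "for a bogus URL")
except urllib.error.HTTPError as e:
    # A 404 (or 410) here means the site is behaving correctly.
    print("OK: got", e.code, "for a bogus URL")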
I can only hope that this is a scenario that triggers the "Your site has an unusually high number of URLs" message in Google Webmaster Tools. If it does not, then sites may be held back for their technical failures without much of a clue that there is a failure in the design and implementation.
I think a number of people reporting substantial drops for their sites (in other recent threads) need to check whether their sites suffer from this technical flaw, and then fix it so that non-existent URLs really do return "404 Not Found".
P.S. I am also seeing requests for ...
Google is getting WAY too nosey trying to extract every last potential URL from a site. I've built a new set of routines that send "410 Gone" to Google for everything they should not be indexing.
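The post doesn't show how those routines are built; as one illustration only, here is a minimal Python WSGI sketch (the blocked URL patterns are invented placeholders) that answers "410 Gone" for requests matching a block list and serves everything else normally:

import re
from wsgiref.simple_server import make_server

# Made-up example patterns for URLs that should never be indexed.
GONE_PATTERNS = [
    re.compile(r"^/cgi-bin/"),
    re.compile(r"^/.*\.php~$"),
]

def app(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    if any(p.match(path) for p in GONE_PATTERNS):
        # Tell crawlers this URL is gone for good, not merely missing.
        start_response("410 Gone", [("Content-Type", "text/plain")])
        return [b"Gone"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Normal page"]

if __name__ == "__main__":
    make_server("", 8000, app).serve_forever()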
I am also working on dropping the links to /admin and other private and internal URLs when search engines request public pages.
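Again, the actual implementation isn't shown; a rough sketch of the idea, with a placeholder list of crawler User-Agent substrings and hypothetical private paths:

# Substrings commonly found in major crawler User-Agent headers
# (placeholder list; extend as needed).
BOT_SIGNATURES = ("Googlebot", "bingbot", "Slurp")

# Hypothetical private/internal URL prefixes to hide from crawlers.
PRIVATE_PREFIXES = ("/admin", "/internal")

def visible_links(all_links, user_agent):
    """Return the links to render, hiding private URLs from crawlers."""
    if any(bot in user_agent for bot in BOT_SIGNATURES):
        return [l for l in all_links if not l.startswith(PRIVATE_PREFIXES)]
    return all_links

# Example: a crawler sees the public links only.
links = ["/about", "/admin/login", "/contact"]
print(visible_links(links, "Mozilla/5.0 (compatible; Googlebot/2.1)"))
# -> ['/about', '/contact']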