| 6:19 am on Sep 15, 2012 (gmt 0)|
A 404 or 410 status is what you need - sounds like you're fine to me. It just takes a while for the crawling to slow down on the legacy URLs that aren't re-published elsewhere.
In fact, Googlebot will occasionally request those old URLs for years, but much less frequently. Don't worry about it unless you see a 404 status when you think that URL should be a 301.
And by the way - a 404 is a kind of "error" for a crawler - but it's an error you intended to happen. So it's not the kind of "error" that you need to fix, unless you intended that URL to resolve. It's just included in the report for your information.
[edited by: tedster at 9:30 pm (utc) on Sep 15, 2012]
| 1:48 pm on Sep 15, 2012 (gmt 0)|
Thanks, that makes me feel better. *Thumbs Up*
| 9:19 pm on Sep 15, 2012 (gmt 0)|
There's a persistent rumor that Google gets the message faster with 410s than with 404s. A 410 is intentional; a 404 is the generic "can't find it". So if you can do it without making your htaccess balloon to thousands of lines, include explicit 410s for the pages you're not redirecting. And make sure to specify a nice 410 page for the humans. It can even be the same physical page as the 404 page. Don't let them get the Apache default; it's scary. (And the IIS default is probably scarier. Their error messages always make me think something went seriously wrong in the deepest recesses of the server.)
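For anyone wanting a starting point, here is a minimal sketch of what that could look like in .htaccess, assuming mod_rewrite is available. The paths (/missing.html, old-page.html, discontinued/) are made-up placeholders, not anything from this thread.

```apache
# Serve the same friendly page for both statuses.
# /missing.html is a hypothetical local path; substitute your own page.
ErrorDocument 404 /missing.html
ErrorDocument 410 /missing.html

RewriteEngine On

# Explicit 410 for a single retired page (hypothetical name).
RewriteRule ^old-page\.html$ - [G]

# One pattern can cover a whole retired directory,
# which keeps the file from ballooning.
RewriteRule ^discontinued/ - [G]
```

The [G] flag returns 410 Gone and stops rule processing for that request; the "-" means "no substitution".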
| 9:56 pm on Sep 15, 2012 (gmt 0)|
404 - the server can't find it, doesn't know if it ever was here, and has no idea whether it might come back in the future.
410 - it's gone and it ain't coming back (though Google has commented that an unnervingly large number of URLs that returned 410 in the past do at some point come back to life - and that's why they spider them forever).
| 5:36 am on Sep 16, 2012 (gmt 0)|
Thanks for the additional info. I went ahead and knocked out the unwanted pages with a RedirectMatch 410 entry in .htaccess. (To a custom page just in case any human eyes hit it.)
| 7:41 am on Sep 16, 2012 (gmt 0)|
If you have any RewriteRules in your htaccess file, you should not use any Redirect or RedirectMatch directives. Directives are processed in "per module" order, not in the order written in the htaccess file, so you cannot rely on a Redirect (mod_alias) running before or after a RewriteRule (mod_rewrite) just because of where it sits in the file. There was a much longer discussion on these points in another thread here only yesterday.
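As a rough illustration of keeping everything in one module, the sketch below expresses both the 410s and the redirects as RewriteRules, so the order in the file is the order of evaluation. The directory names are hypothetical, and www.example.com stands in for your own host.

```apache
RewriteEngine On

# 1. Gone: retired URLs that have no replacement (hypothetical path).
RewriteRule ^retired/ - [G]

# 2. External redirects: moved URLs, most specific pattern first.
RewriteRule ^old-catalog/widgets\.html$ http://www.example.com/catalog/widgets [R=301,L]
RewriteRule ^old-catalog/(.*)$ http://www.example.com/catalog/$1 [R=301,L]

# 3. Any internal rewrites would follow here, after all external redirects.
```

With everything in mod_rewrite there is no second module whose directives might run at an unexpected point.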
| 8:52 am on Sep 16, 2012 (gmt 0)|
There have also been multiple discussions about the (un)wisdom of mixing up redirects-- regardless of mechanism-- with error documents. You want the page to return a 410, not a 302.
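One way this mix-up can happen, as far as I know, is through how the custom error page is declared. A small sketch, with /gone.html as a hypothetical page name and www.example.com as a placeholder host:

```apache
# Local path: Apache serves the page internally and the
# response keeps its 410 status.
ErrorDocument 410 /gone.html

# Full URL: Apache sends the client a redirect to that URL,
# so the 410 never reaches the crawler. Shown commented out on purpose.
# ErrorDocument 410 http://www.example.com/gone.html
```

It is worth requesting one of the removed URLs and checking the response headers to confirm the status really is 410.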
| 5:52 pm on Sep 16, 2012 (gmt 0)|
RedirectMatch was used for regex matching. I could not get the test pages to work with RewriteRule.
| 6:08 pm on Sep 16, 2012 (gmt 0)|
That points either to a syntax error or rules in the wrong order, or both.
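One common syntax difference, offered as a guess since the actual rules were not posted: RedirectMatch tests the full URL-path including the leading slash, while a RewriteRule in .htaccess tests the path with the directory prefix (and its leading slash) already stripped. The pattern below is hypothetical.

```apache
# mod_alias form: the pattern sees "/old-stuff/page.html".
# RedirectMatch 410 ^/old-stuff/.*\.html$

# mod_rewrite form in a root .htaccess: the pattern sees
# "old-stuff/page.html", so there is no leading slash.
RewriteEngine On
RewriteRule ^old-stuff/.*\.html$ - [G]
```

A pattern copied straight from RedirectMatch with its leading slash intact will never match in a per-directory RewriteRule.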
| 6:39 pm on Sep 16, 2012 (gmt 0)|
Thanks for the advice. I will check my syntax.