
Forum Moderators: Robert Charlton & aakk9999 & andy langton & goodroi


Site Restructuring, 404s and Google

     
4:43 am on Sep 15, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 31, 2011
posts:45
votes: 1


After a CMS change I 301 redirected every part of the site I wanted to keep to the appropriate new URL. Google is still spidering the old pages that no longer exist, and so it is getting 404 errors.

How does one go about informing Google that the pages it is looking for are no longer valid?

Thanks
6:19 am on Sept 15, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


A 404 or 410 status is what you need - sounds like you're fine to me. It just takes a while for the crawling to slow down on the legacy URLs that aren't re-published elsewhere.

In fact, Googlebot will occasionally request those old URLs for years, but at a much lower frequency. Don't worry about it unless you see a 404 status where you think that URL should return a 301.

And by the way - a 404 is a kind of "error" for a crawler - but it's an error you intended to happen. So it's not the kind of "error" that you need to fix, unless you intended that URL to resolve. It's just included in the report for your information.
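As a rough illustration of that triage, here is a minimal Python sketch. The paths and the `expected` mapping are invented for the example; in practice the observed statuses would come from your own crawl-error report or server logs.

```python
# Hypothetical triage of a crawl report: which reported statuses need fixing?
# `expected` holds the status each old URL is *meant* to return.

def needs_fixing(expected, observed):
    """Return the URLs whose observed status differs from the intended one."""
    return sorted(
        url for url, status in observed.items()
        if url in expected and expected[url] != status
    )

# Invented example data: one page was meant to redirect but returns 404.
expected = {"/old/about.html": 301, "/old/promo-2009.html": 404}
observed = {"/old/about.html": 404, "/old/promo-2009.html": 404}

problems = needs_fixing(expected, observed)
print(problems)  # ['/old/about.html']
```

Only the URL that should have been a 301 shows up; the intentional 404 is left alone.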

[edited by: tedster at 9:30 pm (utc) on Sep 15, 2012]

1:48 pm on Sept 15, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 31, 2011
posts:45
votes: 1


Thanks, that makes me feel better. *Thumbs Up*
9:19 pm on Sept 15, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13218
votes: 348


There's a persistent rumor that Google gets the message faster with 410s than with 404s. A 410 is intentional; a 404 is the generic "can't find it". So if you can do it without making your htaccess balloon to thousands of lines, include explicit 410s for the pages you're not redirecting. And make sure to specify a nice 410 page for the humans. It can even be the same physical page as the 404 page. Don't let them get the Apache default; it's scary. (And the IIS default is probably scarier. Their error messages always make me think something went seriously wrong in the deepest recesses of the server.)
9:56 pm on Sept 15, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


404 - the server can't find it, doesn't know if it ever was here, and has no idea whether it might come back in the future.

410 - it's gone and it ain't coming back (though Google comment that an unnervingly large number of URLs that have returned 410 in the past do at some point come back to life again - and that's why they spider them forever).
5:36 am on Sept 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 31, 2011
posts:45
votes: 1


Thanks for the additional info. I went ahead and knocked out the unwanted pages with a RedirectMatch 410 entry in .htaccess. (With a custom page just in case any human eyes hit it.)
7:41 am on Sept 16, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


If you have any RewriteRules in your htaccess file you should not use any Redirect or RedirectMatch directives. Directives are processed in "per module" order, not in the order written in the htaccess file, so you cannot guarantee which module's directives run first. There was a much longer discussion of these points in another thread here only yesterday.
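One way to keep everything in mod_rewrite is to generate the rules from a list, sketched below in Python. The function name and the example paths are made up for illustration. The emitted lines rely on two standard mod_rewrite flags: `[G]` answers 410 Gone, and `[R=301,L]` sends a permanent redirect. Putting the 410 rules ahead of the redirects is an assumption of this sketch, not something stated above.

```python
import re

def rewrite_block(gone, moved):
    """Return .htaccess lines that use mod_rewrite alone for 410s and 301s."""
    lines = ["RewriteEngine On"]
    for path in gone:
        # In .htaccess the pattern is matched without the leading slash.
        # "-" means no substitution; [G] makes Apache answer 410 Gone.
        lines.append(f"RewriteRule ^{re.escape(path.lstrip('/'))}$ - [G]")
    for old, new in moved.items():
        # [R=301,L] sends a permanent redirect and stops rule processing.
        lines.append(f"RewriteRule ^{re.escape(old.lstrip('/'))}$ {new} [R=301,L]")
    return lines

# Invented example paths.
print("\n".join(rewrite_block(["/old/promo.html"], {"/old/about.html": "/about/"})))
```

Because every rule now belongs to the same module, they run in the order they are written in the file.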
8:52 am on Sept 16, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13218
votes: 348


There have also been multiple discussions about the (un)wisdom of mixing up redirects-- regardless of mechanism-- with error documents. You want the page to return a 410, not a 302.
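The usual way this goes wrong is an ErrorDocument that points at a full URL: Apache then redirects the visitor to that URL instead of serving the page with the original status. A small Python sketch of a check for that follows. The function name is hypothetical and the parsing is deliberately simplified (it only looks for an http or https scheme at the start of the target).

```python
def keeps_status(directive):
    """True if an ErrorDocument line serves its page without redirecting."""
    parts = directive.split(None, 2)
    if len(parts) != 3 or parts[0] != "ErrorDocument":
        raise ValueError("not an ErrorDocument directive")
    target = parts[2].strip()
    # A target with a scheme makes Apache redirect the client,
    # so the crawler sees a redirect instead of the 410.
    return not target.lower().startswith(("http://", "https://"))

print(keeps_status("ErrorDocument 410 /gone.html"))
print(keeps_status("ErrorDocument 410 http://www.example.com/gone.html"))
```

The first form (a local path) keeps the 410; the second is the one to avoid.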
5:52 pm on Sept 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 31, 2011
posts:45
votes: 1


RedirectMatch was used for regex matching. I could not get the test pages to work with RewriteRule.
6:08 pm on Sept 16, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


That points either to a syntax error or rules in the wrong order, or both.
6:39 pm on Sept 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:Mar 31, 2011
posts:45
votes: 1


Thanks for the advice. I will check my syntax.
 
