Forum Moderators: Robert Charlton & goodroi
We operate a website that was registered back in the 1990s and holds a lot of authority. We have also ranked in the top 10 for 500+ keywords in Google for quite a few years and still hold all of those rankings. But we are in a bit of a dilemma right now, so any help would be greatly appreciated.
We recently (2 months ago) did a complete changeover of our website to a completely new design, which also included a somewhat new URL structure for some sections. (For example: old URL: http://www.example.com/sub-page , new URL: http://www.example.com/folder/sub-page/). Note: the new URL structure was not applied to the whole site - I would say about 50% of it is on the new structure. All the proper 301s were in place as well.
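For readers wondering what that kind of 301 looks like in practice, here is a minimal sketch, assuming an Apache server with mod_rewrite enabled (the page names are placeholders matching the example above, not the poster's actual URLs):

```apache
RewriteEngine On
# Old flat URL moved into a folder - send a permanent (301) redirect
# so engines and visitors land on the new location
RewriteRule ^sub-page$ http://www.example.com/folder/sub-page/ [R=301,L]
```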
As one would expect after a complete site changeover, the site experienced some drops in rankings, but we were back to our regular rankings in a matter of days. We were going over our Google Webmaster Tools account last week and noticed that the number of 404 requests has been growing like crazy. It used to be around 1,000; now it's about 3,000. We have gone through all those URLs and found that quite a few dynamic URLs were left out when preparing the 301 list. An example of one of those: http://www.example.com/folder/folder/page.aspx?section=blahblah. In our new design we didn't include that section, so all these pages are coming up as 404. We have a section in mind that we could 301 these to (but the page we are thinking of redirecting to is one of our most important pages, as it holds top 10 rankings for numerous competitive keywords).
Now the question I have is: if we redirect all those dynamic pages that used to exist (now 404s) to that one page, how will Google look at it? Can it harm our rankings for that one page, since we'd be 301ing a bunch of 404s to it?
Here are the options that we have:
a) Leave the 404s as they are
b) 301 them to that 1 page
c) Disallow that section in robots.txt (Google currently has 11,000 of those dynamic URLs in the index, but there is no cache of them)
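For option (c), the robots.txt entry would be a one-liner - a sketch, assuming the dead dynamic URLs all sit under the example path from above. One caveat worth knowing: a Disallow only blocks future crawling; it does not by itself remove URLs that are already in the index.

```
# robots.txt sketch - path is a placeholder from the example URL
# (a Disallow prefix also matches any ?section=... query strings)
User-agent: *
Disallow: /folder/folder/page.aspx
```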
Any help would be greatly appreciated!
Thanks!
the darkroom.
[edited by: tedster at 6:44 pm (utc) on Sep. 25, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]
Adding the robots.txt block on top of that might be a good idea - no reason to let Google spend part of its crawl budget looking for URLs you have chucked out the window.
If there is a page with a lot of links, I still like to redirect that incoming flow.
Absolutely - don't throw away the good landing pages, whether the traffic comes from search, direct links or type-ins. The BEST thing to do is either have content right there (no redirect at all) or redirect to a URL that has essentially the same content. This question, though, was about redirecting 11,000 URLs to one target URL.
I also agree with Marcia that 410 is the most technically correct HTTP status for a URL that used to exist but is now permanently gone. Right now, Google treats 404 and 410 in exactly the same fashion. But if you are up to it, 410 is still the clearest signal your server can give.
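Sending a 410 is also easy with mod_rewrite - a sketch, assuming Apache and the placeholder paths from the example URL (the [G] flag returns "410 Gone"):

```apache
RewriteEngine On
# Return 410 Gone for the retired dynamic section instead of 404,
# telling crawlers the removal is permanent
RewriteCond %{QUERY_STRING} ^section=
RewriteRule ^folder/folder/page\.aspx$ - [G]
```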
So there probably isn't much in the way of PageRank for those pages, and more than likely not much in the way of inbound linking, but it's a waste to keep getting hammered by bots looking for pages that 404.
>>So you suggest 301ing all those 11,000 404 pages to one of our main pages which holds numerous top 10 rankings shouldn't be a problem?
I personally would be very uncomfortable redirecting that many URLs to an important page with good rankings.
But as a "just in case" precaution, how about creating a brand new "user friendly" page to 301 redirect those URLs to - a page that can guide any visitors to the important pages on the site. Kind of a transitional mini sitemap page, to stop the 404 activity from going on and on.
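That transitional-page idea can be sketched in one rule - assuming Apache, and with /site-guide/ as a made-up name for the new mini sitemap page:

```apache
RewriteEngine On
# Funnel every retired dynamic URL to one transitional guide page;
# the trailing ? on the target drops the old query string
RewriteCond %{QUERY_STRING} ^section=
RewriteRule ^folder/folder/page\.aspx$ http://www.example.com/site-guide/? [R=301,L]
```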
Should we still make a new "user friendly" page and then 301 all those 11,000 URLs to it?
or
Can I just block that whole directory in robots.txt so the bots can't hit them again?
or
Or just give them a 410 code and do nothing with robots.txt?
That's the point - they're 404s. They're all returning "404 Not Found" - which implies they might be back. Really, it's not a sign of a quality site to have that many missing pages. That's why it's brought up in Webmaster Central; 404s don't do crawlers any good, they just waste resources - for the engines, and for webmasters through wasted bandwidth and bloated error logs. A custom 404 page is only for missing pages (404), so it has nothing to do with a 301.
>>should we still make a new "user friendly" page and then 301 all those 11,000 URLs to that page?
I'm "chicken," so that's what I'd do - or maybe a 303 ("page replaced"), or a 410 if it didn't matter.
A 301 is completely different from a 404: it means the page has moved, not that it's just missing. Actually, what's most accurate is a 303 (See Other), which means the page has been replaced by something else, but I haven't heard much mention of that outside the documentation.
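For completeness, a 303 can be sent the same way as the other redirects - a sketch, assuming Apache mod_rewrite, with /replacement-page/ as a made-up target:

```apache
RewriteEngine On
# 303 See Other - point the old URL at its replacement
RewriteRule ^folder/folder/page\.aspx$ http://www.example.com/replacement-page/? [R=303,L]
```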
This is an old thread, but still a good one, where all 3 (actually, 4) are referenced. Pay particular attention to jdMorgan's comment on 404's:
[webmasterworld.com...]
[BTW, jdMorgan is Apache web server deity, IMHO; to me, his opinions and posts are like webmaster scripture.]
[edited by: Marcia at 6:56 am (utc) on Sep. 26, 2008]