Sit it out or make changes
When Google's www/non-www problems were first discussed on WebmasterWorld, I found that one of my sites was affected, with a small number of pages indexed under non-www.
I immediately set up 301s from non-www to www, from index.html to the site root, and from requests for pages with dynamic parameters to non-dynamic URIs, just to be sure. I checked that the headers/response codes were being returned correctly, and figured that Google would work it all out eventually.
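For anyone following along, the three redirect rules described above can be sketched roughly as below. This is only an illustration, not the actual server config: example.com is a placeholder domain, the function name is made up, and "dynamic parameters to non-dynamic URIs" is assumed here to mean simply dropping the query string.

```python
from urllib.parse import urlsplit, urlunsplit

# Placeholder hosts, assumed for illustration only.
BARE_HOST = "example.com"
WWW_HOST = "www." + BARE_HOST


def canonical_redirect(url):
    """Return the URL that `url` should 301 to, or None if it is already canonical."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"

    # Rule 1: non-www -> www
    if host == BARE_HOST:
        host = WWW_HOST

    # Rule 2: .../index.html -> the directory root
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]

    # Rule 3 (assumption): drop the query string entirely
    query = ""

    target = urlunsplit((parts.scheme, host, path, query, ""))
    return None if target == url else target


if __name__ == "__main__":
    for u in ("http://example.com/page.html",
              "http://www.example.com/index.html",
              "http://www.example.com/page.html"):
        print(u, "->", canonical_redirect(u))
```

Folding all three rules into one target means a request is redirected once rather than hopping through a chain of 301s.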
Rather than getting better, things actually got worse, with a third of the site's pages listed under www, and two thirds under non-www. These page numbers have been stable for at least three months. The www pages rank, the non-www pages don't.
Interestingly, the cache dates for the www pages are current, but the non-www pages were last cached in February 2005, suggesting the 301s are at least having *some* effect.
So I guess my question is: should I leave things as they are, 301ing the non-www page requests to www, or would returning 404s for the non-www requests help Google get things right in the long term?
It seems that Google can take a very long time to drop the unwanted URLs from their index.
If they still rank, and get clicks, then those visitors are still being fed directly through to the correct content page of your site. That is a Good Thing.
If you "404" those URLs then you potentially lose that visitor, as they will have to click round your site navigation to try to find that content again.
Leave the 301 redirects in place, and check back at the end of the year to see how things are.
If any URLs are marked as "Supplemental Results" then they will probably hang around for 2 or 3 years. Make sure that anyone clicking one of those gets to the "real" page where the content now resides.
Things will fix themselves eventually.
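If you want to confirm that an old URL really does hand visitors on to the "real" page, one rough way is to request it without following redirects and look at the status code and Location header. A minimal sketch, assuming plain Python with only the standard library; the function name and the example URLs are made up for illustration:

```python
import http.client
from urllib.parse import urlsplit


def check_redirect(url, expected_location):
    """Send a HEAD request for `url` without following redirects.

    Returns (ok, status, location), where ok is True only if the server
    answered 301 with a Location header equal to `expected_location`.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        status = resp.status
        location = resp.getheader("Location")
    finally:
        conn.close()
    return (status == 301 and location == expected_location), status, location


if __name__ == "__main__":
    # Placeholder URLs; substitute your own non-www and www addresses.
    print(check_redirect("http://example.com/page.html",
                         "http://www.example.com/page.html"))
```

A 302 or a 200 here would mean the redirect is not doing what you think it is, whatever the browser appears to show.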
"If they still rank, and get clicks"
They don't, and that's the crux of the problem.
So while I wouldn't lose much in the way of traffic if I 404'd non-www requests, there'd be no point in doing it unless it hurried along their exit from G's index, and consequently allowed the www versions to be relisted.
"Things will fix themselves eventually"
I don't think Google has a clue what a 301 redirect is. Either they ignore them or they don't understand them.
No. The problem with 301 redirects isn't the redirect, or Google's interpretation of it.
The problem is the damned Supplemental Index where they dump old URLs and cause all sorts of problems.
G1 if you hear this,
I tried to sticky mail you, but your box is full.
I am looking for some PAID .htaccess help.
Please sticky me.
404's don't do anything. It sounds like you aren't linking to the 301ed URLs. If you don't link (basically forever) to the URLs you want redirected, they won't be. If the URLs are supplemental though, it won't matter.
set up a 301
have one or more decent links to the old URLs
wait for Google to extract head from its butt
Jetboy, while the 404 will be a quick(er) fix for removing the URLs from the index, you will lose all legacy link strength in doing so. I'd stick it out with the 301 if I were you. Just a warning, though: I 301'd quite a few large sites to different domains last Jan/Feb (2005) and I'm still seeing the original sites' URLs rank occasionally (not supplementals).
In the supplemental index, though, I'm still seeing URLs that are two 301s removed from the current URL, the original 301 having been done three years ago.
steveb, the www destination URIs are linked into a shallow and well linked site hierarchy, but few of the pages will have deep backlinks (from external sites). However, the 301ed non-www pages *are* supplementals.
jsavvy293, I hear what you're saying, and if it hadn't gone on so long I'd be more prepared to wait it out. My usual approach with Google is to make sure my house is in order and let them get on with it, but I think I'm about to draw a line in the sand.
Returning a 404 wouldn't necessarily mean just chucking out a 'this page can not be found' page. I could show the correct page for non-www requests but just return a 404 header. I could even do that just for requests from Googlebot ... it isn't cloaking per se. So there's no reason I need to screw over real visitors for Google's sake ...
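To make the idea concrete, here is a bare-bones sketch of "show the correct page but return a 404 header" for non-www requests, written as a tiny WSGI app. It is only an illustration under assumptions: example.com is a placeholder, the page body is a stand-in, and the Googlebot-only variant is deliberately left out. Nothing in this thread establishes how Google would actually treat such responses.

```python
# Placeholder host, assumed for illustration only.
BARE_HOST = "example.com"

# Stand-in for whatever the real page content would be.
PAGE_BODY = b"<html><body><h1>The real page content</h1></body></html>"


def app(environ, start_response):
    """Serve the same body to everyone, but with a 404 status on the bare host."""
    host = environ.get("HTTP_HOST", "").lower()
    if host == BARE_HOST:
        status = "404 Not Found"   # non-www request: real content, 404 header
    else:
        status = "200 OK"          # www request: served normally
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(PAGE_BODY))),
    ]
    start_response(status, headers)
    return [PAGE_BODY]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    # Local test server on an arbitrary port.
    make_server("127.0.0.1", 8000, app).serve_forever()
```

A human visitor sees the page either way, since browsers render the body of a 404 response; only the status line differs.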