I immediately set up 301s from non-www to www, index.html to site root, and requests for pages with dynamic parameters to non-dynamic URIs, just to be sure. I checked that the headers/response codes were being returned correctly, and figured that Google would work it all out eventually.
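For anyone following along, the three redirect rules above can be sketched roughly as below. This is only an illustration, not the actual server config: the host name `www.example.com` and the function name `canonical_url` are made up, and simply dropping the query string is an assumption standing in for whatever dynamic-to-static mapping the real site uses.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # hypothetical host name


def canonical_url(url):
    """Return the URL to 301 to, or None if the URL is already canonical."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"

    # Rule 1: non-www -> www (this sketch sends every other host to the canonical one).
    if host != CANONICAL_HOST:
        host = CANONICAL_HOST

    # Rule 2: .../index.html -> the directory root.
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]

    # Rule 3 (assumption): drop the query string. A real site would map
    # each dynamic URL to its own static URI instead.
    target = urlunsplit((parts.scheme, host, path, "", ""))
    return None if target == url else target
```

With these assumptions, `canonical_url("http://example.com/index.html?id=3")` gives `http://www.example.com/` in a single hop, which matters because each extra 301 in a chain is one more thing for a crawler to resolve.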
Rather than getting better, things actually got worse, with a third of the site's pages listed under www, and two thirds under non-www. These page numbers have been stable for at least three months. The www pages rank, the non-www pages don't.
Interestingly, the cache dates for the www pages are current, but the non-www pages were last cached in February 2005, suggesting the 301s are at least having *some* effect.
So I guess my question is: should I leave things as they are, 301ing the non-www page requests to www, or would returning 404s for the non-www requests help Google get things right in the long term?
If they still rank, and get clicks, then those visitors are still being fed directly through to the correct content page of your site. That is a Good Thing.
If you "404" those URLs then you potentially lose that visitor, as they will have to click round your site navigation to try to find that content again.
Leave the 301 redirects in place, and check back at the end of the year to see how things are.
If any URLs are marked as "Supplemental Results" then they will probably hang around for 2 or 3 years. Make sure that anyone clicking one of those gets to the "real" page where the content now resides.
Things will fix themselves eventually.
If they still rank, and get clicks
They don't, and that's the crux of the problem.
So while I wouldn't lose much in the way of traffic if I 404'd non-www requests, there'd be no point in doing it unless it hurried along their exit from G's index, and consequently allowed the www versions to be relisted.
set up a 301
have one or more decent links to the old URLs
wait for Google to extract head from its butt
Though in the supplemental index I'm still seeing URLs that are two 301s removed from the current URL, the original 301 having been set up three years ago.
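If old URLs end up two or more 301s away from the live page, one option is to point every old URL straight at its final destination. A rough sketch of that idea follows, assuming the redirects are kept in a simple old-to-new mapping; the function name `flatten_redirects` and the example paths are hypothetical.

```python
def flatten_redirects(redirects):
    """Map every old URL directly to its final destination.

    `redirects` is a dict of {old_url: new_url}; chains such as
    a -> b -> c are collapsed so that both a and b point at c.
    """
    flat = {}
    for start in redirects:
        seen = {start}
        target = redirects[start]
        # Follow the chain until we reach a URL that is not redirected.
        while target in redirects:
            if target in seen:
                raise ValueError("redirect loop at %s" % target)
            seen.add(target)
            target = redirects[target]
        flat[start] = target
    return flat


chain = {"/old-a": "/old-b", "/old-b": "/current"}
print(flatten_redirects(chain))
```

Whether a search engine drops the stale URLs any faster this way is not something the sketch can show; it only keeps each old URL one hop from the real page.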
jsavvy293, I hear what you're saying, and if it hadn't gone on so long I'd be more prepared to wait it out. My usual approach with Google is to make sure my house is in order and let them get on with it, but I think I'm about to draw a line in the sand.
Returning a 404 wouldn't necessarily mean just chucking out a 'this page can not be found' page. I could show the correct page for non-www requests but just return a 404 header. I could even do that just for requests from Googlebot ... it isn't cloaking per se. So there's no reason I need to screw over real visitors for Google's sake ...
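To make the "correct page, 404 header" idea concrete, here is a minimal WSGI sketch. It is an illustration only: `www.example.com`, `render_page` and `app` are made-up names, the page body is a placeholder, and it keys on the requested host alone, not on the user agent.

```python
CANONICAL_HOST = "www.example.com"  # hypothetical host name


def render_page(path):
    # Placeholder for whatever normally builds the page content.
    return ("<html><body>Content for %s</body></html>" % path).encode("utf-8")


def app(environ, start_response):
    """Serve the same body on every host, but answer 404 on non-www."""
    host = environ.get("HTTP_HOST", "").lower()
    body = render_page(environ.get("PATH_INFO", "/"))
    status = "200 OK" if host == CANONICAL_HOST else "404 Not Found"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

A visitor arriving on the non-www host still sees the content, while the status line tells any client that reads headers that the URL is gone. Varying the response by user agent would be a separate decision and is deliberately left out here.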