The original question in the original thread referred to underscore vs hyphen, and the replies gravitated towards renaming of pages.
Back in August/September I assumed the role of webmaster on this one site, and my first job was to clean up the pages. They had been abused by too many people, and I wanted the pages to at least start looking similar again.
In my 'enthusiasm' I ended up renaming every page except the index page. Blah!
Once the renamed pages have been spidered and appear in the SE index, won't the old underscored versions just fall out of the index?
I'm no expert, but I assume that given enough time the old pages will indeed fall out of the index. Meanwhile, how many potential customers have I lost? In my case I turned those old pages into doorway pages with their own unique keywords, and after a short delay I redirect the user to the correct page.
Closer to the point I wanted to make: earlier today I was checking my links, and a surprisingly large number of them still point to the old pages. This situation could exist for years, since I've given up on contacting the various sites or even directory sites like DMOZ to get my information updated. As far as I can tell no one has ever updated anything I asked for: descriptions, URLs, nada. Maybe my approach to that problem is wrong...
So the renaming problem is not so much with search engines as it is with existing links. At least that's my opinion.
Anyone else?
grandpa
I would consider making a custom 404. I do this on a large number of sites, and generally make the 404 a copy of the index page. That way anyone who happens to get into the site via a bad link has a chance to navigate to the information they wanted.
Ideally you would want to have the server issue a 301 redirect response for each renamed file, and send them to the new version. However, with a large number of files this can be a bit daunting.
I sometimes "cheat" and have the missing files redirect via 302 to a site map, and put a JavaScript body onload command on it which redirects JavaScript-enabled browsers (i.e. most humans) to the index page.
This saves me the trouble of writing individual 301s for every changed file, and once the new file name versions have been spidered I turn off the 302 and let the spiders read them as 404s.
This is not exactly "by the book" protocol procedure, so keep that in mind.
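For what it's worth, the individual 301s don't have to be typed out by hand. Below is a rough Python sketch that prints one Apache `Redirect 301` line (mod_alias) for every page whose name changed only from underscores to hyphens. The file names and the `http://www.example.com` base URL are made up for illustration, and it assumes the pages sit at the site root on an Apache server that reads .htaccess.

```python
def build_redirects(old_names, base_url):
    """Return one mod_alias 'Redirect 301' line per renamed page.

    Assumes the only change was underscore -> hyphen in the file name
    and that every page lives at the site root.
    """
    base = base_url.rstrip("/")
    lines = []
    for old in old_names:
        new = old.replace("_", "-")
        if new == old:
            continue  # name did not change, so no redirect is needed
        lines.append(f"Redirect 301 /{old} {base}/{new}")
    return lines


if __name__ == "__main__":
    # Hypothetical file names and domain, for illustration only.
    pages = ["Sample_Widgets.html", "About_Us.html", "index.html"]
    print("\n".join(build_redirects(pages, "http://www.example.com")))
```

The output can be pasted into .htaccess. If some pages were renamed in other ways, a hand-made list of old/new pairs would be needed instead of the simple replace.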
However, I have a different solution to your problem which will use considerably less of your webspace, and will cause much less confusion to your visitors (I always find it quite confusing when I follow a deep link to a page which isn't there and then get redirected to the index page).
Delete all the old pages, and set up a 404 error page with perhaps a short paragraph explaining (in simple terms) what's happened, and how they can find the page that they were sent to via the index page. Then, provide a clear link to the index page, and allow them to manually continue. This way, you save webspace by having only one page instead of many, and you reduce the amount of confusion that your visitors will encounter.
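A bare-bones sketch of that kind of 404, using only Python's standard `http.server`: any path that isn't a known page gets a real 404 status plus a short explanation and a link to the index. The page list, the wording, and the `serve` helper are placeholders for illustration, not anyone's actual setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical list of pages that still exist on the site.
KNOWN_PAGES = {"/", "/Sample-Widgets.html"}

NOT_FOUND_HTML = (
    "<html><body>"
    "<h1>Page not found</h1>"
    "<p>The page you asked for has been moved or removed. "
    "You can find it again from the index page.</p>"
    '<p><a href="/">Go to the index page</a></p>'
    "</body></html>"
)


def respond(path):
    """Return (status, html) for a requested path."""
    if path in KNOWN_PAGES:
        return 200, f"<html><body><p>Content of {path}</p></body></html>"
    # Unknown path: keep the 404 status so spiders still see it as missing.
    return 404, NOT_FOUND_HTML


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, html = respond(self.path)
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the demo server locally until interrupted."""
    HTTPServer(("localhost", port), Handler).serve_forever()
```

Calling `serve()` and then requesting an old URL such as `/Sample_Widgets.html` should show the explanation page. On a real Apache host the same effect normally comes from an `ErrorDocument 404` line rather than a script.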
P.S. If I've misunderstood your thread, then I apologise. It's very late. ^_^
[edited by: hartlandcat at 11:52 pm (utc) on Jan. 4, 2004]
My site is under 100 active pages, so space and bandwidth aren't a problem, at least today. But let's say, for the sake of discussion, I have a page named "Sample_Widgets" and I renamed it "Sample-Widgets", and the old page is still linked from several sites and a few directories.
My old page does have a brief explanation (brief because the redirect is typically 3 to 5 seconds) with a clickable link to the "Sample-Widgets" page as well. My thinking is that the customer really wanted samples, so I don't do a redirect to my index for that reason. Give them what they want.
I do have my 404s redirect to my index, but I'm thinking I want to change that, as suggested, to a custom 404 explaining that the request was invalid, with a link to the index and a link to email me. This is probably better than shooting them off right to the index. It's happened to me, and it's a little disconcerting.
Beyond all that, since I have the space and bandwidth, it just seems a good option to use those old pages to the most benefit. Some of them have decent PR and they all make great spider food. I'm sure if I had 10,000 pages and had to watch my bandwidth I would think differently.
FWIW, I set up my htaccess with 301s a few weeks ago, thinking it was time to clean house. No sooner had I finished that than I read something here at WW that made me change my mind about using that method. I don't remember the details now, something to the effect that too many 301s could be harmful to rankings?
If you have removed the content (rather than just moved it) then a custom 404 is a good idea because users may find their way to other areas of your site so you stand a better chance of keeping them.
Now if there are deep links to any of the old URLs from other sites I'm not sure whether Google or anyone else will "credit" the new page with the links when you do a 301. That I don't know but that is the question to ask IMHO. If this were the case, personally I wouldn't chance it until I knew for sure (or had control of the backlinks in question).
Hope this helps.
Haha. I tried that once and I immediately started getting email from AOL users saying things like:
"I tried to access the site using my 'AOL Favorites' and I got this page, please send me the information from my favorites."
I got about 20 within the first month. 301 or custom 404, but never, ever give them your email.