I have searched the forums, so please don't tell me to do that again. This stuff is everywhere in WebmasterWorld and I can't make much sense of it, at least not for my situation.
Just to let you know I do a lot of SEO but have never been in this kind of position before.
We started a whole new web design on the same domain (the last one was a mess with files everywhere). My problem is it is a PR8 website with thousands of inbound links to thousands of pages. I am taking the project over from someone that has just left the company.
They edited the .htaccess file so the 404 error page redirects to the homepage. Then they started deleting pages, directories, and everything else that was old. I almost lost my mind when I found out. This started over a month ago, so restoring from backups is not an option, at least not without losing everything we have changed since then.
I have changed the .htaccess file back and customized the 404 error page with a nice friendly message for our users and links to the home page and sitemap.
Back on the SEO side, I am dying with all of those inbound links going to a 404 error. I know we are losing rankings in Google, and PageRank will drop on the next update.
My question is... is there any way to avoid this and not upset the Google gods? From everything I have read, just redirecting every 404 to the home page is a Google no-no. But I want to preserve the PR passed on from inbound links.
Is there a way to do a permanent redirect to the homepage for pages that are not found without making the bots mad? That way we would save the PR AND make users happy.
What is the correct way to do this without knowing the old paths? My job and a few others depend on this. Same old story: I did not screw this up, but I need to fix it ASAP.
The correct solution is to 301-Moved Permanently redirect all removed pages to their logical replacement in the new site design.
For those with no direct replacement, you might logically be able to redirect them to 'category' pages or 'subject' pages -- I don't know your site, so pick whichever fits best.
For those with no replacement at all, then the 410-Gone/404 Not Found page with a link to the homepage and the site map is the way to go. (Use 410-Gone for HTTP/1.1, 404-Not Found for HTTP/1.0 requests.)
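To make that three-step order concrete, here is a rough Python sketch of the decision. It is only an illustration of the logic, not something Apache runs; the function name, the two lookup tables, and every path in them are made up for the example.

```python
# Hypothetical lookup tables: fill these in from your own old-to-new URL map.
REPLACEMENTS = {"/old/widgets/blue.html": "/products/blue-widget"}
CATEGORY_FALLBACKS = {"/old/widgets/": "/products/"}


def choose_response(old_path, replacements, category_fallbacks, http_version="1.1"):
    """Return (status_code, location) for a request to a removed URL."""
    # 1. A direct replacement exists in the new design: 301 to it.
    if old_path in replacements:
        return 301, replacements[old_path]
    # 2. No direct replacement: try the most specific old directory prefix
    #    that has a matching category or subject page.
    for prefix in sorted(category_fallbacks, key=len, reverse=True):
        if old_path.startswith(prefix):
            return 301, category_fallbacks[prefix]
    # 3. Nothing fits: 410 for HTTP/1.1 clients, 404 for HTTP/1.0 clients.
    return (410 if http_version == "1.1" else 404), None


if __name__ == "__main__":
    for path in ("/old/widgets/blue.html", "/old/widgets/red.html", "/old/about.html"):
        print(path, choose_response(path, REPLACEMENTS, CATEGORY_FALLBACKS))
```

The point of the ordering is that a URL only falls through to 410/404 after both the page-level and the category-level lookups have failed.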
Another alternative to 410/404 might be to pass the request to your site search script if you have one. This would make sense if your previous design had a lot of keyword-in-URL paths, for example.
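As a sketch of the site-search idea, something like the following could turn a keyword-in-URL path into a search query. The "/search" script and its "q" parameter are assumptions for the example, so substitute whatever your own search script expects.

```python
import re
from urllib.parse import urlencode


def search_url_for(old_path, search_script="/search"):
    """Build a site-search URL from the keywords in an old URL path.

    "/search" and the "q" parameter are placeholders, not a real script.
    Returns None when the path contains no usable words.
    """
    # Drop a trailing file extension such as ".html" or ".php".
    path_without_ext = re.sub(r"\.[A-Za-z0-9]+$", "", old_path)
    # Treat anything that is not a letter or digit as a word separator.
    words = [w.lower() for w in re.split(r"[^A-Za-z0-9]+", path_without_ext) if w]
    if not words:
        return None
    return search_script + "?" + urlencode({"q": " ".join(words)})


if __name__ == "__main__":
    print(search_url_for("/widgets/blue-widget.html"))
```

With the sample path above, the query words are "widgets", "blue" and "widget". Whether those are useful search terms depends entirely on how the old URLs were named.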
Remember that SEs index URLs, not pages and not files -- just URLs. If they find several URLs that all lead to the same content, then many of those URLs will go supplemental or be ignored. If the number is large, you may actually get a penalty, as the duplicates may be seen as doorway pages for the redirect destination page. I can't tell you what that 'large' number might be, though, because I have never tempted the gods myself. But having a large number of URLs redirected to your home page is not going to be all that beneficial, anyway.
You said it's too late to restore the deleted pages, but do sit down and map out how their old URLs might be pointed to sensible replacements in the new design. Yes, you may end up with a ton of redirect directives, but the need for them will fade over time as the new pages garner their own inbound links, and you feel you can 'sacrifice' a few more of the old URLs every month by removing the redirects.
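Once you have the old-to-new map written down, you don't have to type every directive by hand. Here is a minimal sketch that prints mod_alias Redirect lines from a Python dict. The host name and all the paths are placeholders, and it assumes your paths contain no spaces (those would need quoting).

```python
def redirect_directives(url_map, site="http://www.example.com"):
    """Turn {old_path: new_path} into 'Redirect 301' lines for .htaccess.

    The site name is a placeholder. Redirect matches by URL-path prefix and
    the first matching directive wins, so longer (more specific) old paths
    are emitted first.
    """
    lines = []
    for old_path, new_path in sorted(url_map.items(), key=lambda kv: (-len(kv[0]), kv[0])):
        if old_path == new_path:
            continue  # a redirect to the same URL would loop forever
        lines.append(f"Redirect 301 {old_path} {site}{new_path}")
    return lines


if __name__ == "__main__":
    # Hypothetical mapping for illustration only.
    sample_map = {
        "/old/widgets/blue.html": "/products/blue-widget",
        "/old/widgets/": "/products/",
    }
    print("\n".join(redirect_directives(sample_map)))
```

Keeping the map in one place also makes it easy to drop entries later when you decide an old URL can be retired.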
Just make sure the replacement page will satisfy a large majority of visitors and not leave them thinking, "What the heck am I doing on this page? -- It's got nothing to do with what I clicked!" Even a 404 may be better than that.
For those who are planning such a project (and I realize that you're stuck with things the way they are now, uptil7000), the correct answer [w3.org] was posted in 1998 by Tim Berners-Lee, who invented (or simultaneously co-invented, along with R. Cailliau) this whole World-Wide-Web thing: Never, ever change your URLs unless you are prepared to pay the consequences of losing your inbounds, bookmarks, and search rankings -- at least for several months.
Mod_alias and mod_rewrite give you the capability to change your directory structure, filenames, and even your site technology (e.g. static html -> dynamic php) in any way you please without ever changing your URLs. Allow several weeks or months to brainstorm your new site structure so that it can accommodate growth in site size and traffic, a shift in targeting, division of maintenance responsibilities/privileges, caching, robots exclusion, and changes in site technology without requiring you to re-architect your file structure or URL structure.
And while you're at it, make sure that no new site is Web-accessible in any way until all domain and page canonicalization issues are handled (one page, one -and only one- URL), the robots.txt file and cache-control headers are fully in place, and private directories are set up with password protection. Until then, pull 'the internet cord' or password-protect everything on the new server, and keep the old site running until you throw the switch.
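To illustrate what "one page, one URL" means in practice, here is a small Python sketch that reduces URL variants to a single canonical form. The canonical host and the index file names are assumptions for the example, and it expects absolute URLs. On a live server the same rules would normally be enforced with 301 redirects; a function like this is only handy for checking a list of URLs.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, canonical_host="www.example.com",
                  index_names=("index.html", "index.php")):
    """Map variants of an absolute URL onto one canonical form.

    The host and index file names are placeholders; adjust for your site.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    # "/dir/index.html" and "/dir/" are the same page; keep only "/dir/".
    for name in index_names:
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    # Force the single canonical host name and drop any #fragment.
    return urlunsplit((parts.scheme or "http", canonical_host, path, parts.query, ""))


if __name__ == "__main__":
    for variant in ("http://example.com", "http://example.com/index.html",
                    "http://www.example.com/"):
        print(variant, "->", canonical_url(variant))
```

All three sample variants collapse to the same home page URL, which is the single URL you want the search engines to see.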
Hope that helps -- Feel free to post over in a more search-focused forum for more and better advice... :)