|How to redesign 450 page site and not lose ranking|
| 12:35 pm on Aug 12, 2007 (gmt 0)|
Hi Newbie here:
I have been asked to redesign a 450 page website packed with content - some good but most of it packed there by the original webmaster with hopes that it would generate traffic. The website owners have been happy with their traffic and people finding them thru Search Engines, but they are NOT happy with their crazy - HUGE - disorganized website.
I am trying desperately to weed thru the site's site map in order to figure out what to "throw away", but I am terrified that leaving pages out of the new site will damage the website's traffic and search engine ranking.
How can I redesign an ENORMOUS 450 page website without losing traffic and search engine ranking?
Keeping all 450 pages would be nuts, right?
Thanks for any light you can shed on this problem :-(
| 2:04 pm on Aug 12, 2007 (gmt 0)|
Two main points:
For pages which offer substantially the same content, keep the URLs unchanged -- at least for nine months to a year. Note that a URL is not a filename; you can use mod_rewrite on Apache or ISAPI Rewrite on IIS to map old URLs to new files (if you are on IIS, make sure that ISAPI Rewrite is available on your server before committing to this approach).
Don't change everything all at once. Link to new pages from old, then let the new pages 'age' for a while before taking down the old pages and 301-redirecting their URLs to appropriate replacement-page URLs -- again, the time scale is many months here.
It is tempting to introduce a redesigned site (of which you are proud) as an "opus" -- a new masterwork to be appreciated all at once as a whole. But that is something that search engines don't like at all.
| 2:10 pm on Aug 12, 2007 (gmt 0)|
Thanks for your reply..
I've been weeding thru the site's traffic reports and see that there are very consistent Entry Pages each month. So this is telling me to keep the Top Entry Pages on the new redesigned site - keeping both the file name (page.html) and the URL (http://www.site/directoryname/page.html) exactly the same. Does it sound like I'm on the right path here?
Thanks again for helping out the Newbie :-)
| 3:03 pm on Aug 12, 2007 (gmt 0)|
File names are irrelevant, except inside the server. As noted above, there are mechanisms available to map any URL to any filename on Apache and IIS. The relationship between a URL (a Web link) and a file (or a script) inside the server is not a direct relationship; URLs can be mapped to filenames any way you like.
This makes it possible, for example, to replace an old static-html site with a new PHP-and-database-driven site without changing the old URLs [w3.org] in any way. Unfortunately, most Webmasters are unaware of this, and many of them manage to destroy their search ranking for a year or more when undertaking such a site redesign project.
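A minimal sketch of that idea, assuming Apache with mod_rewrite already enabled and the rule placed in an .htaccess file in the site root. The URL path old-catalog/ and the script /catalog.php with its item parameter are hypothetical names, made up purely for illustration:

```apache
# Old static URL (unchanged, still what visitors and search engines request):
#   http://example.com/old-catalog/blue-widget.html
# New server-side handler (hypothetical script and parameter name):
#   /catalog.php?item=blue-widget
# Internal rewrite: no redirect is sent, so the URL seen from outside stays the same.
RewriteRule ^old-catalog/([^/.]+)\.html$ /catalog.php?item=$1 [L]
```

The old URLs keep working exactly as before; only the thing that answers them has changed.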
| 3:36 pm on Aug 12, 2007 (gmt 0)|
Sorry to sound so "newbie" Jim - especially since you've been so kind...
But the biggest hurdle I'm trying to figure out is "removing" a large number of old, outdated html pages from the site without jeopardizing the site's ranking.
Should I keep the old pages as they are on the server? Should I remove them from the server?
See, the newly redesigned HTML site will not include a ton of the older outdated html pages because we just have to have a manageable, smaller site - not 450 pages of old "crap".
I guess I need to trouble you to walk me thru mod_rewrite on an Apache server, because I'm not understanding how to do that or how it really works in my situation, where I'd have hundreds of old pages that I don't want to include on the newly designed site anymore - but am fearing the loss of search engine ranking.
Do I need root access to use mod_rewrite?
Thanks for your patience.
| 4:26 pm on Aug 12, 2007 (gmt 0)|
Tough question to answer -- I don't know the site.
Only you can decide whether these old pages have any value. Try considering them as 'archival material' -- Are they of any use as such? If so, you can move them to an 'archive' directory, and add a page header to them that states that they're somewhat/mostly/completely outdated (as applicable). Sitting in that archive directory, they'll be out of the way, and need little or no maintenance. As previously noted, putting them in a different filepath need have no effect on their URLs.
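As a rough sketch of the archive idea, assuming Apache with mod_rewrite enabled in a root-level .htaccess file, and assuming the old top-level .html files have been moved into a directory named /archive (the directory name is only an example):

```apache
# The requested URL stays /some-old-page.html; the file now lives in /archive/.
# Rewrite only when a file of that name actually exists in the archive
# directory, so pages that were never archived are left alone.
RewriteCond %{DOCUMENT_ROOT}/archive/$1 -f
RewriteRule ^([^/]+\.html)$ /archive/$1 [L]
```

On some shared hosts %{DOCUMENT_ROOT} does not match the real filesystem path, so treat this as a starting point and test it on a few URLs first.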
The paper I cited, by the man credited with inventing the WWW, reflects the 'academic mindset' of the Web. Although the Web has since been more-or-less taken over by commercial interests, it's important to realize that this academic mindset persists, especially at search engine companies; they view the Web as a library of information, and not as a temporary roadside billboard sign or a street-corner magazine/newspaper kiosk. For this reason, they noticeably favor persistent content, and the kinds of sites that host persistent content.
To illustrate, a librarian does not go through the library and toss out books just because they are old -- Imagine if we'd tossed all copies of Shakespeare for that reason alone...
Divide the pages of this site into classes according to what makes sense:
Pages to keep
Pages to archive
Pages to replace
Pages to remove
If a page is to be removed (and I suggest that any page that might be of any historical or research value to anyone be retained) then install a 301-Moved Permanently redirect to one or more of:
A strongly-similar page
A category-listing page providing links to similar (newer) pages on the subject
Your site map page
A site-search function
Your home page (only as a very last resort, and only for a very few pages -- See "duplicate content")
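For illustration only, here is what a couple of such 301 redirects might look like using mod_rewrite in a root-level .htaccess file. All of the page and directory names below are invented; substitute your own:

```apache
# Removed page -> a strongly-similar replacement page
RewriteRule ^old-widgets\.html$ http://example.com/widgets/overview.html [R=301,L]

# A whole group of removed pages -> a category-listing page on the subject
RewriteRule ^news-2002/ http://example.com/news/ [R=301,L]
```

Unlike an internal rewrite, the R=301 flag tells the browser or search engine robot that the URL itself has changed. It is generally safest to place external redirects like these ahead of any internal rewrites in the same file.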
Your old pages may offer valuable information for people trying to discover what your industry was like five years ago. They may offer you the benefit of PageRank and Link-Pop they've accrued over the years. They may serve as a traffic draw and/or as link-bait because of their content.
On the other hand, they may indeed be totally useless, but only you can decide that. However, the last criterion I would consider is the "convenience" of their maintenance.
Again, the above is generalized -- and perhaps to the point of irrelevance; I don't know anything about the site.
| 6:17 pm on Aug 12, 2007 (gmt 0)|
Everything you said makes perfect sense Jim and thank you...the only thing I honestly don't understand is when you say "putting them in a different filepath need have no effect on their URLs"...
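If I have a well ranked html page: http://example.com/Directory/wellranked.html
and I move wellranked.html to: http://example.com/Totally_Different_Directory/wellranked.html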
That is a totally different URL - isn't it?
Sorry, I just don't get that one...thanks again for all of your great insight.
[edited by: encyclo at 11:05 pm (utc) on Aug. 12, 2007]
[edit reason] switched to example.com [/edit]
| 7:12 pm on Aug 12, 2007 (gmt 0)|
Yes, but only because you changed the URL. Please don't confuse URLs with filenames, and please don't change URLs unnecessarily.
URLs are used to locate 'resources' --pages, images, multimedia, etc.-- on the Web. They are meaningless inside a server. Filenames are used to locate files, either data files or executable (e.g. script) files inside a server, and are meaningless on the Web. Simply put, the fundamental job of a server is to accept a URL request and translate that URL to a filesystem path.
This seems to be a difficult concept to convey, but let's take a simple, common example:
Let's say your homepage URL is http://example.com/
There is no such location in your server, though, since no disk drive or filename appears in that URL.
So, when a request for this URL arrives at your server, the server removes the now-unneeded "http://example.com" part and prepends to the remaining "/" the partial filepath specified by the server's DocumentRoot configuration directive: "C:/Program Files/Apache/httpd/dev-sites/my-site" (on a server running on a Windows PC, for example, just to keep on familiar ground here).
So far, we now have "C:/Program Files/Apache/httpd/dev-sites/my-site/" as the partially-translated filepath.
However, we're still missing any filename, because "/" isn't a filename.
So, the server uses the value defined by the DirectoryIndex configuration directive, and finds that your default index file is called "index.html". So it adds that to complete the filepath.
The completely-resolved filepath is now "C:/Program Files/Apache/httpd/dev-sites/my-site/index.html".
So, the URL is http://example.com/
which resolves to the server filepath C:/Program Files/Apache/httpd/dev-sites/my-site/index.html
The one and only surviving token from the URL that appears in the filepath is a slash...
So again, URLs and filenames are not the same thing, and need not have any fixed relationship with each other.
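To make the homepage example concrete, the two server configuration directives involved might look roughly like this in httpd.conf. The path is just the example one used in this post, not a recommendation:

```apache
# Where the server starts looking for files for this site
DocumentRoot "C:/Program Files/Apache/httpd/dev-sites/my-site"

# Which file to serve when the requested URL-path ends in a slash
DirectoryIndex index.html

# Request:  http://example.com/
# Served:   C:/Program Files/Apache/httpd/dev-sites/my-site/index.html
```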
It's important to grasp this concept because the successful use of mod_rewrite or ISAPI Rewrite depends on it. And far from being a pedantic distinction, it's important to business as well, as you will discover if you change all your URLs and tank your site's rankings...
So, to re-cast your question in orthodox terminology:
|If I have a well ranked html page (file): /Directory/wellranked.html|
and I replace that file with: /Totally_Different_Directory/wellranked.html
How do I tell the server about the new file location, while retaining the same URL?
The answer is to use mod_rewrite to do an internal rewrite:
RewriteRule ^Directory/wellranked\.html$ /Totally_Different_Directory/wellranked.html [L]
I haven't shown the one or two 'overhead' directives required to enable mod_rewrite here, but once mod_rewrite is enabled, that one RewriteRule directive is all that's needed to tell the server where to find the new file associated with the requested URL. And if URLs share common features, they can be rewritten as classes or groups; one rewrite can handle several (or even all) requested URLs. For example, if all HTML files in /Directory are moved to /Totally_Different_Directory, then you could use the single mod_rewrite directive:
RewriteRule ^Directory/([^.]+)\.html$ /Totally_Different_Directory/$1.html [L]
to rewrite all of them.
When executed, the $1 token in the new substitution path (on the right) will take the value of the part of the requested URL-path that matches the first parenthesized subpattern in the RewriteRule regular-expression pattern (on the left).
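Put together, a complete root-level .htaccess file for this case might look something like the sketch below. It assumes your host allows mod_rewrite and the Options directive in .htaccess; if it doesn't, the Options line can produce a server error and you'd need to ask the host:

```apache
# 'Overhead' directives: mod_rewrite in .htaccess needs symlink-following
# enabled, and the rewrite engine must be switched on.
Options +FollowSymLinks
RewriteEngine On

# Request /Directory/wellranked.html  ->  $1 = "wellranked"
# File served: /Totally_Different_Directory/wellranked.html
RewriteRule ^Directory/([^.]+)\.html$ /Totally_Different_Directory/$1.html [L]
```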
In addition, you can define a few exclusions, if needed, by using mod_rewrite's conditional-rewriting directive, RewriteCond.
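As a hedged example of such an exclusion, suppose one page, here given the made-up name contact.html, stays behind in /Directory while everything else moves:

```apache
# Do not rewrite requests for /Directory/contact.html (hypothetical page name)
RewriteCond %{REQUEST_URI} !^/Directory/contact\.html$
# All other .html requests under /Directory/ are served from the new location
RewriteRule ^Directory/([^.]+)\.html$ /Totally_Different_Directory/$1.html [L]
```

A RewriteCond applies only to the single RewriteRule that immediately follows it, so each rule needing an exclusion gets its own condition(s).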
| 9:50 pm on Aug 12, 2007 (gmt 0)|
Thank you, thank you, thank you Jim...I'll post my success story soon!
| 8:51 am on Aug 14, 2007 (gmt 0)|
I would just like to say, this is a very well posted thread. I was just hired by a website with a similar situation.
Even though I have experience with and know how to deal with the above situation, I wish I could afford jdMorgan.
Anyway, what I mean to say is very good suggestions. Right on target buddy...