Thanks.
[edited by: tedster at 2:40 pm (utc) on April 20, 2007]
[edit reason] switch to example.com - it will never be owned [/edit]
You're right to have this issue in mind - and during the transition from your old URLs being indexed to the new ones being indexed, both will be there for a period of time. It's unavoidable.
There's a good thread from back in January called URL structure redesign [webmasterworld.com] where jdMorgan gave this excellent approach:
Doing things in the right order and at the right time is everything:
1. Add code to rewrite (not redirect) the new friendly URLs to the old unfriendly ones needed by your script(s).
2. Change the links on your pages to use those new friendly URLs.
3. Get your responsive linking partners to link to the new friendly URLs.
4. Let this sit awhile, until you see the new URLs appear consistently in the SERPs for important pages.
5. Add code to 301 (permanently) redirect the unfriendly URLs to the friendly ones, to handle non-updated inbound links.
Don't take exceptional measures to do this fast or all at once, or you can "pull the rug out from under your site" in search. Proceed slowly and very deliberately with regard to your top-ranking pages and main landing pages.
Someone here (I wish I could remember who, so as to give credit) has argued that starting with updating the links on your lowest-level, least-important pages (at step 2 in the list above) is a good plan, and I tend to agree -- build new internal supports for your top pages before removing the old supports. On a per-page basis, consider this a balancing act between maintaining the PageRank/link-pop support for a page, and avoiding long-term duplicate (old & new) URLs for the same page. This should work well for sites with a small number of well-ranked landing pages, and lots of supporting pages below -- for example, an e-commerce site with a few "main" pages and categories, and lots of product pages below that.
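To make the ordering concrete, here's a rough sketch of what steps 1 and 5 might look like in an .htaccess file. This is illustration only: the /product/ pattern and the show.php script with its id parameter are hypothetical stand-ins for your own URLs and script. The step 5 block is only added later, once the new URLs are showing consistently:

RewriteEngine On

# Step 1: internally rewrite the new friendly URLs to the existing script
RewriteRule ^product/([0-9]+)$ show.php?id=$1 [L]

# Step 5 (added later): 301 the old unfriendly URLs to the friendly ones,
# but only when the request actually arrived at the old URL
RewriteCond %{THE_REQUEST} \?id=([0-9]+)
RewriteRule ^show\.php$ /product/%1? [R=301,L]

The condition on THE_REQUEST is what keeps the rewrite/redirect pair from looping - there's more on that further down the thread.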
For technical issues with the URL rewriting itself, that's an Apache Forum matter. Here's an excellent primer on the topic: Changing Dynamic URLs to Static URLs [webmasterworld.com]
The redirect generates a 301 code back to the browser. You see the URL change in the address bar of the browser.
The rewrite internally rewrites the URL to fetch the correct content, but doesn't show that rewritten URL. You see the original URL you requested.
If you request example.com/directory/file then the server actually pulls the data from example.com/index.php?page=dir/file but doesn't show you that internally rewritten URL.
If you request example.com/index.php?page=dir/file then the server issues a 301 redirect to example.com/directory/file and you see the URL in the browser change to be that one. The server then uses the internal rewrite (as above) to get you the content.
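For illustration, and assuming an .htaccess file at the site root, the internal rewrite described above might look something like this (using only the example URLs from this post):

RewriteEngine On

# Internal rewrite: fetch the content via the script,
# while the browser keeps showing /directory/file
RewriteRule ^directory/file$ index.php?page=dir/file [L]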
Because one of them is a rewrite there is no possibility of there being a loop. If both were redirects then it would always loop forever.
The big problem is that when Google picked up the mods, traffic dropped off by 70%. This was 2 months ago and it still hasn't recovered. We did not realise that when they first set up the mods, no 301s were put in place. These were set up about two weeks ago, the problem being that for 5-6 weeks both old and new URLs were available.
I don't think duplicate content is a problem, but we are concerned as to why Google isn't happy with the new URLs. Sounds like we pretty much did what you recommend above, but maybe a bit late with the 301s?
Yahoo and MSN picked up the mods quickly.
We also set up an XML sitemap and submitted it, but that doesn't seem to have made any difference.
Any help appreciated.
<Sorry, no specifics.
See Forum Charter [webmasterworld.com]>
[edited by: tedster at 4:14 pm (utc) on April 23, 2007]
There is practically a guarantee of a loop, even in the case of a rewrite and a redirect, unless the code that implements the external redirect examines the server variable THE_REQUEST before redirecting. See the threads cited above for details of the correct implementation to avoid this problem.
The same problem can occur when a redirect and a DirectoryIndex directive conflict, and the solution is identical: Do not redirect unless the 'incorrect' URL was received from the client (browser), rather than generated as the result of a server directive (internal rewrite or DirectoryIndex).
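As a sketch only, again using the example URLs from earlier in the thread, the redirect half of the pair would test THE_REQUEST so that it only fires when the client itself asked for the unfriendly URL:

# 301 only when the ugly URL appears in the client's own request line;
# a URL produced by the internal rewrite never changes THE_REQUEST,
# so this condition fails for it and the loop is avoided
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php\?page=dir/file\ HTTP
RewriteRule ^index\.php$ http://example.com/directory/file? [R=301,L]

The trailing question mark on the substitution simply strips the old query string from the redirect target.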
> Sounds like we pretty much did what you recommend above but maybe a bit late with the 301s?
If the roof of my house is on fire, I would prefer that the firemen extinguish it before it spreads to the whole house... even if they arrive late on the scene. No matter how you do it, changing URLs is likely to result in a temporary impact on the ranking of pages on your site. But you must balance the long-term gain against this short-term pain. And as to being late with the redirects, the old phrase, "Better late than never" applies.
Jim
Don't panic when that happens. It is normal. They may continue to show like that for a year.
Your measure of success is in seeing that the new URLs are not Supplemental and that as many as possible are indexed.
1.) Add the dynamic URL to my robots exclusion list. This stops the URL from being followed or indexed.
User-agent: *
Disallow: /cgi-bin/mydir/mypage.cgi
Or, if you rewrite all URLs, exclude all dynamic URLs:
User-agent: *
Disallow: /cgi-bin/
2.) I add a variable (isstatic) to the rewrite, then do a check in the code to see if the variable is 'yes'. If it is yes, then I know the URL was invoked from .htaccess.
Rewrite the dynamic url in .htaccess:
RewriteEngine on
RewriteBase /cgi-bin/mydir/
RewriteRule ^(.+)\.html$ mypage.cgi?action=myaction&id=$1&isstatic=yes
Static URL:
www.example.com/content/13.html
3.) I use Perl/CGI, so in my .cgi file I run a check at the top of the file or sub-routine looking for the "isstatic" variable. If I don't find it, I include the noindex meta tag.
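# Note: %query is assumed to already contain the parsed CGI parameters
# (populated by a param-parsing routine earlier in the script).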
if($query{'isstatic'} eq "yes"){
#If invoked from .htaccess / static url.
$noindex = "";
}
else {
#If invoked directly via mypage.cgi script.
$noindex = qq~
<META NAME="ROBOTS" CONTENT="NOINDEX">
~;
}
print <<META;
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
$noindex
<meta http-equiv="Content-Language" content="en-us">
<title>#*$!</title>
<META NAME="description" CONTENT="#*$!">
<META NAME="keywords" CONTENT="#*$!">
META
...rest of the cgi code
Hope this helps.
Neal
Justin
This is one of the few areas where jdMorgan and I do not totally agree, but I assume he has tested what he is suggesting, as I have tested what I am suggesting, so it may be more a matter of preference than correct v. incorrect. When properly redirected, URLs should only drop in rankings for a very short period of time.
Right now, it does so. The search engines display the .htm URLs in most cases on a keyword search, and when you click on the link it automatically transfers to the .asp pages. The URL still remains the same in the browser after the new page comes up.
When I do a header check, the results say the page is sending a 200 code.
My questions are the following:
Does it still need to send a 301 code, and if so, how do we change it? Is Google penalizing us for setting up the ISAPI rewrite the way it currently is? Must it be changed to produce a 301 redirect? Our host set up the ISAPI for us and we're not sure how they configured it.
As far as functionality goes, it is performing the way it needs to, but I am concerned with how Google is viewing it and whether it is acceptable to them. The ASP pages are indexed. Our search positioning had dropped before we made the switch, but it has not improved since we made the switch about 4 weeks ago.
Is ISAPI a good way to set this up, and within ISAPI, what is the best configuration from Google's point of view?
Any help clarifying this would be appreciated.
Thanks.
Steve
[edited by: tedster at 1:10 am (utc) on May 3, 2007]
[edit reason] change to example.com - it will never be owned [/edit]