301 Issues after mod_rewrite

trying to simplify a large number of redirects

         

uprightcomm

3:25 pm on Dec 14, 2005 (gmt 0)




I advised my client to use mod_rewrite to "fix" their dynamic URLs. As a result, thousands of indexed URLs now need to be redirected to the new URLs. Given the number of URLs, this is going to be a big project, but we're trying to simplify it, and I have a few questions. Any help is greatly appreciated.

The client provided a list of all the old URLs, along with the new URL that each should redirect to, but we then ran into the following problems:

1. That original list of old URLs included ONLY category and subcategory links on the current example.com site, and only in their original form.

For example, for PRODUCT A subcategory, they provided a redirect link to the new site for this URL: http://www.example.com/ab/subcat.jsp?category_key=-123

However, they did not provide a redirect for the link http://www.example.com/ab/subcat.jsp?cursor=8&category_key=-123, which has the "cursor" parameter embedded in the URL. The client asked, "How would it even be possible to identify all of the different combinations of parameters embedded in these URLs? (These parameters tell the server how to display the page, such as breadcrumbs, etc.)"

My answer was to use the RedirectMatch directive:
RedirectMatch (.*)\?category_key=-123$ http://www.example.com$1category_key=-123

Am I correct?

2. The URL redirect list they provided also did not define product URL redirects. The client has thousands of product URLs indexed in Google and Yahoo, so doing this manually would be impossible. These product URLs may also contain different parameters. Here is an example of the same product page under two different URL formats:

http://www.example.com/ab/abc.jsp?prod_key=12345&category_key=%2FProductHierarchy%2FElectronics%2FAudio+%26+Video
http://www.example.com/ab/abc.jsp?prod_oid=7654321&showarrow=y&category_key=-13578&cursor=2&k=WORLDLX5

My suggestion would be to redirect both old URLs to the same new URL, which would solve the problem of the duplicate content on separate URLs. Is this correct?

Regarding the duplicate content issue, the client has different versions of every page, depending on where the link is located on the site or what they want to display. The URL parameters and the breadcrumb display are essentially the only differences between one version and another; all the other content on the page is identical. However, it appears that they have not had problems with the current site, because it is indexed so well on both Google and Yahoo. I've told the client that we suggested mod_rewrite because it is a sort of SEO "best practice". The site has good content and very strong link popularity, which may be carrying it to the top of the results despite the duplicate content across different URLs and the dynamic URLs with multiple parameters that the search engines don't typically "like". In my professional opinion, it is best to implement the mod_rewrite and 301 all old URLs to the new "clean" ones. Any comments?

3. There are several categories or directories within this site. All URLs from within each old category should redirect to the new category. Is the RedirectMatch directive the solution for this issue if I want to redirect http://www.example.com/productaold/abc.jsp?prod_key=12345&category_key=12345
to
http://www.example.com/productanew/newpage.htm
I'm trying to find a shortcut that will allow them to redirect all of the URLs within productaold/ to productanew/. I understand that when you do a normal 301 from an old directory to a new one, it assumes that if you want to go to http://www.example.com/productaold/page.htm you would go to http://www.example.com/productanew/page.htm

But, if page.htm does not exist in the new directory, where does it go if you use the 301? Would the RedirectMatch be the solution?

Thank you again for any help provided.

[edited by: jdMorgan at 4:12 pm (utc) on Dec. 14, 2005]
[edit reason] Example.com [/edit]

jdMorgan

4:20 pm on Dec 14, 2005 (gmt 0)




uprightcomm,

Welcome to WebmasterWorld!

RedirectMatch isn't going to work if there are query strings in the URLs that you want to redirect. The Redirect family of mod_alias directives matches only against the URL-path, and a query string is not considered part of it; rather, it is data appended to the URL to be passed to the resource that the URL specifies. For this reason, query strings are not visible to the Redirect directives, and they are handled separately in mod_rewrite as well: a RewriteRule pattern never sees them, so they have to be tested with RewriteCond.
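To make the difference concrete, here is a minimal, untested sketch of how the category redirect from the question might look in mod_rewrite instead of RedirectMatch. The target path /new-category/ is a made-up placeholder, not one of the client's real URLs, and the `^/?` prefix is only there so the pattern fits both server-config and .htaccess contexts.

```apache
RewriteEngine On

# Test the query string separately: match category_key=-123 as a whole
# parameter, wherever it appears among the other parameters.
RewriteCond %{QUERY_STRING} (^|&)category_key=-123(&|$)

# The RewriteRule pattern sees only the URL-path.
# /new-category/ is a hypothetical target; the trailing "?" drops the
# old query string from the redirect.
RewriteRule ^/?ab/subcat\.jsp$ http://www.example.com/new-category/? [R=301,L]
```

Because the condition does not care what other parameters are present, the same rule would cover both the plain URL and the variant carrying "cursor=8".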

I suggest the following high-level plan:

Use mod_rewrite, with RewriteCond checking the %{QUERY_STRING} variable to:

1) Detect and remove "junk" parameters such as "cursor=8"
2) Standardize the order of the remaining (necessary) parameters
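As a rough illustration of these two steps for the simplest case, the sketch below assumes that category_key is the only parameter the subcategory page really needs and that "cursor" is junk. Those are assumptions drawn from the example URLs in the question, and the rule is untested. It rebuilds the query string from the one needed parameter, which removes the junk and fixes the parameter order in one step.

```apache
RewriteEngine On

# Act only when the junk "cursor" parameter is present, so the
# cleaned-up URL does not match again and cause a redirect loop.
RewriteCond %{QUERY_STRING} (^|&)cursor=
# Capture the value of the parameter we want to keep. %2 refers to the
# second group of this (the last matched) RewriteCond.
RewriteCond %{QUERY_STRING} (^|&)category_key=([^&]+)
# Writing a new query string in the substitution replaces the old one.
RewriteRule ^/?ab/subcat\.jsp$ /ab/subcat.jsp?category_key=%2 [R=301,L]
```

Under these assumptions, a request for /ab/subcat.jsp?cursor=8&category_key=-123 would be redirected to /ab/subcat.jsp?category_key=-123. Pages that need more than one parameter take more work, because only the last matched RewriteCond supplies the %N back-references.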

You can also use "RewriteCond %{REQUEST_FILENAME} -f" to check that a page exists before you redirect to it. Do this only when absolutely required, because it involves a filesystem check and can hurt server performance. If the check fails, you can then rewrite or redirect to a product category or some kind of 'backup' page.
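One possible shape for such a check, using the productaold/ and productanew/ directories from the question, is sketched below. It is a variation rather than the exact condition quoted above: to test the redirect target instead of the requested file, it builds the path from %{DOCUMENT_ROOT}, which assumes that productanew/ sits directly under the document root. It has not been tested against a live server.

```apache
RewriteEngine On

# If a file of the same name exists under the new directory, redirect
# to it. $1 is the part captured by the RewriteRule pattern below.
RewriteCond %{DOCUMENT_ROOT}/productanew/$1 -f
RewriteRule ^/?productaold/(.+)$ http://www.example.com/productanew/$1 [R=301,L]

# Otherwise fall back to the new category's index page (an assumed
# 'backup' page). The trailing "?" drops the old query string.
RewriteRule ^/?productaold/ http://www.example.com/productanew/? [R=301,L]
```

The second rule is reached only when the first did not redirect, so every URL under productaold/ ends up somewhere under productanew/.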

For more information, see the documents cited in our forum charter and the tutorials in the Apache forum section of the WebmasterWorld library.

Dynamic sites are almost always nasty to clean up, and you should be careful not to under-bid this work.

Jim