Removing Query String from Destination URL

   
9:34 pm on Sep 17, 2009 (gmt 0)

5+ Year Member



This website I am working on has dynamic URLs rewritten into static ones.

Now, I need to place some 301 redirects between those static pages.

Such as from

http://www.example.com/old-file-name-c-63_75.html

to

http://www.example.com/new-filename-c-63.html

I tried using the following code for the redirect

Redirect 301 /old-file-name-c-63_75.html http://www.example.com/new-filename-c-63.html

But the destination URL now becomes

http://www.example.com/new-filename-c-63.html?cPath=63_75

I knew from reading this forum earlier that to remove the query string appended to the destination URL, we need to attach a ? to the end of the destination URL.

So I tried with this again

Redirect 301 /old-file-name-c-63_75.html http://www.example.com/new-filename-c-63.html?

But this time the destination URL becomes

http://www.example.com/new-filename-c-63.html?

Now how do I remove that trailing ? from the URL?

Kindly advise. Thanks a lot!

9:45 pm on Sep 17, 2009 (gmt 0)

5+ Year Member



For reference, these are the rewrite rules in the .htaccess file:

RewriteRule ^(.*)-p-(.*).html$ product_info.php?products_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-c-(.*).html$ index.php?cPath=$2&%{QUERY_STRING}
RewriteRule ^(.*)-m-([0-9]+).html$ index.php?manufacturers_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-pi-([0-9]+).html$ popup_image.php?pID=$2&%{QUERY_STRING}
RewriteRule ^(.*)-t-([0-9]+).html$ articles.php?tPath=$2&%{QUERY_STRING}
RewriteRule ^(.*)-a-([0-9]+).html$ article_info.php?articles_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-pr-([0-9]+).html$ product_reviews.php?products_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-pri-([0-9]+).html$ product_reviews_info.php?products_id=$2&%{QUERY_STRING}
RewriteRule ^(.*)-i-([0-9]+).html$ information.php?info_id=$2&%{QUERY_STRING}

2:38 am on Sep 18, 2009 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



> I knew from reading this forum earlier that to remove the query string appended to the destination URL, we need to attach a ? to the end of the destination URL.

Yes, but this only works with mod_rewrite, and "Redirect 301" is a mod_alias directive....

I suggest that you re-code your redirects using mod_rewrite rules instead of mod_alias directives (and include the trailing "?").

Your mod_rewrite rules are also very inefficient. For example, if the requested URL-path matching your first rule is "/word-p-word.html", then re-writing the rule and pattern as


RewriteRule ^([^-]+)-p-([^.]+)\.html$ /product_info.php?products_id=$2 [QSA,L]

and doing this for each of your other rules would speed up your server quite a bit...

Avoid the use of ".*" whenever possible, and try to never use more than one in any given pattern.

The [QSA] flag replaces your 'manual' method of including the original query string. It says, "Append this new Query String to the existing one."
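
To illustrate what [QSA] does with the rule above, here is a small sketch. The request URL and its "sort" parameter are hypothetical and are not taken from the site in question.

```apache
# Hypothetical request:  /widget-p-64.html?sort=price
RewriteRule ^([^-]+)-p-([^.]+)\.html$ /product_info.php?products_id=$2 [QSA,L]

# With [QSA], the original query string is kept and joined to the new one:
#   /product_info.php?products_id=64&sort=price
# Without [QSA], a substitution that contains its own "?" replaces the original:
#   /product_info.php?products_id=64
```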

Jim

7:49 am on Sep 18, 2009 (gmt 0)

5+ Year Member



Thanks a lot again Jim :)

> I suggest that you re-code your redirects using mod_rewrite rules instead of mod_alias directives (and include the trailing "?").

Can you please give me an example of how to do it?

7:52 am on Sep 18, 2009 (gmt 0)

5+ Year Member



Is this code correct for redirecting the URLs using Mod_Rewrite?

RewriteCond %{THE_REQUEST} ^[A-Z]+\ /old-file-name-c-63_75\.html\ HTTP/
RewriteRule ^old-file-name-c-63_75\.html$ http://www.example.com/new-filename-c-63.html? [R=301,L]

9:52 am on Sep 18, 2009 (gmt 0)

5+ Year Member



> Your mod_rewrite rules are also very inefficient. For example, if the requested URL-path matching your first rule is "/word-p-word.html", then re-writing the rule and pattern as

> RewriteRule ^([^-]+)-p-([^.]+)\.html$ /product_info.php?products_id=$2 [QSA,L]

Jim, I changed the rewrite rules as advised by you.

Pages such as http://www.example.com/page-c-64.html open fine, but pages such as
http://www.example.com/page-name-c-64.html and http://www.example.com/page-name-name-c-64.html
return a 404 error.

Kindly advise.

2:33 pm on Sep 18, 2009 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



> Is this code correct for redirecting the URLs using Mod_Rewrite?

> RewriteCond %{THE_REQUEST} ^[A-Z]+\ /old-file-name-c-63_75\.html\ HTTP/
> RewriteRule ^old-file-name-c-63_75\.html$ http://www.example.com/new-filename-c-63.html? [R=301,L]


That would work, but it's not necessary in this case to check THE_REQUEST. That check is required only in the special case where requests for internal filepaths need to be redirected back to a canonical URL after the URL-to-filename mapping has been modified by another RewriteRule. That's not applicable here.

You can simply remove that RewriteCond line, and use your RewriteRule, which is correct.
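
Put together, the redirect might look like the sketch below. It uses the old file name from the first post and assumes the rule is placed ahead of the internal rewrite rules, so the redirect is issued before any cPath query string is added. On Apache 2.4 or later, the [QSD] flag is an alternative to the trailing "?".

```apache
RewriteEngine On

# External 301 redirect from the old static URL to the new one.
# The trailing "?" on the target clears the query string, and
# mod_rewrite does not send that "?" to the client.
RewriteRule ^old-file-name-c-63_75\.html$ http://www.example.com/new-filename-c-63.html? [R=301,L]
```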

> Jim, I changed the rewrite rules as advised by you.

> Pages such as http://www.example.com/page-c-64.html open fine, but pages such as
> http://www.example.com/page-name-c-64.html and http://www.example.com/page-name-name-c-64.html
> return a 404 error.

Yes, of course those URLs fail, because the pattern I provided specifically allows only one hyphen before "c-64.html". You're using a hyphen as a delimiter between the name, the category-indicator, and the category-number 'fields', but also allowing that same delimiter character within the field -- which is not a very good approach because it leads to ambiguity about which "word" should go in which field.

However, the problem can be fixed while still providing some performance improvement over the multiple-".*" pattern by simply allowing for multiple hyphens:


RewriteRule ^([^-]+(-[^-]+)*)-p-([^.]+)\.html$ /product_info.php?products_id=$3 [QSA,L]

And note also that if the ID field (here, the product number) is always all-numeric, you could also go with:

RewriteRule ^([^-]+(-[^-]+)*)-p-([0-9]+)\.html$ /product_info.php?products_id=$3 [QSA,L]

Usually, the more-specific your rules are, the better, from a performance standpoint.
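
As a sketch of the same idea applied to two of the other rules, the category and manufacturer rewrites might look like this. It assumes that page names are words joined by hyphens and that cPath values contain only digits and underscores, as in "63_75". The name group is written as one word followed by any number of "-word" segments.

```apache
RewriteEngine On

# $1 = whole page name (may contain hyphens)
# $2 = last "-word" segment of the name (unused)
# $3 = ID
RewriteRule ^([^-]+(-[^-]+)*)-c-([0-9_]+)\.html$ /index.php?cPath=$3 [QSA,L]
RewriteRule ^([^-]+(-[^-]+)*)-m-([0-9]+)\.html$ /index.php?manufacturers_id=$3 [QSA,L]
```

With these patterns, "page-c-64.html", "page-name-c-64.html" and "page-name-name-c-64.html" all reach index.php with cPath=64.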

Because you didn't figure out the problem (if not the cure) for yourself, I'd suggest that you study the regular-expressions tutorial cited in our Apache Forum Charter. If you proceed to use mod_rewrite without fully understanding regular expressions and mod_rewrite, you may make a tiny mistake today that will ruin your site's ranking over the next six months, and you won't even be aware of the cause... mod_rewrite affects your server configuration, and so should be used very carefully; You cannot 'guess your way to success' with it. More likely, you'll cause a disaster, because for every million rewrite rules you could create simply by guesswork, only a few will work at all, and even fewer will actually do what you want (and not do anything that you don't want to do).

I need to point out another flaw with the overall 'design' of this rule and with your original rules. If I discover that your site competes with my own, there is nothing to stop me creating thousands of bogus links to your product pages, using URLs like "pro-seo-is-a-thief-and-a-cheater-p-64.html" and "this-product-is-useless-junk-p-64.html", etc., thereby creating not only a bunch of links associating your site with "bad thing" keywords, but also creating a massive duplicate-content problem for your site, since requests for "/<anything-at-all-here>-p-64.html" will be rewritten to your script, and will "work" by producing a product-64 page.

You should pass the first part of the URL-path that precedes "-p-" to your script, and the script should verify it to make sure that it is valid and matches the database entry for the 'page name' for "-p-64". If not, then your script should either return a 404-Not-Found, or it should return a 301 redirect to the correct-page-name URL for that product.

As I noted above, there is far more to successfully using mod_rewrite than just typing a few strange-looking character-sequences in your .htaccess file...

Jim

10:20 pm on Sep 18, 2009 (gmt 0)

5+ Year Member



Thanks a lot Jim.

I also found the problem relating to allowing any character between "/<anything-at-all-here>-p-64.html" and it does show up a valid page.

> You should pass the first part of the URL-path that precedes "-p-" to your script, and the script should verify it to make sure that it is valid and matches the database entry for the 'page name' for "-p-64". If not, then your script should either return a 404-Not-Found, or it should return a 301 redirect to the correct-page-name URL for that product.

Any idea on how I can implement it? Can it be done through the .htaccess file as well?

Thanks again!

11:30 pm on Sep 18, 2009 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Just back-reference "$1" in the rule I posted, and pass it to your script as a variable, such as "/product_info.php?page-name=$1&products_id=$3 [QSA,L]"
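
Written out in full, that might look like the following sketch. It assumes an all-numeric product ID and a name group written as one word followed by optional "-word" segments, so that $1 is the whole page name and $3 is the ID.

```apache
# $1 = requested page name, passed to the script for verification
# $3 = numeric product ID
RewriteRule ^([^-]+(-[^-]+)*)-p-([0-9]+)\.html$ /product_info.php?page-name=$1&products_id=$3 [QSA,L]
```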

Jim

10:32 pm on Sep 20, 2009 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Implementation: 90% of the code goes in your PHP script. It looks up the words that should be associated with the number. It then compares what they should be, with what was actually requested. If there is a difference, the script sends a HEADER with 301 and the correct URL so that the browser can make a new HTTP request for the right URL. Only if the requested words successfully match the number, is the actual content served.
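
A minimal sketch of that check in PHP. It assumes a hypothetical in-memory table in place of the real database lookup: the function name, the PRODUCT_SLUGS table and its "blue-widget" entry are all invented for illustration, not taken from the site's actual code.

```php
<?php
// Hypothetical stand-in for the database: product number => correct page name.
const PRODUCT_SLUGS = [64 => 'blue-widget'];

/**
 * Decide how to answer a request for "<name>-p-<id>.html".
 * Returns [HTTP status, redirect target or null].
 */
function resolve_product_request(string $requestedName, int $productId, array $slugs): array
{
    if (!isset($slugs[$productId])) {
        return [404, null];                 // unknown product number
    }
    $correctName = $slugs[$productId];
    if ($requestedName !== $correctName) {
        // The words do not match the number: point the client at the right URL.
        return [301, "http://www.example.com/{$correctName}-p-{$productId}.html"];
    }
    return [200, null];                     // the words match: serve the content
}
```

In product_info.php, the result could then be acted on with header('Location: ' . $target, true, 301) for a 301 and http_response_code(404) for a 404, with the page content produced only for a 200.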

6:04 pm on Sep 21, 2009 (gmt 0)

5+ Year Member



Thanks a lot guys :)
 
