RewriteRule rules in .htaccess

         

smallcompany

8:06 pm on Jan 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In .htaccess I have:

RewriteEngine On
RewriteBase /
RewriteRule page1.html page2.html [R=301,L]
RewriteRule ^(.*)\.html$ parser.cgi?file=$1 [QSA,L]

I wonder about the [R=301,L] part and whether the order of the flags matters:

Are [R=301,L] and [L,R=301] the same?

Furthermore, since I have a parser.cgi that parses each file, should I delete "L" from the 301 rule?
I base this question on the fact that "L" means last rule.

Thanks

gergoe

12:59 am on Jan 17, 2008 (gmt 0)

10+ Year Member



No, the order of the flags does not matter; they are evaluated in a way that makes sense. For example, Last is only acted on once processing of the rule has finished, not before, regardless of its position in the list.

But the R=301 flag tells the server to send a "301 Moved Permanently" response back to the browser, so from that point on it makes no sense to process more rules: the response to the browser has already been decided. The Last flag is therefore very much needed there.

It is another question if your intention is not to redirect the browser to page2 but to do the same thing 'silently' (that is, rewrite the request). In that case you will need to remove both flags from the first RewriteRule:

  • The Redirect because you only need that if you want 'external redirecting' to happen, so you tell the browser: Stop using page1, use page2 from now on.
  • The Last, because you want the following rule to rewrite the page2 request so that it passes through your CGI script; with Last set, it would not do that.

Which one you choose (redirecting or rewriting) depends on what you want to achieve. If you want everyone to stop using page1 (for example because it is about shoes, but you stopped selling them), then the first one (redirecting) is for you; those are your original rules. But if all you want is to treat both 'files' the same way (for example page1 is about leather shoes, page2 is about leatherette shoes, and you don't want to maintain two different sets of content for them anymore), then the second one (rewriting) is for you.

Actually the result is the same in both cases (you get page2), but they behave differently in the long run (for example, search engines will stop indexing page1 if you go for the first one, the redirect).

(Sorry about the shoes, I did not have a better example just now :-)

smallcompany

8:08 am on Jan 17, 2008 (gmt 0)


Thanks a lot.

I think I am good to go, because I do want to send back a 301 for files handled this way. The page1 from my example does not exist anymore.

smallcompany

5:34 am on Jan 19, 2008 (gmt 0)


I just came across an issue that puzzles me a lot.

If I clear my browser’s cache and try old pages that are supposed to redirect through a 301 RewriteRule, the browser hangs. IE shows the loading bar slowly progressing but simply hangs. If I try to load the new page, it hangs too. If I remove the redirect statement from .htaccess and access the new page directly, it works fine, and later redirects with no problem. If I empty my browser's cache again, it hangs again.
If I copy the same .htaccess over while my browser is trying to load the page in question, it loads. It is as if copying .htaccess over during the page load helps the redirect go through.

Has this happened to anyone before?

jdMorgan

6:45 am on Jan 19, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Browser hangs

That's a sign that one or more of your rules is redirecting or rewriting a URL to itself -- or that one rule rewrites A to B, and another rule redirects or rewrites B back to A. Interactions of three or more steps are also possible. This kind of problem results in an 'infinite' loop, although usually the browser or the server gives up after a while and throws an error.
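The A-to-B-and-back case can be illustrated with a small Python sketch. This is not Apache itself, only a simulation under stated assumptions: the rule list, the file names a.html and b.html, and the function follow_redirects are all hypothetical, and Python's re module stands in for mod_rewrite's regex matching.

```python
import re

# Hypothetical (pattern, target) pairs standing in for two RewriteRule lines.
RULES = [
    (r"^a\.html$", "b.html"),  # rule 1 sends A to B
    (r"^b\.html$", "a.html"),  # rule 2 sends B back to A
]

def follow_redirects(path, rules, limit=10):
    """Apply the first matching rule repeatedly.

    Returns (hops, looped): the list of URLs visited and whether a loop was seen.
    """
    hops = [path]
    for _ in range(limit):
        target = next((t for p, t in rules if re.search(p, path)), None)
        if target is None:
            return hops, False            # no rule matched: the chain ends
        if target in hops:
            return hops + [target], True  # a URL came up twice: a loop
        hops.append(target)
        path = target
    return hops, True                     # gave up after `limit` hops

print(follow_redirects("a.html", RULES))
```

A browser does much the same thing: it follows each redirect until it notices that the chain never ends, then reports an error.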

Since you said "redirects" (plural), but only one redirect is shown in your post above, the problem is likely in some rules you have not posted.

If you want some good information to help you find the problem, look at your server error log. You can also look from the client side, using the "Live HTTP Headers" add-on for Firefox/Mozilla browsers.

Jim

smallcompany

12:05 am on Jan 20, 2008 (gmt 0)


I reduced the .htaccess to this:

RewriteEngine On
RewriteBase /
RewriteRule old-page.html new-page.html [L,R=301]
ErrorDocument 404 /error-404.php

Firefox says:
The page isn't redirecting properly.
Firefox has detected that the server is redirecting the request for this address in a way that will never complete.

Live HTTP headers shows that old-page gets 301ed to new-page, but then it shows that new-page gets 301ed to itself?!

Then I commented out this particular redirect and tested the other one which is in this form:

RewriteRule subfolder/old-file.html subfolder/new-file.html [L,R=301]

That one worked just fine.

Then I tested the root > subfolder case, which was in the form of:

RewriteRule old-file.html subfolder/new-file.html [L,R=301]

That one worked fine too.

Why would the problem affect only new pages in the root? Old pages seem not to matter. Is the syntax wrong?

jdMorgan

12:26 am on Jan 20, 2008 (gmt 0)


Using "old-page.html" and "new-page.html" in this thread may be hiding some information that is critical to solving the problem. However, if the new page name has the same "tail" as the old page -- for example, if the old page is "page.html" and the new page is named "new-page.html" -- then the fact that your patterns are not anchored will cause requests for any URL which contains "page.html" to be redirected to "new-page.html", including requests for "new-page.html" itself, thus causing a loop.

Furthermore, the "." character is a regular expressions token meaning "any single character." This could also be causing the problem, although it is less likely.
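Both effects, the missing anchors and the unescaped period, can be checked outside Apache. The following is a minimal Python sketch, assuming Python's re.search behaves like mod_rewrite's matching for patterns this simple; the helper name matches and the test URL "pageXhtml" are made up for illustration.

```python
import re

UNANCHORED = r"page.html"     # as in the problematic rule: no anchors, bare "."
ANCHORED = r"^page\.html$"    # anchored at both ends, literal period escaped

def matches(pattern, url):
    """Return True if the pattern is found anywhere in the URL."""
    return re.search(pattern, url) is not None

for url in ["page.html", "new-page.html", "pageXhtml"]:
    print(url, matches(UNANCHORED, url), matches(ANCHORED, url))
# page.html True True
# new-page.html True False
# pageXhtml True False
```

The unanchored pattern matches the redirect target itself, which is what produces the loop; the anchored pattern matches only the exact old file name.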

The strict syntax for your problematic rule will include pattern anchoring and escaping of the literal period in the rule pattern:


RewriteRule ^old-page\.html$ http://www.example.com/new-page.html [R=301,L]

Our forum charter [webmasterworld.com] contains a link to a concise regular-expressions tutorial, which may prove helpful.

Jim

smallcompany

1:56 am on Jan 20, 2008 (gmt 0)


I thank you very much for your input. Here is the real case of that line:

RewriteRule virus-protection.html antivirus-protection.html [L,R=301]

Based on your last reply, if I want to be sure I do what I want, this would be it:

RewriteRule ^virus-protection.html\.html$ http://www.example.com/antivirus-protection.html [R=301,L]

Right?

If you were me, would you always do the old pages in that fashion of ^filename.html\.html$?

What is \.html$ ensuring?

What would be a strict RewriteRule for a subfolder and all its content?

Would

^subfolder-old/(.*) http://www.example.com/subfolder-new/$1 [L,R=301]

be strict enough?

Thanks very much. So many times, for both PHP and Apache, I went to their respective sites and found it so hard to figure out the rules. I am paying the price for my own ignorance.

jdMorgan

2:39 am on Jan 20, 2008 (gmt 0)


You have an extra ".html" in that line.

RewriteRule ^virus-protection\.html$ http://www.example.com/antivirus-protection.html [R=301,L]

With the concrete examples, it's clear that the problem was as I described it.

For information about "^", "$", and ".", please see the documentation citation.

Your subfolder rule looks OK.
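To see how the captured part of the path is carried over by $1 in the subfolder rule, here is a rough Python sketch. It only imitates the pattern and substitution from the rule quoted above; the function name redirect_target and the sample paths are assumptions for illustration.

```python
import re

# Stand-in for: ^subfolder-old/(.*)  ->  http://www.example.com/subfolder-new/$1
PATTERN = re.compile(r"^subfolder-old/(.*)")
TARGET_PREFIX = "http://www.example.com/subfolder-new/"

def redirect_target(path):
    """Return the redirect URL for a path, or None when the rule does not match."""
    m = PATTERN.search(path)
    if m is None:
        return None
    # m.group(1) plays the role of $1: everything after "subfolder-old/"
    return TARGET_PREFIX + m.group(1)

print(redirect_target("subfolder-old/docs/a.html"))
# http://www.example.com/subfolder-new/docs/a.html
print(redirect_target("subfolder-new/docs/a.html"))
# None
```

Because the pattern is anchored at the start, a request that is already under subfolder-new/ is not matched, so the redirect cannot feed itself.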

Do not continue to "pay the price" -- Instead, consult the documentation every time you do a new project. If it does not all make sense the first time, it will certainly make more sense after you gain experience -- with both successes and failures.

In this case, the "experience" gained is this: Make your patterns as exact and restrictive as possible, and then "relax" them only if necessary to match all of the requested URLs you must match. It is easy to test and find out that a URL that you need to rewrite is not matched and does not get rewritten. But it is very difficult to test to be sure that no URLs are rewritten that should not be rewritten!
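One way to act on this advice is to keep two lists for every pattern: URLs it must match and URLs it must leave alone. The sketch below is a hypothetical Python helper (check_pattern is not part of Apache or any library), and it assumes Python's re module is close enough to mod_rewrite for simple patterns like these. The second list can never be complete, which is why the pattern itself should be strict.

```python
import re

def check_pattern(pattern, must_match, must_not_match):
    """Return the URLs on which the pattern behaves unexpectedly."""
    rx = re.compile(pattern)
    problems = [u for u in must_match if not rx.search(u)]
    problems += [u for u in must_not_match if rx.search(u)]
    return problems

must_match = ["virus-protection.html"]
must_not_match = ["antivirus-protection.html", "virus-protection.html.bak"]

# Strict pattern: nothing unexpected.
print(check_pattern(r"^virus-protection\.html$", must_match, must_not_match))
# []

# Loose pattern from the original rule: it also catches the new page.
print(check_pattern(r"virus-protection.html", must_match, must_not_match))
# ['antivirus-protection.html', 'virus-protection.html.bak']
```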

Jim

smallcompany

3:12 am on Jan 20, 2008 (gmt 0)


It works now!

Thanks very much for the help and for opening the door to something I have yet to learn.

Cheers!