Weird problem with simple RewriteRule

         

GreenEyed

9:50 am on Nov 20, 2009 (gmt 0)

10+ Year Member



Hi all,

I'm having a strange problem where some URLs are not being redirected due to a RewriteRule not matching a simple URL.

I can get it to match just by removing a single character, but it's driving me nuts, so I wanted to ask here before I assume it is a bug in mod_rewrite (it's an old installation we cannot upgrade yet, long story).

The URLs I want to redirect are these:
.../ca/infsobre/serveis/oficines/cooperacio/
.../ca/infsobre/serveis/oficines/cooperacio/whatever.html
...

And the "working" non-ideal RewriteRule I have is this one:

RewriteRule (.*)ines/cooperacio/(.*) [newhost.com...] [R,L]

Too wide, I know, but if I add even a single character to the matching expression like:

RewriteRule (.*)cines/cooperacio/(.*) [newhost.com...] [R,L]

Then it doesn't work. I know the best thing would be to match the full URL, and I don't like the initial wildcards, but I cannot get it to work unless I reduce it to that, which is quite frustrating.

Am I missing something obvious?

PS: Things I have tested:
.- Browser cache is not the problem, I tested it multiple times.
.- I have put the rule at the top of my RewriteRule list, so no other rule should be interfering before the request reaches this one.

PPS:
Apache version is Apache 1.3.33 on Tru64 UNIX.

Thanks!

TheMadScientist

1:42 pm on Nov 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hi GreenEyed,

Welcome to WebmasterWorld!

"Am I missing something obvious?"

Nothing I see, but I do suggest [R=301,L] rather than just [R,L]. By including only the R you are defining a redirect without stating its type, so Apache falls back to the default, a 302 (temporary) redirect. It's always recommended to define the type explicitly: 301 if the move is permanent, and 307 if it is temporary.

Search engines have had issues in the past deciding how to deal with 302 redirects and it's cost many sites rankings, so it's usually better to not take a chance by leaving it undefined.

If you can't get the match to work, I'd probably think about trying a negative match (a negated character class) rather than a catch-all at the beginning...

RewriteRule [^/]+/cooperacio/.* http://www.example.com/ [R=301,L]

I removed the parentheses because there's no back-reference on the right side of your rule, so there's no need to store the matched text for later use.

I would very strongly consider redirecting to specific URLs on the new domain, rather than generically to the home page, unless there are not too many pages and/or the root of the new domain is actually the new location of the information.
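As a rough sketch of what a path-preserving redirect could look like: here the parentheses are needed again, because $1 on the right side refers back to what they captured. The target host and the /cooperacio/ path on it are placeholders, and the leading slash in the pattern assumes the rule lives in the main server configuration rather than in a .htaccess file.

```apache
# Sketch only: www.example.com and the target path layout are assumptions.
# Server-config context assumed, so the pattern sees the leading slash.
RewriteEngine On
# Capture everything after the old directory and reuse it as $1.
RewriteRule ^/ca/infsobre/serveis/oficines/cooperacio/(.*)$ http://www.example.com/cooperacio/$1 [R=301,L]
```

This only makes sense if the new site keeps the same file names below that directory; if the structure changed completely, individual rules per page would be needed instead.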

* Usually when I can't figure out a match it's something goofy...

Maybe try:
RewriteCond %{REQUEST_URI} ^/ca/infsobre/serveis/oficines/cooperacio/
RewriteRule .? http://www.example.com/ [R=301,L]

Just to see if you can get a match that way. It's probably not much less efficient (if at all) than the catch-all at the beginning. You might also try removing the redirect and echoing out $_SERVER['REQUEST_URI'] in PHP (or the equivalent in another scripting language) on one of the pages you are trying to redirect, to see if there's something 'goofy' going on. You can then copy and paste the output into the rule, so you can be sure there's not a typo or something else silly you are missing. (If you do copy and paste, make sure you strip any leading or trailing white-space from the location.)

jdMorgan

3:14 pm on Nov 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This could also be a character-encoding issue. Make sure you're typing these characters into the mod_rewrite code using "US-ASCII" or UTF-8, and not a character-encoding system specific to Catalan... Use a plain-text editor, and not any kind of 'powerful' editing program.

Unfortunately, OS and Apache support for non-English character-sets is 'troublesome' -- to be polite about it.

Jim

g1smd

4:26 pm on Nov 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Beware of using the (.*) pattern, and never use it more than once in a rule. It is horribly inefficient.
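For a plain prefix match like the one in this thread, one way to avoid (.*) entirely is to anchor the pattern at the start and leave the end open. This is only a sketch: it assumes the rule sits in the main server configuration (in a .htaccess file the leading slash would have to go), and www.example.com is a placeholder.

```apache
# Sketch only: the target host is a placeholder.
RewriteEngine On
# Anchored with ^ and no wildcards: matches any URL-path that begins
# with this prefix, so no leading or trailing (.*) is required.
RewriteRule ^/ca/infsobre/serveis/oficines/cooperacio/ http://www.example.com/ [R=301,L]
```

Because the pattern is anchored, the regex engine can give up after the first mismatching character instead of retrying the match from every position in the URL.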

GreenEyed

8:09 am on Nov 23, 2009 (gmt 0)

10+ Year Member



Hi again,

Thanks for your suggestions.
First of all: I'm using vi to edit the configuration file, and I made sure non-ASCII characters were not an issue, as I've been burnt by that before. I should have added it to the initial info, my bad.

And yes, I try to avoid using .* multiple times but it was the only way I could get it to work. I know redirecting everything to the home page is not ideal, but they changed the structure completely and they don't want to bother thinking where to redirect people specifically for each link... :(. Those pages are not under my control, I'm just at the frontend Apache, so I'm just trying to make sure people won't get 404s.

Back to the pattern itself, it's improved a bit, as now I can use:
RewriteRule [^/]+/oficines/cooperacio/.* [whatever.com...] [R=301,L]

But if I try to add any other letter from the URL at the beginning (not even the 's' from 'serveis'), then no match occurs. It's much better, as there are certainly no other URLs like that, so the match is wide enough but not too wide, and that buys me some time to keep researching. All the logs I have checked show the correct URLs, so nothing seems to be altering the URL in a way that would stop it matching. In any case, we are trying to migrate to a new machine with Apache 2.2, so I'll try to set it up there to rule out a bug in the Apache 1.3 regexp engine that has already been fixed. Otherwise, I can configure the logs to be verbose on that machine, which is something I cannot do on the production machine :).
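For reference, verbose rewrite logging on an Apache 1.3 or 2.2 test machine can be switched on roughly like this. The log path is just an assumption, and both directives belong in the server or virtual host configuration, not in .htaccess. They were dropped in Apache 2.4, so this applies only to the older versions discussed here.

```apache
# Sketch only: the log file location is an assumed path.
RewriteEngine On
RewriteLog "/var/log/apache2/rewrite.log"
# 0 disables logging, 9 is the most verbose; levels above 2 slow the
# server down considerably, so use them for debugging only.
RewriteLogLevel 3
```

The log shows, for each request, the URL each pattern is applied to and whether it matched, which is usually enough to tell whether the input string or the regexp engine is at fault.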

Thanks for your help!
D.

GreenEyed

8:28 am on Nov 23, 2009 (gmt 0)

10+ Year Member



Ummm,
The new version under Apache 2.2 + Linux is matching the URL fine, even when the pattern includes all the directories... so it seems to be some kind of issue with either Apache 1.3 or the Tru64 port... another reason to ask them again to migrate ASAP :).

I'm sorry I bothered you with that, I thought I was missing something really obvious as it was such a simple thing...

Anyway, I learnt a few things in the process and thanks again for your comments.

S!
D.

jdMorgan

2:15 pm on Nov 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Wow! -- I've never seen this kind of problem before. Can you please define and describe the "Tru64 port" a bit better, so our readers can be sure whether they might be using it or not?

Thanks,
Jim

GreenEyed

10:44 am on Nov 24, 2009 (gmt 0)

10+ Year Member



Hi,

I did not describe it because the OS itself has been discontinued and there is no support, so I thought not many people would be using it. But you are right, just in case...

This is Apache 1.3.33 with SSL, compiled from source for the Tru64 UNIX V5.1B operating system.

We have been unable to upgrade to version 2.X because even though compilation from sources seems to work, the binary dies after a second or two with no messages or logs whatsoever: the process "simply" vanishes. As the system guys did not want to do much research and there is no support for the OS, we didn't investigate further.

I could not find any similar reports of problems with such simple RewriteRule directives, so I assume it's some kind of glitch with this about-to-die OS.

If you are one of those unfortunate enough to have to work with such an outdated OS, I can only recommend migrating as soon as you are allowed to :).

S!