500 Error - htaccess regular expression wrong?

Cannot Compile Regular Expression 500 Error - Help with valid regex

         

bradyjoseph

1:12 am on Dec 24, 2011 (gmt 0)

10+ Year Member



Can someone help? I've reviewed everything online, but I think I'm just not familiar enough with regular expressions in htaccess. This same code works fine on my server, but I'm moving to a new server and I'm receiving a 500 Error.

The error logs show:

[24/Dec/2011:01:10:58 +0000] [alert] [client 98.211.45.225] /usr/home/xyz/www/htdocs/.htaccess: RewriteRule: cannot compile regular expression '^([a-zA-Z0-9_\\-/]+)\\/([a-zA-Z0-9_\\-/]+)\\/([a-zA-Z0-9_\\-/]+)$'\n
[24/Dec/2011:01:10:58 +0000] [alert] [client 98.211.45.225] /usr/home/xyz/www/htdocs/.htaccess: RewriteRule: cannot compile regular expression '^([a-zA-Z0-9_\\-/]+)\\/([a-zA-Z0-9_\\-/]+)\\/([a-zA-Z0-9_\\-/]+)$'\n


Here's my .htaccess code:

RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_\-/]+)\/([a-zA-Z0-9_\-/]+)\/([a-zA-Z0-9_\-/]+)$ index.php?page=$1&action=$2&id=$3

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_\-/]+)\/([a-zA-Z0-9_\-/]+)$ index.php?page=$1&action=$2

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?page=$1

matrix_jan

2:22 am on Dec 24, 2011 (gmt 0)

10+ Year Member



Some servers require the usage of RewriteBase.

Try declaring it after the RewriteEngine.

RewriteEngine On
RewriteBase /

lucy24

3:14 am on Dec 24, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You've got a lot of superfluous backslashes there. In htaccess, you don't need to escape the / character. (Are you coming from a JavaScript or similar background?)

:: poring over examples ::

Are the error messages and the examples both direct cut-and-paste? Something wonky seems to have happened in transit, notably the added \n line breaks and the \ turning into \\.

It may be something as simple as deleting all backslashes from your RegEx, since none of them are needed.

Anyway, you can't possibly mean

([a-zA-Z0-9_-/]+)/

since that's the htaccess equivalent of [^.]+ and will result in massive backtracking, as described eloquently by g1smd in approximately every third thread in this forum. You probably mean [^/]+ to capture exactly one directory at a time.

\- in groups is often correct-- it means the literal hyphen character-- but your system apparently dislikes it. The safest place to put a literal hyphen is at the very beginning of the group. Second-safest place is at the very end. Don't put it between two other characters-- especially if the second character really comes before the first one, as with _ (or \) and /. That makes for a very unhappy RegEx.

Edit: btw, if your php script is set up to handle null query strings, you can easily consolidate your first two Rules. One of the few things that Regular Expressions don't mind at all is an empty capture, like (/[^/]+)?
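
A minimal sketch of that consolidation, using the [^/]+ form suggested above. It assumes index.php copes with an empty id parameter, which nothing in this thread confirms. The optional third segment is wrapped in an outer group, so its value arrives in $4 rather than $3. Plain capturing groups are used because not every regex library accepts the (?: ) non-capturing form.

```apache
RewriteEngine On

# Replaces the separate three-segment and two-segment rules.
# (/([^/]+))? matches one more "/segment" or nothing at all.
# $3 would hold the segment with its leading slash; $4 holds it without.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)(/([^/]+))?$ index.php?page=$1&action=$2&id=$4 [L]
```

With this rule a request for aaa/bbb is rewritten with an empty id, and aaa/bbb/ccc with id=ccc. The catch-all rule for single-segment requests stays as it is.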

g1smd

8:10 am on Dec 24, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Having a slash inside your character groups as well as outside them is not a good idea. The parser will have no idea which slash to break on.

Given the pattern:
^([a-z/]+)/([a-z/]+)/([a-z/]+)$

and the request:
example.com/a/b/c/d/e/f/g

would you want
($1) / ($2) / ($3)
to be:
(a/b/c) / (d/e/f) / (g)
or would you want
(a) / (b) / (c/d/e/f/g)

or perhaps
(a/b) / (c/d) / (e/f/g)
or what?

The parser cannot possibly know with such an ambiguous pattern:

"some characters including optional slashes in $1", "break at the slash" (which one?), "followed by some characters including optional slashes in $2", "break at the slash" (which one?), "followed by some characters including optional slashes in $3".

You're lucky you got error 500. The other option was server meltdown.

Hyphen and slash do not need to be escaped. List the hyphen last in your character groups.

[edited by: tedster at 6:58 pm (utc) on Dec 24, 2011]
[edit reason] member requested fix [/edit]

lucy24

10:53 am on Dec 24, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



would you want ($1) / ($2) / ($3) to be:
(a/b/c) / (d/e/f) / (g) or would you want (a) / (b) / (c/d/e/f/g)
or perhaps (a/b) / (c/d) / (e/f/g) or what?

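Doesn't matter what you want: the greedy-natured RegEx will come back with (a/b/c/d/e) / (f) / (g) every time. The first group grabs as much as it can, and gives back only enough for the other two groups to get one chunk each.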

I don't think that's what made the server so unhappy. I think it's probably the groups containing \-/ i.e. "Any character in the range 5C-2F", a range that runs backward. That's the equivalent of [9-0].
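
To make the range problem concrete, here is a small sketch of where the hyphen can sit in the group. The index.php target is borrowed from the original post. Whether a backslash counts as an escape inside brackets depends on the regex library the server was built with, which this thread does not establish, so the commented-out line is shown only as the form to avoid.

```apache
# Avoid: if the backslash is taken literally inside the brackets,
# "\-/" reads as a range from \ (0x5C) down to / (0x2F), which is reversed.
# RewriteRule ^([a-zA-Z0-9_\-/]+)$ index.php?page=$1 [L]

# Hyphen last in the group: read as a literal hyphen, no escape needed.
RewriteRule ^([a-zA-Z0-9_-]+)$ index.php?page=$1 [L]

# Hyphen first works the same way:
# RewriteRule ^([-a-zA-Z0-9_]+)$ index.php?page=$1 [L]
```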

[edited by: tedster at 7:00 pm (utc) on Dec 24, 2011]

bradyjoseph

5:25 pm on Dec 24, 2011 (gmt 0)

10+ Year Member



Thank you all for the replies. I've removed all the extra escapes. It's now no longer giving me a 500 Error, so we are making some progress... but, it's just not really doing anything. I think this server is a little screwy too, because the entire site worked on my other server. :-(

I *think* we've resolved the htaccess issues. Now I have to figure out the others. :-)

Here's my code now:


RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&action=$2&id=$3

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&action=$2

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?page=$1


I'd love to learn more about consolidating the first two RewriteRules, as Lucy mentioned. I'll definitely have to study htaccess a little more.

bradyjoseph

5:43 pm on Dec 24, 2011 (gmt 0)

10+ Year Member



Okay, so maybe I spoke too soon. Here is the new error, which still seems to be related to htaccess:


[24/Dec/2011:17:39:39 +0000] [error] [client 98.211.45.225] mod_rewrite: maximum number of internal redirects reached. Assuming configuration error. Use 'RewriteOptions MaxRedirects' to increase the limit if neccessary.



RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&action=$2&id=$3

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&action=$2

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?page=$1

g1smd

6:05 pm on Dec 24, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Add the [L] flag to every rule.

What else is in the .htaccess file?

Are there any RewriteRules configured as redirects?
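
One common way to rule out the rules re-matching their own output is a guard placed ahead of everything else. This is a sketch, not a diagnosis of this particular server. It assumes index.php sits in the same directory as the .htaccess file. If the file cannot be found there, the !-f condition stays true on every pass and the catch-all keeps rewriting.

```apache
RewriteEngine On

# Stop immediately when the request is already for index.php, so the
# catch-all rule below cannot rewrite its own result again.
RewriteRule ^index\.php$ - [L]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?page=$1 [L]
```

If the loop disappears with the guard in place but the page comes back as 404, the next thing to check is where the server expects index.php to be.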

bradyjoseph

6:09 pm on Dec 24, 2011 (gmt 0)

10+ Year Member



Thanks for the advice, g1. I added the [L] flag; however, I'm still receiving the same error message in the logs.


[24/Dec/2011:17:39:39 +0000] [error] [client 98.211.45.225] mod_rewrite: maximum number of internal redirects reached. Assuming configuration error. Use 'RewriteOptions MaxRedirects' to increase the limit if neccessary.



RewriteEngine On

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&action=$2&id=$3 [L]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&action=$2 [L]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?page=$1 [L]

bradyjoseph

6:09 pm on Dec 24, 2011 (gmt 0)

10+ Year Member



There's nothing else in the .htaccess file. The above quoted lines are the only thing in the file.

lucy24

11:33 pm on Dec 24, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK, time for closer examination.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?page=$1&action=$2&id=$3 [L]

The two Conditions mean: execute this Rule only if the request is not for an existing file or directory. In fact the !-d is rarely needed here. Since the Rule excludes final / (slash), the request can only be for an existing directory when someone asks for that directory without its trailing slash.

The Rule means: If the request is for a "naked" URL (all groups exclude . character) nested exactly two subdirectories deep, like
www.example.com/aaa/bbb/ccc
then serve content from
index.php
converting the three "chunks" of the original request into query parameters.

The two red flags are htaccess + new server. This suggests something different in the new server's config file. That of course is assuming that the error message-- which looks amazingly familiar ;) --is in response to you testing the redirect by requesting those specific files.

Simple experiment. No conditions, just

RewriteRule foobar\.html /RewrittenUrl.html [R=301,L]

What should happen is that you see the 404 page-- your own, if you've got one-- while your browser's address bar says

http://www.example.com/RewrittenUrl.html

If something else happens, it is time for deeper investigation.

:: wait ::
:: stop ::
:: rewind ::

There's nothing else in the .htaccess file. The above quoted lines are the only thing in the file.

Did you mean that literally? Absolutely nothing whatsoever, including the necessary

RewriteEngine On

line? Or did you mean no other rewrites?