.htaccess - cannot compile regular expression

htaccess server error 500

         

bawhitney

12:12 am on Feb 22, 2008 (gmt 0)

10+ Year Member


I'm moving one of my sites from one server to another. This works fine on the original dedicated server but not on the new shared server.

Here is the offending .htaccess line:
RewriteRule ^subject/([0-9]*)-([a-z,\_,A-Z,\_,\',\.,\-,\&,\%,\$,\!,0-9]*)_([a-z,\_,A-Z,\-,\_,\',\.,\&,\%,\$,\!,0-9]*).html subject_details.php?bid=$1

Here is the error I get in the log:
cannot compile regular expression '^subject/([0-9]*)-([a-z,\\_,A-Z,\\_,\\',\\.,\\-,\\&,\\%,\\$,\\!,0-9]*)_([a-z,\\_,A-Z,\\-,\\_,\\',\\.,\\&,\\%,\\$,\\!,0-9]*).html'\n

The error that I get in the browser is:
500 Internal Server Error

I'm pulling my hair out. Can anyone help?

Thanks

jdMorgan

3:59 am on Feb 22, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The escaping rules within grouped character sets are relaxed: the only characters that need to be escaped are "-" and "]". Commas are not used as delimiters; inside the brackets, each comma is just another literal character to match. You can eliminate 26 unnecessary compare operations by using the [NC] flag to make the compare case-insensitive. You *do* need to escape the period preceding "html"; left unescaped, it matches any single character.

RewriteRule ^subject/([0-9]*)-([a-z0-9_'.&%$!\-]*)_([a-z0-9_'.&%$!\-]*)\.html$ subject_details.php?bid=$1 [NC]
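A quick way to sanity-check a pattern like this before uploading it is to try it offline. The following is a minimal sketch in Python, assuming that Python's re module treats this particular pattern the same way the server's regex engine does (that is an assumption, not something the server guarantees). The names rule_matches, ESCAPED_DOT and UNESCAPED_DOT are made up for illustration, and re.IGNORECASE stands in for the [NC] flag.

```python
import re

# The corrected pattern, with the period before "html" escaped.
ESCAPED_DOT = re.compile(
    r"^subject/([0-9]*)-([a-z0-9_'.&%$!\-]*)_([a-z0-9_'.&%$!\-]*)\.html$",
    re.IGNORECASE,  # stands in for mod_rewrite's [NC] flag
)

# Same pattern with the period left unescaped, for comparison only.
UNESCAPED_DOT = re.compile(
    r"^subject/([0-9]*)-([a-z0-9_'.&%$!\-]*)_([a-z0-9_'.&%$!\-]*).html$",
    re.IGNORECASE,
)


def rule_matches(pattern, path):
    """Return the first back-reference ($1), or None if the pattern fails."""
    m = pattern.match(path)
    return m.group(1) if m else None


if __name__ == "__main__":
    good = "subject/42-Intro_to-Regex.html"
    odd = "subject/42-Intro_to-RegexXhtml"  # "X" where the period should be
    print(rule_matches(ESCAPED_DOT, good))    # 42
    print(rule_matches(ESCAPED_DOT, odd))     # None
    print(rule_matches(UNESCAPED_DOT, odd))   # 42
```

With the period escaped, only a path that really ends in ".html" is rewritten; with it unescaped, the stray "Xhtml" path is accepted as well.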

Also, be aware that you have included several characters that are reserved or otherwise awkward in a URL. Some of these will be hex-encoded by clients for transmission (a literal "%", for one, must always be sent as "%25"), and will result in "ugly" URLs in search engine results. You should endeavor to prevent the use of any characters except a-z, A-Z, 0-9, "-", "_", and "." in your URLs. That is,

RewriteRule ^subject/([0-9]*)-([a-z0-9_.\-]*)_([a-z0-9_.\-]*)\.html$ subject_details.php?bid=$1 [NC]

The rules for query strings attached to URLs are relaxed, though.

See RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax [faqs.org] for more information (it has since been obsoleted by RFC 3986).
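To see what "ugly" means in practice, here is a small sketch under the same assumption as before (Python's re module standing in for the server's regex engine). It uses urllib.parse.quote to show one possible hex-encoding of a title; quote is stricter than some browsers, so treat its output as an illustration rather than what every client sends. The names RESTRICTED and is_clean, and the sample title, are hypothetical.

```python
import re
from urllib.parse import quote

# The restricted rule: only letters, digits, "_", "." and "-" are allowed.
RESTRICTED = re.compile(
    r"^subject/([0-9]*)-([a-z0-9_.\-]*)_([a-z0-9_.\-]*)\.html$",
    re.IGNORECASE,  # stands in for [NC]
)


def is_clean(path):
    """True if the restricted rule would match this URL-path."""
    return RESTRICTED.match(path) is not None


if __name__ == "__main__":
    title = "R&D_what's-new!"
    # One possible percent-encoding of the title.
    print(quote(title))                                    # R%26D_what%27s-new%21
    print(is_clean("subject/7-research_whats-new.html"))   # True
    print(is_clean("subject/7-" + title + ".html"))        # False
```

A title limited to the recommended characters passes through unchanged, while the one containing "&", "'" and "!" is rejected by the restricted rule.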

It is unclear whether you want all except one "word" separated by underscores to be matched into the second or into the third sub-pattern. As the code is written now, all except the last underscore-separated word will be matched into the second sub-pattern by default, because its "*" takes as many characters as it can while still letting the rest of the pattern match. You should modify the sub-patterns to remove this ambiguity; in fact, there is no apparent need for the third sub-pattern with the current grouped character sets.
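The split between the second and third sub-patterns can be checked directly. This is a minimal sketch, again assuming Python's re module behaves like the server's regex engine for this pattern; split_subject and the sample path are invented for the demonstration.

```python
import re

PATTERN = re.compile(
    r"^subject/([0-9]*)-([a-z0-9_'.&%$!\-]*)_([a-z0-9_'.&%$!\-]*)\.html$",
    re.IGNORECASE,  # stands in for [NC]
)


def split_subject(path):
    """Return ($1, $2, $3) as a tuple, or None if the pattern fails."""
    m = PATTERN.match(path)
    return m.groups() if m else None


if __name__ == "__main__":
    # Three underscore-separated words after the numeric id.
    print(split_subject("subject/42-one_two_three.html"))
    # ('42', 'one_two', 'three')
```

Everything up to the last underscore lands in $2, and only the final word lands in $3.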

In other words, it could just as well be written as


RewriteRule ^subject/([0-9]*)-[a-z0-9_'.&%$!\-]*\.html$ subject_details.php?bid=$1 [NC]

since the second and third sub-patterns are the same, both include the underscore "word-separator" character, and no back-reference to $2 is present in the substitution.
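The two forms can be compared side by side. The sketch below makes the same assumption as before (Python's re module approximating the server's regex engine), and the names ORIGINAL, SIMPLIFIED and bid_for are illustrative only. It also brings out one difference to be aware of: the three-group form requires at least one underscore after the hyphen, while the simplified form does not.

```python
import re

FLAGS = re.IGNORECASE  # stands in for [NC]

# Three-group form: requires a literal "_" between the two text groups.
ORIGINAL = re.compile(
    r"^subject/([0-9]*)-([a-z0-9_'.&%$!\-]*)_([a-z0-9_'.&%$!\-]*)\.html$",
    FLAGS,
)

# Simplified form: one ungrouped character set, no required "_".
SIMPLIFIED = re.compile(
    r"^subject/([0-9]*)-[a-z0-9_'.&%$!\-]*\.html$",
    FLAGS,
)


def bid_for(pattern, path):
    """Return the value that would be passed as bid ($1), or None."""
    m = pattern.match(path)
    return m.group(1) if m else None


if __name__ == "__main__":
    with_underscore = "subject/42-some_title.html"
    without_underscore = "subject/42-title.html"
    print(bid_for(ORIGINAL, with_underscore))       # 42
    print(bid_for(SIMPLIFIED, with_underscore))     # 42
    print(bid_for(ORIGINAL, without_underscore))    # None
    print(bid_for(SIMPLIFIED, without_underscore))  # 42
```

If every real URL on the site contains an underscore in the title part, the two rules hand the same bid to subject_details.php; if some do not, the simplified rule will also rewrite those.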

Jim

[edited by: jdMorgan at 4:11 am (utc) on Feb. 22, 2008]