
Apache Web Server Forum

    
RewriteRule failure on new server
Mokita




msg:4280043
 4:51 am on Mar 11, 2011 (gmt 0)

Recently we moved all our sites to a new server, which is running Apache 2.2, from one which was running 1.3.

Most of my RewriteRules that should result in a [F] are working properly as before, but four of them aren't. They are generating a loop, and eventually fail with this "default" message:

Forbidden

You don't have permission to access /example/ on this server.

Additionally, a 500 Internal Server Error error was encountered while trying to use an ErrorDocument to handle the request.


Normally a 403 error will present our custom 403 error document, not the default.

One of the problematic rules is this one:

RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{HTTP_USER_AGENT} !Windows\ NT\ (4\.0|5\.[0-2]|6\.[0-1])(\)|;\ [^)])
RewriteCond %{HTTP_USER_AGENT} Windows\ NT
RewriteRule .* - [F]

An example of a malformed user-agent that gets caught by this rule is:

Mozilla/4.0 (compatible; 6.0; Windows NT

I have looked at the code until I go cross-eyed, but can't see anything wrong with it. It was originally written by jdMorgan though, and I have to admit that my scant knowledge of regex means I don't really understand the very last part of the second RewriteCond.

Some help would be greatly appreciated.
TIA.

 

wilderness




msg:4280069
 5:48 am on Mar 11, 2011 (gmt 0)

don't really understand the very last part of the second RewriteCond


In brief, it means that a semi-colon with a trailing space may NOT be followed by a closing-parenthesis character.
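To make that concrete, here is an annotated copy of the rule from the first post. The patterns are unchanged; the comments are an added reading of them and were not part of the original .htaccess.

```apache
# Never apply the block to robots.txt.
RewriteCond %{REQUEST_URI} !^/robots\.txt$
# The leading "!" negates the test: this condition is true when the
# user-agent does NOT contain a well-formed Windows NT token.
#   Windows\ NT\              literal "Windows NT " (unquoted spaces must be escaped)
#   (4\.0|5\.[0-2]|6\.[0-1])  version 4.0, 5.0 to 5.2, or 6.0 to 6.1
#   (\)|;\ [^)])              then either ")" or "; " plus one character that is not ")"
# Inside the character class [^)] the ")" is already literal.
RewriteCond %{HTTP_USER_AGENT} !Windows\ NT\ (4\.0|5\.[0-2]|6\.[0-1])(\)|;\ [^)])
# ...and the user-agent does mention Windows NT somewhere.
RewriteCond %{HTTP_USER_AGENT} Windows\ NT
# All three conditions true: no substitution, answer 403 Forbidden.
RewriteRule .* - [F]
```

Read this way, the sample user-agent from the first post ends at "Windows NT" with no version number after it, so it fails the second pattern, the negated condition is true, and the request is forbidden.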

For a solution, you might "try" escaping the closing parentheses anchor. No guarantee.

Mokita




msg:4280076
 6:14 am on Mar 11, 2011 (gmt 0)

Hi wilderness

Thanks for your explanation.

And yes I did try this:

RewriteCond %{HTTP_USER_AGENT} !Windows\ NT\ (4\.0|5\.[0-2]|6\.[0-1])(\)|;\ [^\)])

... and still got a 500 error.

wilderness




msg:4280078
 6:22 am on Mar 11, 2011 (gmt 0)

And yes I did try this:


Anchor!

[^)]\)

Mokita




msg:4280079
 6:23 am on Mar 11, 2011 (gmt 0)

I tried putting in a support request with my host, and he replicated the error, and then wrote this:

Your custom error documents for 403 are working (which can be seen if you generate just a pure 403 error), but in this case the 500 internal server error is taking precedence over this incidental 403 error.

Here are the details from our Apache error log regarding the 500 internal server error:

[Thu Mar 10 13:33:22 2011] [error] [client 220.255.nn.nnn] Request exceeded the limit of 10 internal redirects due to probable configuration error. Use 'LimitInternalRecursion' to increase the limit if necessary. Use 'LogLevel debug' to get a backtrace.


But he hasn't offered to do a backtrace.
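For readers who do control the server or virtual-host configuration, the backtrace that the log message mentions could be obtained roughly as follows on Apache 2.2. This is only a sketch: the log file path is a made-up example, and none of these directives are allowed in .htaccess, so on shared hosting only the host can set them.

```apache
# Apache 2.2, server or virtual-host context only (not valid in .htaccess).
# Raise error-log verbosity so the internal-redirect backtrace is written out.
LogLevel debug

# mod_rewrite's own trace log in 2.2; the file path is a hypothetical example.
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# LimitInternalRecursion defaults to 10. Raising it would only delay the
# 500 error for a genuine loop, so it is shown commented out.
#LimitInternalRecursion 10
```

Both logging settings slow the server noticeably and are best switched back once the loop has been found.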

Mokita




msg:4280080
 6:29 am on Mar 11, 2011 (gmt 0)

Anchor!

[^)]\)


Oops! I guess I thought that you never need to escape the anchor. Wouldn't that negate it?

Anyway, tried it now - still no luck :(

wilderness




msg:4280082
 6:34 am on Mar 11, 2011 (gmt 0)

Mokita,
I've had similar errors with the closing parentheses on multiple occasions that affected multiple lines in my htaccess.

One occasion was the Apache change from 1.x to 2.x.
Others were simply when updates were added.
In many instances I was able to resolve the issue after adding the otherwise unnecessary escape.

Most mass hosting sellers don't have a clue to the perils of htaccess and/or regex.
I had a host for about seven years with two guys that were sharp as knives and even they didn't grasp htaccess.

wilderness




msg:4280085
 6:36 am on Mar 11, 2011 (gmt 0)

still no luck


Worth a try, looks like you'll need to wait on Jim.

Mokita




msg:4280091
 6:54 am on Mar 11, 2011 (gmt 0)

Thanks for trying to assist Don - much appreciated! :)

jdMorgan




msg:4283311
 9:31 pm on Mar 17, 2011 (gmt 0)

You have two problems, the first being that you failed to exclude your custom error document from being Forbidden. The internal request for the 403 page is itself answered with a 403, so any 403 becomes recursive, and you end up with a 500-Server error.

Either exclude your ErrorDocuments as you have excluded robots.txt, or add a skip rule to the very beginning of your rules so that requests for robots.txt and any custom error document are never rewritten or redirected... something like

# Skip all rules for robots.txt and custom error document requests
RewriteRule ^(robots\.txt|403-error-page\.html|500-error-page\.html)$ - [L]

or just

# Skip all rules for robots.txt and custom error document requests
RewriteRule ^(robots\.txt|[0-9]{3}-error-page\.html)$ - [L]

These are just examples, and will need to be adjusted depending on how you name your custom error pages.
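Put together with the rule from the first post, a root-level .htaccess might look roughly like the sketch below. The error page path /errors/403-error.html is an assumption for illustration only; substitute whatever the ErrorDocument directive really points at.

```apache
# Sketch for a root-level .htaccess; the error page path is hypothetical.
ErrorDocument 403 /errors/403-error.html

RewriteEngine On

# Skip every later rule for robots.txt and the custom 403 page, so the
# internal redirect to the error page cannot itself be answered with 403.
RewriteRule ^(robots\.txt|errors/403-error\.html)$ - [L]

# User-agent block from the first post. Its robots.txt condition is
# omitted here because the skip rule above already covers that file.
RewriteCond %{HTTP_USER_AGENT} !Windows\ NT\ (4\.0|5\.[0-2]|6\.[0-1])(\)|;\ [^)])
RewriteCond %{HTTP_USER_AGENT} Windows\ NT
RewriteRule .* - [F]
```

The skip rule has to come before any rule that can forbid or redirect, because mod_rewrite processes rules in the order they appear.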

The original RewriteCond pattern was correct; neither of the ")" characters discussed above needs an added escape. The difference between servers is likely the usage or configuration of your ErrorDocument directives.

Be *very* sure that the ErrorDocument directives specify only a local URL-path; if a full URL is used, the status code returned to clients will always be a 302 redirect, and this will likely make a BIG mess of your search rankings.

Correct:
ErrorDocument 403 /errors/403-error.html
ErrorDocument 404 /errors/404-error.php
ErrorDocument 410 /errors/410-error.php

Dangerously wrong:
ErrorDocument 403 http://example.com/errors/403-error.html
ErrorDocument 404 http://example.com/errors/404-error.php
ErrorDocument 410 http://example.com/errors/410-error.php

I strongly recommend *not* using a custom error document for 500-Server errors or, if one is used, that it be a purely static HTML page. Otherwise, you can end up with a recursive error if any error is made in the server configuration, the script used to handle errors, or the script interpreter itself or its configuration... basically, any of these errors just causes more 500 errors, and an 'infinite' loop. Makes finding the real problem just that much more difficult.
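As a rough illustration of those two options, with a hypothetical file name and message text:

```apache
# Option 1: define no ErrorDocument 500 at all and let Apache serve its
# built-in page.

# Option 2: a purely static HTML page (the path is a hypothetical example).
ErrorDocument 500 /errors/500-error.html

# Option 3: a short inline text message, which needs no file or script.
# Uncomment this line instead of the one above if preferred.
#ErrorDocument 500 "Server error. Please try again later."
```

If a static page is used, it also belongs in the skip rule so that no rewrite can touch it.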

BTW, if you do not have direct access to your own error log file, then it is either time to find a new host or time to stop using any .htaccess code or scripts. As it is (with your host claiming that the error log is 'theirs') you are basically taking a big risk every time you tweak your .htaccess file or upgrade a script. No serious troubleshooting is possible without error log file access, and most modern hosts provide this capability to their customers... Personally, I will not become involved with any projects where the error log is not directly and immediately available...

Jim

Mokita




msg:4286188
 10:18 am on Mar 23, 2011 (gmt 0)

Hi Jim,

Sorry about the delayed reply, I've been away.

Many thanks - you've solved it for me. The misbehaving rewrite rules did not have ErrorDocuments excluded.

Those rules worked for years on the old server, so I didn't realise they needed to be.

Also, I don't use a custom 500 error doc, and the paths to the 4xx ones were exactly as you say they should be.

With regard to accessing the Error Log file - we have cPanel, which does show most errors for a short while. But the Log details I quoted above came from the full actual (shared) server log that is only accessible to our host. I don't think there is any Host that allows access to those.

And interestingly, the cPanel Error Log did not show anything at all when the looping 403 turned into a 500 error. That was another thing that added to my confusion.

Thanks again for your help - much appreciated!

jdMorgan




msg:4288781
 11:33 pm on Mar 28, 2011 (gmt 0)

> I don't think there is any Host that allows access to those.

Many hosts provide raw error logs. They simply 'sort' the entries in the common error log into separate files, one for each virtual-host user. This is usually done more-or-less in real-time, or on a schedule, like every ten minutes. It's not particularly challenging, but some hosts just don't bother.

Jim
