Problem with "rsh" string in mod rewrite URL

String containing "rsh" causes problems for rewritten URLs

         

authcode

7:57 am on Jun 24, 2008 (gmt 0)

10+ Year Member



I have what I consider to be a straightforward rewrite rule:
RewriteRule ^p-([0-9]+)/.*\.htm$ index.php?pid=$1 [L]

This rule works as expected, i.e.:
http://www.example.com/p-123456/some+product+name.htm
goes to:
http://www.example.com/index.php?pid=123456

until the string "some+product+name" contains the string "rsh".

For example the URL:
http://www.example.com/p-20371406/rsh1.htm
produces this result:
406 Not Acceptable
An appropriate representation of the requested resource /p-20371406/rsh1.htm could not be found on this server.

I believe this may also happen for the string "ssh", but definitely for "rsh" because Google Webmaster Tools has flagged this problem URL. From tests, the "rsh" can appear anywhere in the string and still cause this problem.

I can't find any mention of this problem elsewhere. Does anyone know how to fix this?

Many thanks.

authcode

8:03 am on Jun 24, 2008 (gmt 0)

10+ Year Member



Seems some of my original post is incorrect, sorry.
No problems with "ssh" after a quick test.
Problem only occurs when string begins "rsh".

Receptional Andy

8:07 am on Jun 24, 2008 (gmt 0)



I suspect this might not be a mod_rewrite issue at all (the rule worked for me when I tested). Perhaps it is related to a clash with other software on the server.
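
For anyone who wants to check the pattern itself without touching the server, here is a rough Python sketch. It is not mod_rewrite; it assumes Python's re module treats this simple pattern the same way Apache's regex engine does, and that the rule lives in .htaccess, where the leading slash is stripped before matching. The rewrite() helper is made up for illustration.

```python
import re

# Same pattern as the RewriteRule in the first post.
# Assumption: per-directory (.htaccess) context, so the path is matched
# without its leading slash.
RULE = re.compile(r"^p-([0-9]+)/.*\.htm$")


def rewrite(path):
    """Return the rewritten target for a URL path, or None if the rule does not match."""
    match = RULE.match(path.lstrip("/"))
    if match is None:
        return None
    return "index.php?pid=" + match.group(1)


for path in ("/p-123456/some+product+name.htm", "/p-20371406/rsh1.htm"):
    print(path, "->", rewrite(path))
```

Both paths match, which fits the idea that the pattern is not what rejects the "rsh" URL.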

406 is a strange error to get too, since that implies a problem with the client (i.e. the browser being unable to accept the content provided) rather than something happening on the server. Is there anything interesting in the HTTP headers for the 'rsh' requests?

authcode

8:15 am on Jun 24, 2008 (gmt 0)

10+ Year Member



Is this the information you mean (from Live HTTP headers on Firefox)?

http://www.example.net/p-20371406/rsh1.htm

GET /p-20371406/rsh1.htm HTTP/1.1
Host: www.example.net
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: #*$!=true; pass=#*$!; user=#*$!; PHPSESSID=#*$!
Cache-Control: max-age=0

HTTP/1.x 406 Not Acceptable
Date: Tue, 24 Jun 2008 08:12:07 GMT
Server: Apache
Keep-Alive: timeout=3, max=99
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1

(I've replaced some sensitive information).

Receptional Andy

8:25 am on Jun 24, 2008 (gmt 0)



The HTTP status code definition of 406 [w3.org] implies that the error can be caused by the server response being a content type that the browser can't accept. In this case, the server returns text/html, which is also in the list of content types accepted by the browser.

So, I'm not sure it's a 406 error in the true sense of the word. I'm wondering if it's the result of mod_security or something like that.
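
The Accept check described above can be sketched roughly in Python. This is a simplified illustration, not how Apache negotiates content: it ignores media-type parameters other than q, and the function names parse_accept and is_acceptable are invented for the example. The header string is the one from the Firefox request posted earlier.

```python
def parse_accept(header):
    """Split an Accept header into (media_range, q) pairs; q defaults to 1.0."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        ranges.append((parts[0], q))
    return ranges


def is_acceptable(media_type, header):
    """True if the most specific matching media range has q > 0."""
    main_type = media_type.split("/")[0]
    best = None  # (specificity, q) of the most specific match so far
    for media_range, q in parse_accept(header):
        if media_range == media_type:
            specificity = 2
        elif media_range == main_type + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if best is None or specificity > best[0]:
            best = (specificity, q)
    return best is not None and best[1] > 0


ACCEPT = ("text/xml,application/xml,application/xhtml+xml,"
          "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5")
print(is_acceptable("text/html", ACCEPT))  # True
```

The sketch only confirms that the browser's header lists text/html; it says nothing about what the server would have served for the original request.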

[edited by: Receptional_Andy at 8:26 am (utc) on June 24, 2008]

authcode

8:51 am on Jun 24, 2008 (gmt 0)

10+ Year Member



I don't know anything about RSH but isn't it a means of accessing a server by means of a command prompt type thing? I'm on a shared host so it's likely to be a security issue to prevent people trying to invoke RSH surreptitiously. I'd like to know for sure though, and if there's anything I can do about it before I have to contact the hosting company.

Receptional Andy

8:57 am on Jun 24, 2008 (gmt 0)



Authcode, that's exactly what I suspect - a false positive by server security software (mod_security?), trying to block access to 'remote shell'. In which case it would require server-side configuration, so I think talking to your host is the best plan.

authcode

9:15 am on Jun 24, 2008 (gmt 0)

10+ Year Member



Thanks for your help Andy. I don't think my host would be willing to change their security just for me so I might just amend my PHP to "fix" strings that begin "rsh".

Interesting issue though - hope this thread is useful to others coming across this problem.

Receptional Andy

9:27 am on Jun 24, 2008 (gmt 0)



I don't think my host would be willing to change their security just for me

It might be worth asking, since fixing it would not make the host less secure - it would just remedy a false positive.

authcode

9:27 am on Jun 24, 2008 (gmt 0)

10+ Year Member



Just to confirm:-
I now place an underscore at the beginning of any string beginning "rsh". Works great.
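
A minimal sketch of that workaround, written in Python for illustration rather than the PHP actually used on the site. The function name safe_slug is hypothetical, and the case-insensitive check is an assumption, since the thread does not establish whether the host's filter cares about case.

```python
def safe_slug(slug, prefix="_"):
    """Prepend a prefix when the slug begins with "rsh".

    Assumption: the check is case-insensitive, to be on the safe side.
    """
    if slug.lower().startswith("rsh"):
        return prefix + slug
    return slug


print(safe_slug("rsh1"))               # _rsh1
print(safe_slug("some+product+name"))  # some+product+name
```

The prefix is a parameter so a different character can be swapped in without touching the check. Since the rewrite rule only captures the numeric ID, changing the name part does not affect which product is loaded.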

g1smd

9:53 am on Jun 24, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I always avoid spaces and underscores in URLs for various reasons.

authcode

10:26 am on Jun 24, 2008 (gmt 0)

10+ Year Member



Good point. Might use a hyphen instead. Cheers.

jdMorgan

6:24 pm on Jun 24, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Quick test:

Try adding


Options -MultiViews

to your .htaccess or httpd.conf file... This could be content-negotiation grabbing the request before mod_rewrite can do anything with it. If you don't use content negotiation, then turn off MultiViews, as it often causes "unexpected" problems that look like rewrites gone bad.

Jim

authcode

2:38 pm on Jun 29, 2008 (gmt 0)

10+ Year Member



Just to follow up jdMorgan's suggestion...
I added:

Options -MultiViews

to my .htaccess file but this had no effect on the problem.

jdMorgan

3:30 pm on Jun 29, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'll go with the mod_security theory, then. If possible, please do ask your host about this -- Even if they refuse to fix their too-ambiguous "rsh" pattern, it would be useful for our members here to know/confirm the cause of your problem.

[added] The content-type in the server response posted above indicates the content-type of the 406 error page, not that of the originally-requested file/page. So the fact that it was text/html and that text/html is an acceptable content-type for the browser does not really mean anything. [/added]

Thanks,
Jim

[edited by: jdMorgan at 3:32 pm (utc) on June 29, 2008]