Help with .htaccess

         

Merganser

3:39 pm on Jun 5, 2010 (gmt 0)

10+ Year Member



I have some legitimate pages that use a GET variable. However, every other day something requests one of them with a bogus value in that GET variable, like this:

www.mysite.com/search.php?var=http://217.218.225.2:2082/index.html?

where the illegitimate portion of this request is everything that follows the = sign.

I simply want to return an error in this specific instance and not serve any page, but I cannot get my .htaccess to work. Any help is appreciated.

The code I am using is below (only the last segment is related to this discussion, but I included my entire file in case some earlier portion is influencing the functionality):
---------------------------

# Prevent this .htaccess file from being viewed
<Files .htaccess>
order allow,deny
deny from all
</Files>

# Prevent Listing of Directory
Options All -Indexes

# Block User by IP
order allow,deny
deny from 92.241.182.0
deny from 91.120.21.93
allow from all

RewriteEngine On

# Block Traffic From Referrers
RewriteCond %{HTTP_REFERER} [www\.iaea\.org$...] [NC]
RewriteRule ^.*$ - [F,L]

# Block Bad Bots
RewriteCond %{HTTP_USER_AGENT} Tasapspider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Tagoobot [NC]
RewriteRule ^.*$ - [F,L]

# New code to fix bogus GET variable
RewriteCond %{REQUEST_URI} .*217.218.225.2.* [NC]
RewriteRule ^.*$ - [F,L]

g1smd

5:50 pm on Jun 5, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In the rules above [F] implies [L], so only [F] is required.
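
As a sketch of that simplification, the bad-bot block from the file above could be reduced to the following. The `^` pattern is a swapped-in shorthand that matches any URL-path; it is not from the original file, which used `^.*$`.

```apache
# Block Bad Bots
# [F] returns 403 Forbidden and stops rule processing by itself,
# so a separate [L] flag is not needed.
RewriteCond %{HTTP_USER_AGENT} Tasapspider [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Tagoobot [NC]
RewriteRule ^ - [F]
```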


From yesterday, [webmasterworld.com...] to get you started.

Merganser

9:13 pm on Jun 6, 2010 (gmt 0)

10+ Year Member



I did not derive much help from the prior thread. I am not wanting to prevent all URL variables, just one in particular at the moment (http://217.218.225.2:2082/index.html?).

I was thinking that my RewriteCond above would match requests containing "217.218.225.2" and that the RewriteRule would then cause the failure. But this does not seem to happen when testing. I am not a pro with regular expressions, but I can struggle through them. I feel like I am missing some portion of a larger picture. Should I be using something other than %{REQUEST_URI}?

Merganser

10:48 pm on Jun 6, 2010 (gmt 0)

10+ Year Member



OK - I seem to be getting it working by using QUERY_STRING instead of REQUEST_URI. Anyone know where I can find a reference detailing the difference between QUERY_STRING, REQUEST_URI, and THE_REQUEST?
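
The working rule itself was not posted, so the following is only a guess at what it might look like. It assumes the one change was swapping the variable, and it additionally escapes the dots so they match literal periods rather than any character.

```apache
# New code to fix bogus GET variable
# %{REQUEST_URI} holds only the URL-path (/search.php), so the IP address
# never appears in it; the value after "?" lives in %{QUERY_STRING}.
RewriteCond %{QUERY_STRING} 217\.218\.225\.2
RewriteRule ^ - [F]
```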

SteveWh

1:45 am on Jun 7, 2010 (gmt 0)

10+ Year Member



Those requests you are blocking are "remote file inclusion" (RFI) attacks, and you are correct to block them. However, I would suggest using a rule that blocks all of them instead of just that one IP address.

Block any request where the query string contains =http:// or =ftp://. First make sure your own pages don't send any requests to any of your other pages using that format. If you find any, you should be able to recode them so they don't; then you can implement the ban.
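
A minimal sketch of such a ban, assuming `RewriteEngine On` is already set earlier in the file. The exact pattern is an illustration, not a tested production rule. It deliberately does not start with `=`, because mod_rewrite treats a condition pattern with a leading `=` as a plain string comparison rather than a regular expression.

```apache
# Forbid any request whose query string has a parameter value
# beginning with http://, https:// or ftp://.
# (^|&)    start of the query string, or a parameter separator
# [^=&]*   the parameter name
RewriteCond %{QUERY_STRING} (^|&)[^=&]*=(https?|ftp):// [NC]
RewriteRule ^ - [F]
```

This matches the literal characters only. A request that URL-encodes the colon or slashes (for example `%3A%2F%2F`) would need an extended pattern.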

You can also ban any request whose query string contains a question mark. The question mark that introduces the query string in the URL is only a delimiter in the incoming request; it is part of neither the URL-path nor the query string itself. A legitimate query string therefore normally contains no question mark, so banning any query string that contains one will not ban all query strings.
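
A hedged sketch of the question-mark ban, again assuming the rewrite engine is already enabled. It would catch the request from the first post, whose query string ends in `?`.

```apache
# Forbid requests whose query string itself contains a literal "?".
# The "?" that separates the URL-path from the query string is not part
# of %{QUERY_STRING}, so ordinary requests are unaffected.
RewriteCond %{QUERY_STRING} \?
RewriteRule ^ - [F]
```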

In your PHP configuration (php.ini or .htaccess) you might be able to set allow_url_fopen to Off and allow_url_include to Off, which will also help prevent these requests from doing damage. Also set register_globals to Off.


"Anyone know where I can find a reference detailing the difference between QUERY_STRING, REQUEST_URI, and THE_REQUEST?"


[httpd.apache.org...]

jdMorgan

2:07 pm on Jun 7, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The detailed documentation for these variables isn't easy to find, but you could test them by writing a 'test rule' to see what they all contain...
 RewriteRule ^mydirectory/mypage\.html$ http://www.example.com/?Http_Host=%{HTTP_HOST}&Request_Uri=%{REQUEST_URI}&Query_String=%{QUERY_STRING}&The_Request=%{THE_REQUEST} [R=301,L]


With this rule in place, request the URL-path /mydirectory/mypage.html from your server, then look at the address bar after the redirect.

Doing an HTTP/1.1 GET on a test URL of
http://www.example.com/mydirectory/mypage.html?myqueryparm1=this&myqueryparm2=that#myfragment=shard
you'll get:

%{HTTP_HOST} = www.example.com
%{REQUEST_URI} = /mydirectory/mypage.html
%{QUERY_STRING} = myqueryparm1=this&myqueryparm2=that
%{THE_REQUEST} = GET /mydirectory/mypage.html?myqueryparm1=this&myqueryparm2=that#myfragment=shard HTTP/1.1

The RewriteRule pattern must match "mydirectory/mypage.html" in .htaccess or within a <Directory /> container in a server config file, or it must match "/mydirectory/mypage.html" in a server config file outside of any <Directory> container.

Note that %{REQUEST_URI} and %{QUERY_STRING} can and will be updated by internal rewrites done in the context of this HTTP request. However, %{HTTP_HOST} and %{THE_REQUEST} will not get updated in this manner, since an internal rewrite takes place entirely "inside" this host, and %{THE_REQUEST} is always the original HTTP request line as sent by the client (e.g. browser) and logged in your raw server access log file.
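
To illustrate why that distinction can matter, here is a sketch of a block that tests %{THE_REQUEST} instead of %{QUERY_STRING}, so it always sees what the client actually sent, regardless of any earlier internal rewrite. The pattern is an assumption for illustration, not a rule taken from this thread.

```apache
# Match against the raw request line, e.g.
#   GET /search.php?var=http://217.218.225.2:2082/index.html? HTTP/1.1
# \?    the delimiter that starts the query string
# \S*   any run of non-space characters before the offending value
RewriteCond %{THE_REQUEST} \?\S*=(https?|ftp):// [NC]
RewriteRule ^ - [F]
```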

Note that most user-agents do not send the fragment to the server, an exception being some browsers built on Apple's WebKit. The target of a fragment is also called a "named anchor" on an HTML page, and is defined using <a name="shard"> or an id attribute such as <div id="shard">.

Jim

Merganser

5:19 am on Jun 8, 2010 (gmt 0)

10+ Year Member



I think I will change it some based on both of your comments. I especially like the 'test rule' jd - clever. I am very familiar with similar techniques in PHP, but I did not know how to do it with .htaccess.

I should have enough to run with it. Thanks to you both.