Forbidding Spam Query String in Page Request

RewriteCond %{QUERY_STRING} ^.+http:

         

Busynut

8:32 pm on Feb 4, 2008 (gmt 0)

10+ Year Member



Hi all -

In the last few weeks I've noticed a new kind of spamming taking place... rather than 'referer spam' - the spam url is in the page requests. I'm getting dozens of these a day - they are hitting my phpbb forum as well as a cgi download script. Here's a couple examples:

/forum/viewforum.php?f=http%3A%2F%2Fwww.spammer.co.uk%2Fforum%2Flovuqo%2Fzil%2F
Http Code: 200 Date: Feb 04 04:53:48 Http Version: HTTP/1.0 Size in Bytes: 5967
Referer: -
Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)

/cgi-bin/sample/download_script.pl?http%3A%2F%2Fwww.spammer.com%2Fadmin%2Fcorreo%2Fenaq%2Fecib%2F
Http Code: 200 Date: Feb 04 07:04:32 Http Version: HTTP/1.0 Size in Bytes: 1175
Referer: -
Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)

These page requests are resulting in Not Found errors and they are leaving dozens of spam urls each visit - I've been looking for a way to block them and haven't yet succeeded.

I tried this and it isn't working for me (still getting the Not Found error rather than Forbidden):

RewriteCond %{QUERY_STRING} ^.+http:
RewriteRule .* - [L,F]

Any advice on how to block these?

Badger37

1:47 pm on Mar 25, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks for all the efforts but I'm afraid this still failed - I got the same results with the last syntax.

This is all possibly related to 'cross-site scripting' attacks. The attacks seem to disappear for a few weeks and then return. As I've mentioned in a few places here, I can see others are also seeing the same problem but can't see where anyone has got to grips with it. For example I see exactly the same problem as Busynut reported in the first post in this thread.

Think I'll give up for the time being :(

jdMorgan

5:08 pm on Mar 25, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Too bad, as this approach works perfectly on my servers... :(

It may have to do with your meaning of "outgoing links" and how those are handled on your server. I can't tell from what you've posted.

Jim

EarleyGirl

1:58 am on Jun 2, 2008 (gmt 0)

10+ Year Member



I'm having another possible hack attempt. I'm seeing this in my access logs: domain.com/dir/subdir/index.php?start=20

Due to using mod_rewrite for seo, there are no obvious PHP pages in this section. Adding index.php?start=20 at the end of the subdirectory means they are looking for/attempting something.

How can I use .htaccess / mod_rewrite to either redirect to an error page or redirect the URL back to domain.com/dir/subdir/ (or even the home page) if someone else attempts this? I'm already using the following:
RewriteCond %{QUERY_STRING} https?:
RewriteRule .* - [F]
Is there a way to also incorporate what I need to the existing code or would it require another RewriteCond/Rule?

These guys really bug me so any help would be most appreciated!

EG

SteveWh

3:31 am on Jun 2, 2008 (gmt 0)

10+ Year Member



EarleyGirl, that's not really worth doing. Looks like you've got a forum, maybe SMF? They're requesting a page that shows the latest 20 posts, or latest 20 threads, or something like that.

There's nothing inherently malicious in those requests. I.e. they're not hack attempts.

When you use mod_rewrite for SEO, you're probably rewriting pages that look like "this-thread-topic.html" to "index.php?topic=NNNN". Remember that your rewritten requests are sent back to your server in the new "index.php" form. If you start banning requests that use the .php form, you'll be banning the legitimately rewritten requests, too.

EarleyGirl

3:49 am on Jun 2, 2008 (gmt 0)

10+ Year Member



Hi SteveWh,
Thanks for your reply. No, we are not using a forum. We have a CMS using PHP with sections of articles. All PHP pages have been rewritten as .html for all sections except the news. He was not in the news section. There would be no need to request what this person has requested and there are no active forums that would give him the idea that he could search for more "posts" that way. All I want to do is let people know that after 10 seconds on the site, if they start requesting pages that are nonexistent (as he did), their request will redirect them to an error page or back to the home page of the site.

Thanks,
EG

jdMorgan

4:30 am on Jun 2, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The following snippet will redirect any direct client request for /<any-directories>/index.php?start=<numbers> to http://www.example.com/<any-directories>/

That is, it will strip off the "index.php" and the query string, and redirect to what is left. But it will only do this if the index.php URL is directly requested by a client (browser or robot); this is to prevent interference with your existing internal rewrites.


RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\?start=[0-9]+
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]

That is a point fix. In most cases, I recommend doing this kind of redirect for any "index.php" request regardless of any query string, as long as you're sure you never link to /index.php from within your own site (i.e. all your own links are to "/" only). In that case, the code above is reduced to:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]
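
For anyone adapting it, here is the same general rule again with comments on what each part is meant to do. This is only an annotated sketch: it assumes "RewriteEngine On" is already present earlier in the same .htaccess file, and www.example.com is a placeholder for the real hostname.

```apache
# Assumption: "RewriteEngine On" appears earlier in this .htaccess file.

# THE_REQUEST holds the original request line as sent by the client, e.g.
#   GET /dir/subdir/index.php?start=20 HTTP/1.1
# Internal rewrites do not change it, so this condition is true only when
# the client itself asked for .../index.php.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php
# $1 captures the directory path in front of index.php (empty at the site root).
# The trailing "?" on the substitution drops the original query string.
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]
```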

Jim

[edit] Corrections as noted below. [/edit]

[edited by: jdMorgan at 7:46 pm (utc) on June 2, 2008]

SteveWh

5:01 am on Jun 2, 2008 (gmt 0)

10+ Year Member



Nice job, Jim, in about the same amount of time it took me to do a much simpler one. Suspect you've been doing this longer than I have. I'll post the following anyway.

-----

Here's skeleton code for what you want to do, based on your example above. This should be a 2nd RewriteCond/Rule section, in order to keep things modular and not make any one section too complicated.

You can make it more general by using a more generalized regular expression for the part that says "start=20".

#RewriteCond %{REQUEST_URI} ^/dir/subdir/index\.php$ [NC]
RewriteCond %{QUERY_STRING} ^start=20$ [NC]
RewriteRule .* - [F]

#You'd redirect to other pages by using one of the following RewriteRules instead, but consider whether it might be a robot making these requests. If it is, why bother redirecting?

#RewriteRule ^(.*)$ http://example.com/ [R=301,L]
#RewriteRule ^(.*)$ http://example.com/dir/subdir/ [R=301,L]
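
As a rough sketch of the more general form mentioned above: the version below accepts any start number and tests the client's original request line, so internally rewritten requests for index.php are not caught. The /dir/subdir/ path comes from the example in the question; placing the .htaccess in the document root and using [0-9]+ for the number are assumptions, and this has not been tested on the site in question.

```apache
# Sketch only: assumes this .htaccess sits in the document root and that
# /dir/subdir/ is the section being probed.
# THE_REQUEST is the request line as the client sent it, so requests that
# were rewritten internally to index.php are left alone.
# [0-9]+ accepts any start number, not just 20.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /dir/subdir/index\.php\?start=[0-9]+\ HTTP/
RewriteRule ^dir/subdir/index\.php$ - [F]
```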

jdMorgan

5:23 am on Jun 2, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Do mind the warning above:

"But it will only do this if the index.php URL is directly requested by a client (browser or robot); this is to prevent interference with your existing internal rewrites."

Any code which does not check the client request header is likely to interfere with the pre-existing rules that rewrite "friendly" URLs to the back-end script. The result will likely be an "infinite" loop of rewrite/redirect...
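
To illustrate with a made-up configuration (the /articles/ rule and the id parameter are hypothetical, not taken from anyone's actual site), this is roughly how the interference can arise in .htaccess and how the THE_REQUEST check avoids it:

```apache
# Hypothetical pre-existing rule: serve the friendly URL from the back-end script.
RewriteRule ^articles/([0-9]+)\.html$ /index.php?id=$1 [L]

# Unguarded redirect (problematic, left commented out): in .htaccess the
# rules run again after the internal rewrite above, this pattern then
# matches, and the visitor is sent to "/" instead of the article. If
# DirectoryIndex maps "/" back to index.php, the redirect fires again and
# the client loops.
#RewriteRule ^index\.php$ http://www.example.com/? [R=301,L]

# Guarded redirect: THE_REQUEST still shows the URL the client asked for,
# so the condition fails for internally rewritten requests.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php
RewriteRule ^index\.php$ http://www.example.com/? [R=301,L]
```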

Jim

EarleyGirl

8:13 am on Jun 2, 2008 (gmt 0)

10+ Year Member



Thank you both so much. I'll give that a try!

EG

EarleyGirl

6:52 pm on Jun 2, 2008 (gmt 0)

10+ Year Member



Thanks. Steve's method worked but Jim's method below allowed me to consolidate my .htaccess a bit as it took care of the index.php issue throughout. It worked when I added a \ after the {3,9}:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]

"But it will only do this if the index.php URL is directly requested by a client (browser or robot); this is to prevent interference with your existing internal rewrites."

I do have a redirect page that auto redirects to index.php but it seems to redirect just fine to / anyway. It is done with PHP and not mod_rewrite, perhaps that is why it appears to work without conflict.

This is what I currently have in .htaccess. I'm wondering if it should be consolidated or if okay as is?

#Remove question mark if blank query string
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^?]*\?\ HTTP/
RewriteRule (.*) http://www.example.com/$1? [R=301,L]

#Prevent spam http in page request
RewriteCond %{QUERY_STRING} https?:
RewriteRule .* - [F]

#Redirect requests for index.php and start numbers
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]

Thanks again,
EG

jdMorgan

7:54 pm on Jun 2, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Good catch on the missing escape for the literal space. I corrected the code posted above to prevent anyone else having to debug it.

Put your three rules in this order:

1) "Prevent spam http in page request"
2) "Redirect requests for index.php and start numbers"
3) "Remove question mark if blank query string"

The reasoning is that rule #1 (after re-ordering) denies access, so there's no need to bother redirecting the client.
Rule #2 will remove *any* query string, so there's no need to check it to see if it's blank.
Rule #3 then removes blank query strings only if neither of the previous two rules were applied.
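
Put together in that order, the file would look roughly like this. It is assembled from the three rules as posted, with nothing changed except their sequence, and it assumes "RewriteEngine On" is already set earlier in the file:

```apache
# Assumption: "RewriteEngine On" is already set earlier in the file.

# 1) Prevent spam http in page request
RewriteCond %{QUERY_STRING} https?:
RewriteRule .* - [F]

# 2) Redirect requests for index.php and start numbers
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]

# 3) Remove question mark if blank query string
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^?]*\?\ HTTP/
RewriteRule (.*) http://www.example.com/$1? [R=301,L]
```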

Jim

EarleyGirl

11:44 pm on Jun 2, 2008 (gmt 0)

10+ Year Member



Thanks for helping me understand the order, it's much clearer now. Thanks so much, Jim!

EG

EarleyGirl

6:06 pm on Aug 6, 2008 (gmt 0)

10+ Year Member



I have a new spammer/hacker using one of my URLs as such:
/dir/page.php?pageid=http%3A%2F%2Fwww.example.com%2Fphplib-7.2b%2Fpages%2Fgodot%2Fecemi%2

When I tried it I got a 503 Service temporarily unavailable error page. I assume that is a hack attempt and can't be good for the server, correct? This spammer did it four times and then downloaded a bunch of pages in seconds. I've banned the IP but would like to know if I can prevent this from happening by extending what I already have in my .htaccess or adding a new line. I'm still not quite sure why it wasn't stopped by this:
#Prevent spam http in page request
RewriteCond %{QUERY_STRING} https?:
RewriteRule .* - [F]

Hackers and spammers tend to unnerve me so any help would be much appreciated!

jdMorgan

6:27 pm on Aug 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Maybe also check for the "%" which is part of the hex-encoded colon character:

RewriteCond %{QUERY_STRING} https?[:%]
-or-
RewriteCond %{QUERY_STRING} https?(:¦%3A) [NC]

As usual, replace the broken pipe "¦" character with a solid pipe before use.

The 'effect' of this URL injection attempt depends entirely on what your page.php script might do with a URL. If that script is written to disallow references to domains outside your own, then you're fine. If it accepts that URL and 'includes' the file at that URL, you could be in really big trouble.
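
On why the earlier "https?:" pattern let this request through: judging from the sample URL, the colon arrives as %3A, and mod_rewrite tests QUERY_STRING as received, without decoding it, so a literal colon never appears. A sketch of the complete rule using the second condition, with the solid pipe already substituted:

```apache
# QUERY_STRING is tested in its raw, still percent-encoded form, so both
# the literal colon and its encoded form are checked. [NC] makes the match
# case-insensitive, which also covers a lowercase %3a.
RewriteCond %{QUERY_STRING} https?(:|%3A) [NC]
RewriteRule .* - [F]
```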

Jim

EarleyGirl

7:22 am on Aug 7, 2008 (gmt 0)

10+ Year Member



Thanks, Jim! I finally got a chance to try the code and it works. I'll sleep easier tonight. Much appreciated!

too much information

6:38 am on Sep 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've had this same problem lately and didn't think to try this solution, but what I was finding on the other end of the spam domains was often PHP code in the view source of an otherwise very basic and ugly page.

So in my case I was more comfortable with returning a "400 Bad Request" and closing the connection immediately if there is an unusual portion of query string. Just to make sure there is nowhere for an injection attack to go.
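
The actual rule isn't shown here, so the following is only a sketch of one way to return a 400 with mod_rewrite. It assumes Apache 2.2 or later, where the R flag accepts non-redirect status codes, and it borrows the encoded-URL pattern from earlier in the thread as a stand-in for whatever counts as an "unusual" query string on a given site:

```apache
# Sketch only: the pattern is borrowed from earlier in the thread, not the
# rule actually in use on the poster's site.
# With a status outside 300-399, the R flag returns that status and stops
# rewriting; no redirect is sent.
RewriteCond %{QUERY_STRING} https?(:|%3A) [NC]
RewriteRule .* - [R=400,L]
```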

Not to mention that I was getting some legit bots that were starting to try the same query strings as if they were valid pages on my site. The 400 should knock those bogus query strings out of an SE's database pretty fast and prevent them from being passed on.

too much information

7:44 am on Sep 1, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just did a quick check on my logs and I found over 1300 of this type of hit blocked on the most abused page just this week.

One thing I noticed about the site in question recently was that my Google Alert for that domain name has been sending me links to pages that contain my domain in some very strange text blocks that look like some sort of post results but don't look normal. It could be some sort of list of sites that are good targets for this type of abuse.

Since adding my block, I am not getting nearly as many Google Alerts of that type anymore.

Has anyone else set up an alert for their domain name and seen this type of alert show up?

This 47-message thread spans 2 pages.