Forum Moderators: phranque


Forbidding Spam Query String in Page Request

RewriteCond %{QUERY_STRING} ^.+http:


Busynut

8:32 pm on Feb 4, 2008 (gmt 0)

10+ Year Member



Hi all -

In the last few weeks I've noticed a new kind of spamming taking place... rather than 'referer spam', the spam URL is in the page requests. I'm getting dozens of these a day - they are hitting my phpbb forum as well as a cgi download script. Here are a couple of examples:

/forum/viewforum.php?f=http%3A%2F%2Fwww.spammer.co.uk%2Fforum%2Flovuqo%2Fzil%2F
Http Code: 200 Date: Feb 04 04:53:48 Http Version: HTTP/1.0 Size in Bytes: 5967
Referer: -
Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)

/cgi-bin/sample/download_script.pl?http%3A%2F%2Fwww.spammer.com%2Fadmin%2Fcorreo%2Fenaq%2Fecib%2F
Http Code: 200 Date: Feb 04 07:04:32 Http Version: HTTP/1.0 Size in Bytes: 1175
Referer: -
Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)

These page requests are resulting in Not Found errors and they are leaving dozens of spam URLs each visit - I've been looking for a way to block them and haven't yet succeeded.

I tried this and it isn't working for me (still getting the Not Found error rather than Forbidden):

RewriteCond %{QUERY_STRING} ^.+http:
RewriteRule .* - [L,F]

Any advice on how to block these?

jdMorgan

12:53 am on Feb 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Remove the ".+" and don't start-anchor the pattern. You might want to allow for https as well:

RewriteCond %{QUERY_STRING} https?:
RewriteRule .* - [F]

You might also want to use a more-specific RewriteRule pattern, such as "\.(cgi|php|pl)$".
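Since RewriteCond patterns are ordinary regular expressions, the effect of dropping the ".+" and the start-anchor can be checked outside Apache. The sample query strings below are invented, and use a literal (un-encoded) "http:" purely for illustration; Python's re engine matches these patterns the same way mod_rewrite does:

```python
import re

# Invented sample query strings, modeled on the ones reported in this thread
qs_leading = "http://www.spammer.example/admin/"     # spam URL at the very start
qs_embedded = "f=http://www.spammer.example/forum/"  # spam URL after a parameter name

anchored = re.compile(r"^.+http:")  # original pattern: needs at least one character before "http:"
floating = re.compile(r"https?:")   # suggested pattern: matches anywhere, http or https

assert anchored.search(qs_leading) is None       # the anchored pattern misses a leading spam URL
assert floating.search(qs_leading) is not None   # the unanchored one catches it
assert floating.search(qs_embedded) is not None  # and still catches the embedded case
```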

I've seen a ton of these as well recently, all "attacking" script-type URLs -- and all 403'ed on my servers. :)

Jim

[edited by: jdMorgan at 12:54 am (utc) on Feb. 5, 2008]

Busynut

11:08 pm on Feb 5, 2008 (gmt 0)

10+ Year Member



Many thanks, Jim, for the response. I changed the code as you suggested, but I'm still getting directed to the respective "not found" pages for each script; i.e. for the phpbb forum I'm redirected to its "The forum you selected does not exist" error message. My cgi script has a similar error message: "The URL that was given to this script does not exist in our database." I would have thought the .htaccess would take precedence over any redirection within a script.

jdMorgan

11:01 pm on Feb 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You'll have to 'dig in' to the exact method used to 'map' requests to your .php and .cgi scripts. If that method is implemented using an Apache module which executes before mod_rewrite, then you'll have to put the mod_rewrite code into the script subdirectory itself to make sure it gets executed.

Alternately, modify the scripts to specify what they will *accept* -- Do not implement security checking based on what should be *rejected* as this leaves potentially-gaping security holes. It is much easier to predict and enumerate what should be accepted instead.
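A minimal sketch of that accept-list idea, in Python rather than in the scripts themselves; the allowed hostnames here are invented, and a real implementation would also normalize case and handle ports:

```python
from urllib.parse import urlsplit

# Hypothetical accept-list: only these outgoing-link hosts are considered valid
ALLOWED_HOSTS = {"www.example.com", "www.bbc.co.uk"}

def is_allowed(target_url: str) -> bool:
    # Accept only known-good destinations; everything else is rejected by default
    host = urlsplit(target_url).hostname or ""
    return host in ALLOWED_HOSTS

assert is_allowed("http://www.bbc.co.uk/weather/")
assert not is_allowed("http://www.spammer.example/admin/correo/")
```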

Jim

Busynut

11:51 pm on Feb 7, 2008 (gmt 0)

10+ Year Member



Yes, excellent advice as always Jim and thank you. I am digging in to this further - taking a very close look at my scripts.

SteveWh

12:24 am on Feb 8, 2008 (gmt 0)

10+ Year Member



There's more than one way to do almost anything in htaccess, but however you do it, I believe you need to escape the colon. For example, in your original attempt:

RewriteCond %{QUERY_STRING} ^.+http\:
RewriteRule .* - [L,F]

[edited by: SteveWh at 12:25 am (utc) on Feb. 8, 2008]

g1smd

11:29 pm on Feb 8, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



One thing, doesn't [F] always imply [L] anyway?

jdMorgan

12:37 am on Feb 9, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No need to escape ":" or "/" in mod_rewrite; These are not special characters as they are in PERL, etc.
And yes, [F] implies [L], as documented.
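This is easy to confirm with any regex engine that shares mod_rewrite's treatment of these characters - for example, in Python, the escaped and unescaped colon match identically (sample string invented):

```python
import re

qs = "f=http://www.spammer.example/"

assert re.search(r"http:", qs) is not None   # unescaped ":" matches literally
assert re.search(r"http\:", qs) is not None  # escaping it changes nothing
assert re.search(r"http:", qs).start() == re.search(r"http\:", qs).start()
```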

Jim

EarleyGirl

12:22 am on Feb 22, 2008 (gmt 0)

10+ Year Member



JD - you saved the day! Viewing my access logs, I've been finding all sorts of attempts to add a URL to some really sleazy sites onto the end of index.php? pages. Before I added the lines to .htaccess, the site would redirect back to the index.php? page, but I don't know what it was actually doing to the system (whether it taxes it somehow, or really does what the hacker wants without my being able to tell, etc.). Now I get a forbidden page - exactly what I wanted. Thanks so much!

Badger37

3:10 pm on Mar 5, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



I've been having this same problem!
See this thread: [webmasterworld.com...]

I'm no .htaccess expert(!) - does the snippet of code pasted here work?

Thanks.

jdMorgan

2:45 pm on Mar 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There are really two different problems being discussed: log-file spammers can use either the HTTP_REFERER header or a query string attached to the requested URL-path to inject their URL into your access logs. The query string method might also be used to inject bogus data into your scripts, and so is slightly more dangerous.

You may want to have code that checks for each of these methods.

Jim

Badger37

10:02 am on Mar 7, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hi,
I'm seeing the same problem as Busynut reported here.

i.e. The tracking (AXS) I use for my outgoing links is being attacked, with spammers' URLs being added - not via the 'referrer'.

It looks like lots of others are seeing the same attack. The URLs change, but searching Google for this one snippet, %2Fadmin%2Fcorreo%2Fenaq%, shows many messed-up logs and some discussion - including this thread. This is just one example; there are lots more similar things going on.

The link ends up as a 404 so doesn't seem to accomplish much? As I mentioned I'm no .htaccess expert (but I'm good as copy & paste!) :)

Would the .htaccess code mentioned here help? Turn the 404 to forbidden, or any way to only allow my outgoing links to be followed?

Thanks in advance for any suggestions.

jdMorgan

3:02 pm on Mar 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> i.e. The tracking (AXS) I use for my outgoing links is being attacked with spammers URLs being added - not from the 'referrer'.

That doesn't mean anything to those of us unfamiliar with AXS. You might consider showing an example.com URL with and without the spammer-added string, in the interest of clarity.

> The link ends up as a 404 so doesn't seem to accomplish much?

If the spammer's goal is simply to leave a link in as many log files as possible, then they don't care what the server response code is. Once the link is dropped into your log file, they are happy.

This is because there are many hosting companies and Webmasters who foolishly allow log files and server stats to be publicly-accessible (without any password protection) and therefore crawlable by the search engines. Because of this, there's a fairly big industry growing up around log-spamming for hire, to get inbound links to low-quality Web sites.

If everyone would password-protect their log and stats files, this problem would simply go away.

Jim

[edited by: jdMorgan at 3:03 pm (utc) on Mar. 7, 2008]

Badger37

10:35 am on Mar 8, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



I use a script called AXS (a quite popular, but old, CGI tracking script - a Google search on AXS will tell you more) - but as the previous posts here show, the problem isn't due to this script and affects lots of different setups.

This is the simple code:
<a href="/cgi-bin/axs/ax.pl?http://www.myoutgoinglink.com">

Don't worry my logs aren't available to search engines or the public!

The good thing is that after a day or so of 'attacks' the providers or zombie PCs seem to get closed down (or they move on) before coming back a few weeks later.

I can find lots of this going on (as some people leave their logs open) so I can see it's quite a widespread problem - I was just looking for some help/tips...

If things get out of control I'll stop tracking out-going links but that would be a shame as it's useful stats.

jdMorgan

3:34 pm on Mar 8, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OK, we have half an example, then:
This is the simple code:

<a href="/cgi-bin/axs/ax.pl?http://www.myoutgoinglink.com">

So the spammer link would be like this?


<a href="/cgi-bin/axs/ax.pl?http://www.spammy-outgoinglink.com">

In this case, it's a bit hard to "grab hold of" something you can use to determine whether the outgoing link is spammy or legitimate. If your list of outgoing links is short, then you can compare the outgoing link to those that you expect, and reject those that are spammy.

Alternately, there may be something about the spammy domains that you can use: for example, if they contain certain keywords, such as "casino," "poker," or "credit."

For the first case, something like this should work:


RewriteCond %{QUERY_STRING} !^http://allowed-outgoing-link1
RewriteCond %{QUERY_STRING} !^http://allowed-outgoing-link2
RewriteCond %{QUERY_STRING} !^http://allowed-outgoing-link3
RewriteRule ^cgi-bin/axs/ax\.pl$ http://www.example.com/cgi-bin/axs/ax.pl? [R=301,L]

while for the second, you might use:

RewriteCond %{QUERY_STRING} ^http://.*(casino|poker|credit)
RewriteRule ^cgi-bin/axs/ax\.pl$ http://www.example.com/cgi-bin/axs/ax.pl? [R=301,L]

In either case, the RewriteRule removes the "bad" query string.
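For the first variant, note that consecutive RewriteCond lines AND together by default, so the negated conditions mean "redirect only if the query string matches none of the allowed prefixes." A rough Python model of that logic, with placeholder prefixes:

```python
import re

# Placeholder allow-list, standing in for the negated RewriteCond patterns
allowed_prefixes = [
    r"^http://allowed-outgoing-link1",
    r"^http://allowed-outgoing-link2",
    r"^http://allowed-outgoing-link3",
]

def should_strip(query_string: str) -> bool:
    # The rule fires only when every negated condition holds,
    # i.e. no allowed prefix matches the query string
    return all(re.match(p, query_string) is None for p in allowed_prefixes)

assert not should_strip("http://allowed-outgoing-link2/page")
assert should_strip("http://www.spammy.example/casino/")
```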

Jim

Badger37

5:07 pm on Mar 8, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks for that Jim.

Sorry for the 'half' reply - I didn't see that.

http://www.bbc.co.uk/weather/5day.shtml?id=http%3A%2F%2Fwww.spammmmy.com%2Far%2Farticles%2Fjed%2Fumut%2F&links

Above is a spam version - they've used one of my links to the BBC weather, hope that's not too specific for WebmasterWorld... ;)

Where you say: In either case, the RewriteRule removes the "bad" query string.

What will the change actually do? Stop the attack from adding the link, or change the 404?

Thanks again.

[edited by: jdMorgan at 5:27 pm (utc) on Mar. 8, 2008]
[edit reason] de-linked for the BBC's sake. [/edit]

jdMorgan

5:27 pm on Mar 8, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It will remove the query string, turning that link into a link back to your own site. So, if it gets indexed, it provides no benefit to the spammer.

You can do whatever you like with the spammer's requests -- I was concentrating on detection rather than disposition in the examples above.

Jim

Badger37

5:34 pm on Mar 8, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks for all the help...

Badger37

3:08 pm on Mar 16, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



Back again - the attacks went away, but have now returned, so I'm having a go with my .htaccess

I've found that all the dodgy links are formatted with: http%3A%2Fwww

Can I somehow use your second example to pick these rather than using the poker keywords?

Sorry to be so dim and thanks again :)

RewriteCond %{QUERY_STRING} ^http%3A%2Fwww.*

RewriteRule ^cgi-bin/axs/ax\.pl$ http://www.example.com/cgi-bin/axs/ax.pl? [R=301,L]

[edited by: Badger37 at 3:10 pm (utc) on Mar. 16, 2008]

Badger37

5:15 pm on Mar 16, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



Even though I'm not savvy to this .htaccess code I've added it to one of my sites!?!

The external links all seem to work so I can't have completely trashed things :)

I would appreciate it if anyone can tell me if it's a valid entry when they have a minute.

RewriteCond %{QUERY_STRING} ^http%3A%2Fwww.*
RewriteRule ^cgi-bin/axs/ax\.pl$ http://www.example.com/cgi-bin/axs/ax.pl? [R=301,L]

jdMorgan

6:36 pm on Mar 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You'll need to escape those "%" characters, and note that the URL-format is incorrect, since the hex-encoded sequence would decode as "http:/www" (only one slash, so invalid). As a result, this code won't catch a properly-formatted URL. In addition your example shows the query string starting with "id=" and your pattern won't match that. Fixing all these issues:

RewriteCond %{QUERY_STRING} http\%3A(\%2F)+www\.
RewriteRule ^cgi-bin/axs/ax\.pl$ http://www.example.com/cgi-bin/axs/ax.pl? [R=301,L]

You may find it still won't work, because the QUERY_STRING variable may get decoded before it is tested, and therefore not match the pattern. In that case, you'd be looking for:

RewriteCond %{QUERY_STRING} http:/+www\.

And if all else fails, you could examine THE_REQUEST, to look at the un-decoded, original client request line.
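The encoded-versus-decoded distinction can be illustrated with Python's urllib; the sample query string is invented, and this only models which form each pattern would see:

```python
import re
from urllib.parse import unquote

raw = "id=http%3A%2F%2Fwww.spammy.example%2Far%2F"  # query string as the client sent it
decoded = unquote(raw)                              # "id=http://www.spammy.example/ar/"

assert re.search(r"http%3A(%2F)+www\.", raw) is not None  # matches the raw, %-encoded form
assert re.search(r"http:/+www\.", decoded) is not None    # matches only the decoded form
assert re.search(r"http:/+www\.", raw) is None            # the decoded-form pattern misses the raw string
```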

Jim

[edit] Corrections as noted below. [/edit]

[edited by: jdMorgan at 7:41 pm (utc) on Mar. 23, 2008]

incrediBILL

10:25 pm on Mar 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Be careful with those "spam" URLs you see, because there could be a malicious script sitting at that address. The technique you're seeing is commonly used by botnets to infect all sorts of things, and they've been getting more aggressive lately, randomly testing URIs on all types of sites looking for any vulnerability.

Blocking any request with http:// in the QUERY_STRING will most likely save your bacon if you have any open-source software installed, like WordPress, as this is how they upload their scripts to the server when they find a vulnerability.

Badger37

10:36 am on Mar 17, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thanks for the update.

Looking at the attacks, not all hijacked URLs have id= in them. But they do all seem to have http%3A%2F%2Fwww, either at the start of the address or tagged on to the back of a real out-going link.

I've updated my .htaccess to be like this:
RewriteCond %{QUERY_STRING} =http\%3A(\%2F)+www\.
RewriteRule ^cgi-bin/axs/ax\.pl$ http://www.example.com/cgi-bin/axs/ax.pl? [R=301,L]

Now when I follow a hijacked link in my logs I get a 404 from my site :)

But for the links that have the spammy link tacked on behind a real link (which is the majority), this doesn't have any effect, and following the link would still take you to the real site with the rubbish at the end of the URL.

NB. I tried your other example:
RewriteCond %{QUERY_STRING} =http:/+www\.
But this also seemed to only catch the first type of URL.

As 'we' appear to be half way there now, is there some syntax that will catch http%3A%2F%2Fwww wherever it is in the address?

Thanks again for all the advice given in this thread!

[edited by: Badger37 at 10:44 am (utc) on Mar. 17, 2008]

jdMorgan

5:45 pm on Mar 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Now when I follow a hijacked link in my logs I get a 404 from my site.

Why? This isn't really the desired function (see previous posts). You should be getting a 301 redirect to http://www.example.com/cgi-bin/axs/ax.pl (which I presumed was a valid URI).

> catch http%3A%2F%2Fwww wherever it is in the address

Something like:


# BLOCK attempts to use our server as a proxy, but allow valid absolute URIs
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /?http:// [NC]
RewriteCond %{THE_REQUEST} !^[A-Z]{3,9}\ /?http://([^.]+\.)*example\.com(:80|443)?/ [NC]
RewriteRule .* - [F]
#
# Block URL injection attempts in request URL-path
RewriteCond $1 http\%3A(\%2F)+ [OR]
# Block URL injection attempts in query string
RewriteCond %{QUERY_STRING} http\%3A(\%2F)+
RewriteRule (.*) - [F]

You'll have to test this to make sure it blocks the bad guys but doesn't mess up any outgoing link tracking on your site.
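As a rough model of how the first two conditions interact (block raw absolute-URI requests unless they point at your own host): the request lines below are invented examples, and Python's re only approximates mod_rewrite's matching here:

```python
import re

# A proxy-style request carries a full absolute URI in the request line,
# e.g. "GET http://host/path HTTP/1.0" instead of "GET /path HTTP/1.0"
proxy_probe = re.compile(r"^[A-Z]{3,9} /?http://", re.I)
own_host = re.compile(r"^[A-Z]{3,9} /?http://([^.]+\.)*example\.com(:80|443)?/", re.I)

def blocked(request_line: str) -> bool:
    # Fires only for absolute-URI requests that do NOT target our own host
    return bool(proxy_probe.search(request_line)) and not own_host.search(request_line)

assert blocked("GET http://www.spammy.example/page HTTP/1.0")   # proxy abuse: blocked
assert not blocked("GET http://www.example.com/page HTTP/1.0")  # our own host: allowed
assert not blocked("GET /page?x=1 HTTP/1.1")                    # ordinary request: allowed
```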

Jim

[edit] Corrections as noted below. [/edit]

[edited by: jdMorgan at 7:42 pm (utc) on Mar. 23, 2008]

Badger37

6:54 pm on Mar 17, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



No luck with that I'm afraid.
I'm assuming the only change I need to make to your code is the example.com address?

Using your latest code produces a 404 on my site if the injected address starts with 'http%3A%2F%2Fwww'

http ://www.mysite.com/cgi-bin/axs/http%3A%2F%2Fwww.spammysite.com%2Fblog%2Fwp-content%2Fthemes%2Fsquares%2Forelura%2Fwageno%2F

Or if the injected address is added to the real out-going URL:

http ://www.real-outgoing-link.com/?from=http%3A%2F%2Fwww.spammysite.de%2Fcontent_system%2Fola%2Fitil%2F&to=LBG&action=search

Any ideas?

As I mentioned I'm no expert with this and the syntax means nothing to me, so I'm only following your sample code - but I do trust you :)

<EDIT> Added a space after http to stop the link!

[edited by: Badger37 at 6:57 pm (utc) on Mar. 17, 2008]

Busynut

12:09 am on Mar 19, 2008 (gmt 0)

10+ Year Member



Hi all....
I received a sticky mail inquiry about this thread and thought I should come back in here and report for anyone's benefit how things are going. The rewrite rule suggested by Jim and SteveWh didn't work for me.... naturally I assumed I did something really stooopid. Well, after staring at it until my eyes were blurry I finally realized what was wrong -- the query strings in my logs did NOT include the colon after the http. So that's why the rule wasn't catching any of the critters!

When I changed the rule to this it worked exactly as hoped:
RewriteCond %{QUERY_STRING} http
RewriteRule .* - [F]

It's stopping all requests in my php forum. However... this rule still isn't stopping the spamming URLs in my cgi script. I did some studying and found that sometimes you have to put an .htaccess inside the cgi-bin folder itself in order for it to work in there - but that wasn't the solution for me. I never did solve that half of the puzzle. The cgi download script I'm using is old, so I'm sure it's obsolete by now and lacking in security routines. So I'm looking for a new one.
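For anyone wondering why dropping the colon made the difference: on servers where QUERY_STRING is still %-encoded when mod_rewrite tests it, "http:" can never match, because the colon arrives as %3A. A quick regex check (sample string invented):

```python
import re

# Query string as it appears in the logs: still %-encoded
encoded = "f=http%3A%2F%2Fwww.spammer.example%2F"

assert re.search(r"http:", encoded) is None      # the colon is %3A, so this never matches
assert re.search(r"http", encoded) is not None   # the bare keyword catches the encoded form
```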

Hope this information helps. Thanks once again to everyone in here!

Badger37

5:57 pm on Mar 23, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hi Jim,
I've tried your suggestions; am I a lost cause?

jdMorgan

6:26 pm on Mar 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> lost cause

No, but I certainly can't do this for you, since I'm likely several thousand miles away...
Please take this as cheerleading: This is your project, and you can do it.

Now what I can tell you is that the code itself produces a 403, and the only way I can think of to get a 404 out of it is if you have declared a custom 403 error page, but don't actually have one on your site. Your server error log should make this quite clear if it is in fact the problem.

Jim

Badger37

6:37 pm on Mar 23, 2008 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hi Jim,
I do have a 403 and a 404 error page (actually, on this test site the 403/404 pages are the same, which might be why I'm seeing a 404).

Did you see my post from the 17th which shows the latest results your .htaccess suggestions produced?

"I'm assuming the only changes I need to make to your code are to the pipe symbol and the example.com address?"

"Or if the injected address is added to the real out-going URL:
http ://www.real-outgoing-link.com/?from=http%3A%2F%2Fwww.spammysite.de%2Fcontent_system%2Fola%2Fitil%2F&to=LBG&action=search"

i.e. If the spam code is 'appended' to the out-going link, then the suggested .htaccess code doesn't have any effect. Most of the problems are like this.

[edited by: Badger37 at 6:44 pm (utc) on Mar. 23, 2008]

jdMorgan

7:42 pm on Mar 23, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Perhaps this typo from my posts on the 16th and 17th?

RewriteCond %{QUERY_STRING} =http\%3A(\%2F)+
should be:
RewriteCond %{QUERY_STRING} http\%3A(\%2F)+

Jim

[edited by: jdMorgan at 7:43 pm (utc) on Mar. 23, 2008]
