In the last few weeks I've noticed a new kind of spamming taking place: rather than 'referer spam', the spam URL is in the page request itself. I'm getting dozens of these a day. They are hitting my phpBB forum as well as a CGI download script. Here are a couple of examples:
/forum/viewforum.php?f=http%3A%2F%2Fwww.spammer.co.uk%2Fforum%2Flovuqo%2Fzil%2F
Http Code: 200 Date: Feb 04 04:53:48 Http Version: HTTP/1.0 Size in Bytes: 5967
Referer: -
Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)
/cgi-bin/sample/download_script.pl?http%3A%2F%2Fwww.spammer.com%2Fadmin%2Fcorreo%2Fenaq%2Fecib%2F
Http Code: 200 Date: Feb 04 07:04:32 Http Version: HTTP/1.0 Size in Bytes: 1175
Referer: -
Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)
These page requests are resulting in Not Found errors and they are leaving dozens of spam URLs each visit. I've been looking for a way to block them and haven't yet succeeded.
I tried this and it isn't working for me (still getting the Not Found error rather than Forbidden):
RewriteCond %{QUERY_STRING} ^.+http:
RewriteRule .* - [L,F]
Any advice on how to block these?
This is all possibly related to 'cross-site scripting' attacks. The attacks seem to disappear for a few weeks and then return. As I've mentioned in a few places here, I can see that others are seeing the same problem, but I can't see where anyone has got to grips with it. For example, I see exactly the same problem as Busynut reported in the first post in this thread.
Think I'll give up for the time being :(
Due to using mod_rewrite for seo, there are no obvious PHP pages in this section. Adding index.php?start=20 at the end of the subdirectory means they are looking for/attempting something.
How can I use .htaccess / mod_rewrite to either redirect to an error page or redirect the URL back to domain.com/dir/subdir/ (or even the home page) if someone else attempts this? I'm already using the following:
RewriteCond %{QUERY_STRING} https?:
RewriteRule .* - [F]
Is there a way to incorporate what I need into the existing code, or would it require another RewriteCond/Rule?
These guys really bug me so any help would be most appreciated!
EG
There's nothing inherently malicious in those requests. I.e. they're not hack attempts.
When you use mod_rewrite for SEO, you're probably rewriting pages that look like "this-thread-topic.html" to "index.php?topic=NNNN". Remember that your rewritten requests are sent back to your server in the new "index.php" form. If you start banning requests that use the .php form, you'll be banning the legitimately rewritten requests, too.
Thanks,
EG
That is, it will strip off the "index.php" and the query string, and redirect to what is left. But it will only do this if the index.php URL is requested directly by a client (browser or robot); this is to prevent interference with your existing internal rewrites.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\?start=[0-9]+
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]
[edit] Corrections as noted below. [/edit] [edited by: jdMorgan at 7:46 pm (utc) on June 2, 2008]
-----
Here's skeleton code for what you want to do, based on your example above. This should be a 2nd RewriteCond/Rule section, in order to keep things modular and not make any one section too complicated.
You can make it more general by using a more generalized regular expression for the part that says "start=20".
#RewriteCond %{REQUEST_URI} ^/dir/subdir/index\.php$ [NC]
RewriteCond %{QUERY_STRING} ^start=20$ [NC]
RewriteRule .* - [F]
#You'd redirect to other pages by using one of the following RewriteRules instead, but consider whether it might be a robot making these requests. If it is, why bother redirecting?
#The trailing "?" drops the query string, so the redirected request cannot match this rule again.
#RewriteRule ^(.*)$ http://example.com/? [R=301,L]
#RewriteRule ^(.*)$ http://example.com/dir/subdir/? [R=301,L]
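As a rough sketch of the "more generalized regular expression" mentioned above, the fixed "start=20" could become "start=" followed by any digits. The /dir/subdir/ path is taken from the question. Checking THE_REQUEST rather than QUERY_STRING is my assumption here, so that only URLs typed by the client are denied and internally rewritten requests are left alone.

```apache
# Deny direct client requests for /dir/subdir/index.php?start=<any number>.
# THE_REQUEST is the original request line, e.g.
#   GET /dir/subdir/index.php?start=20 HTTP/1.1
# so requests that only reach index.php via an internal rewrite do not match.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /dir/subdir/index\.php\?start=[0-9]+\ HTTP/
RewriteRule .* - [F]
```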
But it will only do this if the index.php URL is requested directly by a client (browser or robot); this is to prevent interference with your existing internal rewrites.
Any code which does not check the client request header is likely to interfere with the pre-existing rules that rewrite "friendly" URLs to the back-end script. The result will likely be an "infinite" loop of rewrite/redirect...
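To illustrate, here is a minimal sketch of the two kinds of rule side by side. The topic-NNNN.html pattern and the "topic" parameter are hypothetical stand-ins for whatever friendly-URL rewrite a site already has; only the THE_REQUEST check comes from the code earlier in this thread.

```apache
# External redirect: fires only when the client itself asked for index.php,
# because THE_REQUEST holds the original request line and is not changed
# by internal rewrites.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]

# Hypothetical pre-existing internal rewrite (assumed, not from this thread):
# friendly URL -> back-end script. The rewritten request for index.php does
# not trigger the redirect above, since the client never asked for index.php.
RewriteRule ^topic-([0-9]+)\.html$ /index.php?topic=$1 [L]
```

Without the RewriteCond, a request for /topic-123.html would be rewritten to index.php, picked up by the redirect rule on the next pass through .htaccess, and sent off to the home page instead of being served.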
Jim
But it will only do this if the index.php URL is requested directly by a client (browser or robot); this is to prevent interference with your existing internal rewrites.
This is what I currently have in .htaccess. I'm wondering if it should be consolidated or if okay as is?
#Remove question mark if blank query string
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^?]*\?\ HTTP/
RewriteRule (.*) http://www.example.com/$1? [R=301,L]
#Prevent spam http in page request
RewriteCond %{QUERY_STRING} https?:
RewriteRule .* - [F]
#Redirect requests for index.php and start numbers
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]
Thanks again,
EG
Put your three rules in this order:
1) "Prevent spam http in page request"
2) "Redirect requests for index.php and start numbers"
3) "Remove question mark if blank query string"
The reasoning is that rule #1 (after re-ordering) denies access, so there is no need to bother redirecting the client.
Rule #2 will remove *any* query string, so there's no need to check it to see if it's blank.
Rule #3 then removes blank query strings only if neither of the previous two rules was applied.
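Put together, the re-ordered file would look something like the sketch below. The rules themselves are copied unchanged from the post above; the RewriteEngine line is an assumption, shown only for completeness in case it is not already set elsewhere in the file.

```apache
# Assumed to be present already; harmless to repeat.
RewriteEngine On

# 1) Prevent spam http in page request
RewriteCond %{QUERY_STRING} https?:
RewriteRule .* - [F]

# 2) Redirect requests for index.php and start numbers
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1? [R=301,L]

# 3) Remove question mark if blank query string
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^?]*\?\ HTTP/
RewriteRule (.*) http://www.example.com/$1? [R=301,L]
```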
Jim
When I tried it I got a 503 Service temporarily unavailable error page. I assume that is a hack attempt and can't be good for the server, correct? This spammer did it four times and then downloaded a bunch of pages in seconds. I've banned the IP but would like to know if I can prevent this from happening by extending what I already have in my .htaccess or adding a new line. I'm still not quite sure why it wasn't stopped by this:
#Prevent spam http in page request
RewriteCond %{QUERY_STRING} https?:
RewriteRule .* - [F]
Hackers and spammers tend to unnerve me so any help would be much appreciated!
RewriteCond %{QUERY_STRING} https?[:%]
-or-
RewriteCond %{QUERY_STRING} https?(:|%3A) [NC]
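For context, the sample requests at the top of this thread carry the colon in its encoded form (http%3A%2F%2F...), and %{QUERY_STRING} is matched as sent, without URL-decoding, so a pattern that looks only for a literal ":" never sees it. A minimal sketch of the complete rule, pairing the second condition above with the same RewriteRule used earlier in the thread:

```apache
# Deny any request whose query string contains "http" or "https" followed by
# a colon, whether the colon is literal (:) or percent-encoded (%3A / %3a).
RewriteCond %{QUERY_STRING} https?(:|%3A) [NC]
RewriteRule .* - [F]
```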
The 'effect' of this URL injection attempt depends entirely on what your page.php script might do with a URL. If that script is written to disallow references to domains outside your own, then you're fine. If it accepts that URL and 'includes' the file at that URL, you could be in really big trouble.
Jim
So in my case I was more comfortable with returning a "400 Bad Request" and closing the connection immediately if there is an unusual portion of query string, just to make sure there is nowhere for an injection attack to go.
Not to mention that I was getting some legit bots that were starting to try the same query strings as if they were valid pages on my site. The 400 should knock those bogus query strings out of an SE's database pretty fast and prevent them from being passed on.
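For anyone wanting to try the same approach, here is a rough sketch of a 400 response in mod_rewrite. The condition is an assumption: the actual test for an "unusual" query string isn't given here, so this reuses the http/https pattern from earlier in the thread. It also only sets the status code; closing the connection is not covered.

```apache
# Answer with "400 Bad Request" instead of "403 Forbidden".
# When R is given a status outside the 300-399 range, the substitution is
# dropped and rewriting stops, so "-" is used as a placeholder.
RewriteCond %{QUERY_STRING} https?(:|%3A) [NC]
RewriteRule .* - [R=400,L]
```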
One thing I noticed about the site in question recently was that my Google Alert for that domain name has been sending me links to pages that contain my domain in some very strange text blocks that look like some sort of post results but don't look normal. It could be some sort of list of sites that are good targets for this type of abuse.
Since adding my block, I am not getting nearly as many Google Alerts of that type anymore.
Has anyone else set up an alert for their domain name and seen this type of alert show up?