Blocking garbage requests

using htaccess to filter garbage requests

         

Avo19

10:38 pm on May 24, 2009 (gmt 0)

Hello,
I hope someone can help me out here.
I have a WordPress blog that is getting thousands of what I call "garbage requests" each day for non-existent pages. Some examples from my logs:

"GET /?query=administrator/components/com_virtuemart/export&command=search HTTP/1.1"
"GET /page/5/?query=class%3Dneww+target%3D_b...e%3DIm+neuen+Fenster&command=search HTTP/1.1"
"GET /?page=66/errors.php%3ferror=http://www.laurent-camping-cars.com//administrator/components/drivid.txt%250D%3f%3f HTTP/1.1"

I tried adding the following to .htaccess, but the requests still slip through and get a "200" response. Obviously I'm getting something wrong here.


RewriteCond %{REQUEST_URI} (/\\?query=).*$ [OR,NC]
RewriteCond %{REQUEST_URI} (/\\?page\/).*$ [OR,NC]
RewriteCond %{REQUEST_URI} (\&command=search)$ [OR,NC]
RewriteCond %{REQUEST_URI} (/\\?page=).*$ [NC]
RewriteRule .* - [F,L]

Any help appreciated.
cheers,
S

[edited by: Avo19 at 10:42 pm (utc) on May 24, 2009]

g1smd

11:05 pm on May 24, 2009 (gmt 0)

The parameters are not available in REQUEST_URI.

You need to look in QUERY_STRING or even at THE_REQUEST.
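
To illustrate the split described above, here is a minimal sketch in Python (not Apache itself) that breaks one of the logged request lines into the part that roughly corresponds to REQUEST_URI and the part that roughly corresponds to QUERY_STRING. The helper name `split_request_line` is made up for this example, and the mapping to Apache's variables is an approximation.

```python
from urllib.parse import urlsplit


def split_request_line(request_line):
    """Split a logged request line into (method, path, query_string).

    Hypothetical helper for illustration only; Apache does this
    parsing itself before mod_rewrite sees the request.
    """
    method, target, _protocol = request_line.split(" ", 2)
    parts = urlsplit(target)
    return method, parts.path, parts.query


method, path, query = split_request_line(
    "GET /?query=administrator/components/com_virtuemart/export"
    "&command=search HTTP/1.1"
)

print("path (compare REQUEST_URI):  ", path)
print("query (compare QUERY_STRING):", query)
```

The path carries no "?query=" text at all, which is why a pattern tested against REQUEST_URI never sees the parameters.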

Avo19

11:37 pm on May 24, 2009 (gmt 0)

Ok, tried this, but still no dice.

RewriteCond %{QUERY_STRING} (^query=).*$ [OR,NC]
RewriteCond %{QUERY_STRING} (^page\/).*$ [OR,NC]
RewriteCond %{THE_REQUEST} .*(\&command=search)$ [OR,NC]
RewriteCond %{QUERY_STRING} (^page=).*$ [NC]
RewriteRule .* - [F,L]

g1smd

4:43 pm on May 25, 2009 (gmt 0)

> RewriteCond %{QUERY_STRING} (^query=).*$ [OR,NC]

You might get better results with a syntax more like:

RewriteCond %{QUERY_STRING} &?query=([^&]+)&? [OR,NC]

jdMorgan

5:52 pm on May 25, 2009 (gmt 0)

It looks like you're trying to guess the syntax here, and that never works. Get the Apache mod_rewrite documentation, spend two hours reading all of it, and save yourself a lot of time, and quite possibly a lot of money. mod_rewrite errors, no matter how small, can easily destroy your search rankings, and that can be very expensive. It has put more than one company out of business...

RewriteCond %{REQUEST_URI} ^page/ [NC,OR]
RewriteCond %{QUERY_STRING} ^query= [NC,OR]
RewriteCond %{QUERY_STRING} (^|&)command=search(&|$) [NC,OR]
RewriteCond %{QUERY_STRING} ^page= [NC]
RewriteRule ^ - [F]

Omission of unnecessary character-escaping and of the trailing ".*$" sequences was intentional. [L] used with [F] is redundant.
The "(^|&)" and "(&|$)" groups mark the parameter boundaries and prevent unwanted matching, for example, on "newcommand=searches". "command" must either start the query string or follow an ampersand, and "search" must either end the query string or be followed by an ampersand.

Jim

[edited by: jdMorgan at 6:57 pm (utc) on May 25, 2009]
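
A quick way to check how a pattern treats parameter boundaries is to run it against a few sample query strings. The sketch below uses Python's re module as a stand-in for Apache's regex engine (an assumption; the two should agree on patterns this simple), with re.IGNORECASE standing in for [NC]. It compares the optional-ampersand form "&?command=search&?" with an explicit alternation, "(^|&)command=search(&|$)". Because an optional "&?" is allowed to match nothing, only the alternation rejects "newcommand=searches". The second sample query string is hypothetical.

```python
import re

# Optional ampersands: "&?" may match zero characters.
SOFT = re.compile(r"&?command=search&?", re.IGNORECASE)
# Explicit boundaries: start-of-string or "&" before, "&" or end after.
STRICT = re.compile(r"(^|&)command=search(&|$)", re.IGNORECASE)


def matches(pattern, query_string):
    """Unanchored search, as a RewriteCond pattern would perform."""
    return pattern.search(query_string) is not None


samples = [
    "query=_blogadata/include/struct_admin&command=search",  # from the logs
    "command=search&page=2",                                 # hypothetical
    "newcommand=searches",                                   # should not be blocked
]

for qs in samples:
    print(f"{qs!r}: soft={matches(SOFT, qs)} strict={matches(STRICT, qs)}")
```

Both patterns match the first two samples; they differ only on the third.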

Avo19

6:51 am on May 26, 2009 (gmt 0)

Thanks for that. I have read the docs, but coming to mod_rewrite as a newbie, much of it is lost on me, especially when I think I've written the correct expression and it doesn't work.
Example:
I'm looking to fail any request that contains "&command=search" in the query string.

From my little understanding of the docs and what you've written, this condition

"RewriteCond %{QUERY_STRING} &?command=search$ [NC]"

should satisfy the rule

"RewriteRule ^ - [F]"

for this string

"GET /?query=_blogadata/include/struct_admin&command=search HTTP/1.1"

and give a forbidden. But it doesn't, and I'm lost as to why not.

There are many references on the web to the "voodoo" of mod_rewrite, and that's how it appears to me at the moment.

jdMorgan

1:43 pm on May 26, 2009 (gmt 0)

> I'm looking to fail any request that contains "&command=search" in the query string.
>
> From my little understanding of the docs and what you've written, this condition
>
> "RewriteCond %{QUERY_STRING} &?command=search$ [NC]"
>
> should satisfy the rule

It should, and for that exact request it would. But by end-anchoring the pattern (with the trailing "$"), you have specified that the query string must end with "command=search". If the query string contains any additional parameters after it, it won't match that RewriteCond pattern, and the rule won't be applied.
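
To see the effect of the trailing "$", here is a minimal sketch that uses Python's re module as a stand-in for Apache's regex matching (an assumption; the two should behave the same for a pattern this simple). The "&lang=en" parameter is hypothetical, added only to show a query string with something after "command=search".

```python
import re

# re.IGNORECASE stands in for the [NC] flag.
END_ANCHORED = re.compile(r"&?command=search$", re.IGNORECASE)
UNANCHORED = re.compile(r"&?command=search", re.IGNORECASE)


def would_match(pattern, query_string):
    """Unanchored search, as a RewriteCond pattern would perform."""
    return pattern.search(query_string) is not None


logged = "query=_blogadata/include/struct_admin&command=search"
extended = logged + "&lang=en"  # hypothetical extra parameter

for qs in (logged, extended):
    print(
        f"{qs!r}: end-anchored={would_match(END_ANCHORED, qs)} "
        f"unanchored={would_match(UNANCHORED, qs)}"
    )
```

The end-anchored pattern matches the logged query string but stops matching as soon as another parameter follows, while the unanchored pattern matches both.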

When posting here, it's a good idea to include any and all rules that might affect the URLs you are trying to rewrite/redirect. And if you haven't already done so, test with a very simple rule instead of trying to write and test one big complicated rule (or a big pile of rules) all in one go. Divide and conquer, as it were... so that you know each 'piece' works before adding another level of complexity.

I like to initially test with something like


RewriteEngine on
RewriteRule ^foo\.html$ http://www.WebmasterWorld.com/ [R=301,L]

to make sure that mod_rewrite is available and is working. Request "/foo.html" from your server, and you should land back here.

Also, make sure that you completely flush your browser cache before each test, to avoid having stale cached results confuse the test results. If the browser cache isn't flushed, and the browser finds a cached entry for the URL you request, then it will serve that cached response, and no request will be sent to your server. If no request is sent to your server, then none of your server-side code can have any effect.

Jim