Help understanding a rewritecond

     

rominosj

1:08 am on Dec 13, 2011 (gmt 0)

10+ Year Member



Hi,

I found this RewriteCond on a blog. It is supposed to help me block bad bots or agents, and I would like some help understanding it:

=======================
RewriteCond %{HTTP_REFERER} ^$ [NC]
RewriteCond %{HTTP_USER_AGENT} ^$ [NC]
RewriteRule .* - [F]
=======================

What does the % sign do in RewriteCond?
And what does the dash (-) after .* in the RewriteRule do?

Thanks!

SteveWh

1:43 am on Dec 13, 2011 (gmt 0)

5+ Year Member



%{NAME_OF_VARIABLE} is the standard way of referencing the predefined server variables: [httpd.apache.org...]

The dash (where the URI of the substitution page would usually be) is a placeholder meaning "no substitution": there is no page to rewrite to, so the requested URL is left unchanged. That line is the standard format for a Forbidden response.

The effect of your code is to give a 403 Forbidden response if both the referer and user-agent are blank, no matter what the requested page was.

In this case, because no literal text is being matched, the two [NC] flags (No Case = case-insensitive) are unnecessary.
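
Putting that together, here is the same rule with the [NC] flags dropped and a comment on each line. Treat it as a sketch only: the RewriteEngine line and the assumption that this lives in .htaccess or a <Directory> block are mine, not from the blog you found.

```apache
# Assumption: mod_rewrite is loaded and this is in .htaccess or a <Directory> block
RewriteEngine On

# True when the Referer header is missing or empty
RewriteCond %{HTTP_REFERER} ^$
# Consecutive conditions are ANDed: also true when the User-Agent is missing or empty
RewriteCond %{HTTP_USER_AGENT} ^$
# ".*" matches any requested path, "-" means no substitution, [F] returns 403 Forbidden
RewriteRule .* - [F]
```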

lucy24

3:20 am on Dec 13, 2011 (gmt 0)

WebmasterWorld Senior Member, Top Contributor of All Time



I'd modify both lines to

^-?$

because not every blank referer or UA is truly empty. The access log records a missing header as a single hyphen, and some clients send a literal hyphen, just like the one in your Rule. The optional hyphen covers both cases.
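
Applied to the rule in the first post, that change would look roughly like this. It is a sketch, not something tested on your server; everything except the two patterns is copied from the original.

```apache
# "-?" makes a single hyphen optional, so both "" and "-" match
RewriteCond %{HTTP_REFERER} ^-?$
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* - [F]
```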

Has anyone ever met a blank UA that took the trouble to send a referer? You can achieve pretty much the same thing in half the server time by only excluding blank UAs.
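
If you go that route, the rule shrinks to a single condition. Again just a sketch, assuming requests with a blank UA are the only ones you want to refuse:

```apache
# Only the User-Agent is tested; the Referer condition is dropped entirely
RewriteCond %{HTTP_USER_AGENT} ^-?$
RewriteRule .* - [F]
```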

Now then. Some people will disagree with this, but I like to constrain rules as tightly as possible. The chances of a robot swinging by "cold" and demanding all your image files are, for most sites, vanishingly small. So you can express the "pattern" part of the Rule as, for example,

\.html$

(substituting your actual filename extension) instead of .* with some further tweaks, depending on whether directory indexes have already been added before the request reaches mod_rewrite. If they haven't, you also have to allow for requests ending in a slash, and for empty requests, which in mod_rewrite-speak means your front page.

The idea is that if someone is asking for pictures and stylesheets, they are either an authorized robot or someone who has already been allowed to land on a page, so there is no need to make the server continue evaluating conditions.
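
As a rough sketch of such a constrained rule: this assumes it sits in .htaccess (where the leading slash is stripped before matching, so the front page arrives as an empty path), that your pages end in .html, and that the directory index has not yet been added to the request. All three are assumptions you would need to check against your own setup.

```apache
RewriteCond %{HTTP_USER_AGENT} ^-?$
# Pattern matches three kinds of request:
#   ^$        empty path (the front page, in .htaccess context)
#   /$        a directory request ending in a slash
#   \.html$   an actual page
# Requests for images, stylesheets etc. never reach the condition above
RewriteRule (^|/|\.html)$ - [F]
```

Because the Rule pattern is evaluated before the conditions, anything that fails the pattern is passed over without the server looking at the User-Agent at all.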

g1smd

8:21 am on Dec 13, 2011 (gmt 0)

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



I do have a situation where a bot with a blank UA and blank referrer comes back to the site several times per day and attempts to pull just one image more than a dozen times in under a minute. It's been doing it for years. Total mystery.
 
