I am new to mod_rewrite and am trying to get to grips with it.
I am trying to block certain robots from retrieving pages of specific directories.
I have tried the following, which I did not write myself:
RewriteCond %{HTTP_USER_AGENT} ^NameOfBadRobot.*
RewriteCond %{REMOTE_ADDR} ^123\.45\.67\.[8-9]$
RewriteRule ^/info/somedirectory/.+ - [F]
But I cannot get it to work; it does nothing.
Can anyone see anything obvious that may be wrong?
Many thanks
Welcome from WebmasterWorld!
There is a lot of minor stuff wrong with this code, and maybe more. I'd suggest you check out the references cited in our forum charter [webmasterworld.com].
RewriteCond %{HTTP_USER_AGENT} ^NameOfBadRobot.*
RewriteCond %{REMOTE_ADDR} ^123\.45\.67\.[8-9]$
RewriteRule ^/info/somedirectory/.+ - [F]
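First, the trailing ".*" on the user-agent pattern isn't needed; a pattern with no end-anchor already matches whatever follows.
Next, if NameOfBadRobot is start-anchored with "^" as shown, then the user-agent string must *start* with exactly that name.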
Next, the RewriteRule will only be invoked if BOTH RewriteConds match. In other words, it will block that robot only if it comes from that remote IP address range -- Is that what you want? If not see the RewriteCond [OR] flag.
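For anyone who wants to check the logic away from the server, here is a rough Python sketch of how the two conditions combine. It only imitates the behaviour described above: the function name `blocked`, the `use_or` switch and the sample user-agents and addresses are made up for illustration, and Python's `re` stands in for Apache's regex engine (the two agree on patterns this simple).

```python
import re

# Patterns taken from the posted conditions; everything else is illustrative.
UA_PATTERN = re.compile(r"^NameOfBadRobot")
IP_PATTERN = re.compile(r"^123\.45\.67\.[89]$")

def blocked(user_agent, remote_addr, use_or=False):
    """Imitate consecutive RewriteConds: ANDed by default, ORed with [OR]."""
    ua_hit = UA_PATTERN.search(user_agent) is not None
    ip_hit = IP_PATTERN.search(remote_addr) is not None
    return (ua_hit or ip_hit) if use_or else (ua_hit and ip_hit)

# Robot from the listed address: both conditions match.
print(blocked("NameOfBadRobot/1.0", "123.45.67.8"))
# Robot from some other address: only the user-agent matches.
print(blocked("NameOfBadRobot/1.0", "10.0.0.1"))
# Same request, but with the conditions ORed together.
print(blocked("NameOfBadRobot/1.0", "10.0.0.1", use_or=True))
# Name present but not at the start of the string, so "^" rejects it.
print(blocked("Mozilla/5.0 (compatible; NameOfBadRobot)", "123.45.67.8"))
```

The last call shows why the start-anchor matters: a robot that buries its name inside a longer user-agent string is not caught by "^NameOfBadRobot".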
The character class [89] is equivalent to [8-9], since the numbers are contiguous. The shorter form is slightly faster.
Finally, the trailing "+" on the RewriteRule pattern isn't needed, either; you could just end the pattern with the period. (Note: Either way, the rule lets the index file at "/info/somedirectory/" be spidered, and blocks access only if at least one character follows "/info/somedirectory/". If you also want to block access to the index file, then remove that trailing period.)
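The effect of the trailing ".+" versus "." versus nothing can be checked with a small Python sketch. This is only an approximation of what mod_rewrite does: the variable names and the sample paths are invented, and the leading slash is kept as in the posted rule (in a per-directory .htaccess context the pattern is matched against the path without that leading slash).

```python
import re

# The three endings discussed for the RewriteRule pattern.
WITH_PLUS = re.compile(r"^/info/somedirectory/.+")
WITH_DOT = re.compile(r"^/info/somedirectory/.")
NO_TRAILER = re.compile(r"^/info/somedirectory/")

def matches(pattern, path):
    """True if the (unanchored-at-end) pattern matches the URL-path."""
    return pattern.search(path) is not None

for path in ("/info/somedirectory/",
             "/info/somedirectory/page.html",
             "/info/other/"):
    print(path,
          matches(WITH_PLUS, path),
          matches(WITH_DOT, path),
          matches(NO_TRAILER, path))
```

The ".+" and "." columns always agree, which is why the "+" is redundant; only the pattern with no trailing character also matches the bare directory URL.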
Fixing the minor stuff and leaving anchoring and robot specifics unanswered, we get:
RewriteCond %{HTTP_USER_AGENT} ^NameOfBadRobot
RewriteCond %{REMOTE_ADDR} ^123\.45\.67\.[89]$
RewriteRule ^/info/somedirectory/. - [F]
Jim
>>>I'd suggest you check out the references cited in our forum charter.<<<
OK will read that in detail
>>>Next, the RewriteRule will only be invoked if BOTH RewriteConds match. In other words, it will block that robot only if it comes from that remote IP address range -- Is that what you want?<<<
Yes that is what I wanted
>>>The character class [89] is equivalent to [8-9], since the numbers are contiguous. The shorter form is slightly faster.<<<
Sorry, I should have put, say, [3-9]; the intent was to block a specific range. But your point is interesting to know!
Many, many, thanks for this info, off to try it out and do some more reading!
RewriteCond %{HTTP_USER_AGENT} ^NameOfBadRobot
RewriteCond %{REMOTE_ADDR} ^123\.45\.67\.3[0-9]$
RewriteRule ^info/somedirectory/. - [F]
This blocks the IP range 123.45.67.30 to 123.45.67.39.
But how can I block a bigger range, say 123.45.60.x to 123.45.69.x?
I have tried putting 123.45.6[0-9] and leaving the last numbers off, but without the last ones it does not work. Do I have to add something?
Sorry to ask but I have read loads and can't find any alternatives and my brain really hurts...
Thanks.
RewriteCond %{REMOTE_ADDR} ^123\.45\.6[0-9]\.
The trailing "\." is not strictly required in this case; it is used to prevent ambiguity between, say, 123.45.10.0 and 123.45.100.255, both of which would match the pattern "^123\.45\.10".
Always keep in mind that mod_rewrite is doing a lexical compare, not a numerical evaluation; it's only looking at REMOTE_ADDR as a string of characters, not as numbers.
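A quick way to see the string-versus-number point is to try the patterns against a few addresses. The Python sketch below is only a stand-in for Apache's matching; the helper name `hits` and the test addresses are arbitrary choices for illustration.

```python
import re

def hits(pattern, addr):
    """True if the pattern matches the address, treated purely as text."""
    return re.search(pattern, addr) is not None

# Without the trailing "\." a longer third octet slips through.
loose = r"^123\.45\.10"
tight = r"^123\.45\.10\."
for addr in ("123.45.10.0", "123.45.100.255"):
    print(addr, hits(loose, addr), hits(tight, addr))

# The range pattern from this thread: third octet 60 through 69.
range_pat = r"^123\.45\.6[0-9]\."
for addr in ("123.45.60.1", "123.45.69.254", "123.45.6.1", "123.45.70.1"):
    print(addr, hits(range_pat, addr))
```

Only the first two addresses in the second loop match: "123.45.6.1" fails because the character after the 6 is a period rather than a digit, and "123.45.70.1" fails on the 7.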
Jim