RewriteCond %{HTTP_USER_AGENT} !site1 [NC]
RewriteCond %{HTTP_USER_AGENT} !site2 [NC]
RewriteRule !^(\favicon\.ico|403\.htm|robots\.txt) - [F]
However if I add
RewriteCond %{HTTP_USER_AGENT} site1 [NC]
RewriteCond %{REMOTE_ADDR} !#*$!.#*$!.#*$!.#*$!
RewriteRule !^(\favicon\.ico|403\.htm|robots\.txt) - [F]
all of a sudden that site is getting 403'd when it asks for the same file.
What am I doing wrong?
That would rather depend on what you are trying to do...
It's not clear from the code alone; for example, in your first ruleset you appear to be checking the requesting HTTP_USER_AGENT for a negative match with "site1". This implies that you wish to reject any browser or robot not named "site1" or "site2" (if it asks for a file not on your list).
Normally, one checks the HTTP_REFERER or REMOTE_HOST if checking for specific "site" names.
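If the intent is to act on where the request was referred from, the usual form looks something like this (a sketch only, assuming "site1" and "site2" are substrings of the referring domains):

# Reject requests referred from anywhere other than site1 or site2,
# unless one of the always-allowed files is requested
RewriteCond %{HTTP_REFERER} !site1 [NC]
RewriteCond %{HTTP_REFERER} !site2 [NC]
RewriteRule !^(favicon\.ico|403\.htm|robots\.txt) - [F]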
Then the additional code looks for a browser or robot named "site1" requesting any file not on your list, and rejects the request if it does not come from a computer at a specific IP address.
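In other words, if the intent was to allow "site1" only when it requests from one particular address, the conditions would read something like the following (a sketch; 192.0.2.1 is a documentation-range placeholder, not your real address):

# 403 any "site1" user-agent unless it comes from the expected address
RewriteCond %{HTTP_USER_AGENT} site1 [NC]
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.1$
RewriteRule !^(favicon\.ico|403\.htm|robots\.txt) - [F]

Note that consecutive RewriteConds are ANDed by default, so the [F] fires only when the user-agent matches and the address does not.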
Further, the regular-expression pattern "\favicon\.ico" has an unexplained and probably unnecessary regex escape character "\" preceding the "f".
It would therefore be helpful if you could describe, as precisely as possible, exactly how you want the user-agent, remote address, and file list to be used to control access.
Jim
The following very brief whitelist demonstrates the idea; see the Googlebot rule for a method of checking the IP address range.
# Skip all following rule(s) if globally-accessible files requested
RewriteRule ^(favicon\.ico|403\.htm|robots\.txt)$ - [L]
#
# Otherwise, check against user-agent whitelist
# Major SE robots
RewriteCond %{HTTP_USER_AGENT}<>{REMOTE_ADDR} !^Mozilla/[5-6]\.[0-9]+\ \(compatible;\ Googlebot/[2-3]\.[0-9];\ \+http://www\.google\.com/bot\.html\)<>66\.249\.
RewriteCond %{HTTP_USER_AGENT} !^(msnbot(-media|-News|-Products)?|MSNPTC)/[0-9]\.[0-9]
RewriteCond %{HTTP_USER_AGENT} !^Mozilla/[5-9]\.[0-9]+\ \(compatible;\ (Yahoo!\ )?Slurp;
RewriteCond %{HTTP_USER_AGENT} !^Gigabot
# Major browsers
RewriteCond %{HTTP_USER_AGENT} !^Mozilla/[4-5]\.[0-9]+\ \(compatible;\ MSIE\ [3-9]\.[0-9.]+
RewriteCond %{HTTP_USER_AGENT} !^Mozilla/[4-5]\.[0-9]+\ \(.+;\ rv:([0-9]+\.)+[0-9a-z]+\)\ Gecko/20[0-9]{6}
RewriteCond %{HTTP_USER_AGENT} !^Mozilla/[2-4]\.[0-9]+\ \[[a-z]{2}\](\ \(.+\))?
RewriteRule .* - [F]
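The Googlebot line above relies on a mod_rewrite idiom: a RewriteCond test string may concatenate several variables, so joining HTTP_USER_AGENT and REMOTE_ADDR with a delimiter that can occur in neither value lets one pattern verify both at once. Schematically (the expanded value shown is illustrative only):

# TestString:  %{HTTP_USER_AGENT}<>%{REMOTE_ADDR}
# expands to, e.g.:
#   Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)<>66.249.66.1
# The pattern anchors the user-agent at "^" and requires the address
# following "<>" to begin with "66.249." (Google's crawl range).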
Beware of line wrapping due to the display-width restrictions of the forum format; each RewriteCond pattern must be entirely on one line.
Casual readers are warned that the above code is an example only; it will block many, many legitimate requests, because the whitelists as shown are far too exclusive.
Jim