At present my htaccess file resembles:
RewriteEngine on
SetEnvIfNoCase Request_URI "^(.*)/$" valid-link=1
SetEnvIfNoCase Referer "(.*)mydomain\.com(.*)" valid-link=1
SetEnvIfNoCase Referer "(.*)mydomain\.co\.uk(.*)" valid-link=1
SetEnvIfNoCase REQUEST_URI "\.php$" valid-link=1
SetEnvIfNoCase REQUEST_URI "/public/" valid-link=1
<FilesMatch "\.*$">
order allow,deny
allow from env=valid-link
</FilesMatch>
RewriteCond %{HTTP_USER_AGENT} leech1 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} leech2 [NC]
RewriteRule /* http://www.mydomain.com/leech.html [L,R]
What I would actually like to have is:
Allow links to any php pages from any referrer OR
Allow links to any index folder* from any referrer OR
Allow links to all other files from my domain/my php pages only OR
Let the public folder be completely visible, even for hot-linkers
THEN
Block website downloaders
*To allow cases where people type www.mydomain.com/bob or www.mydomain.com/bob/ instead of www.mydomain.com/bob/index.php
If anyone could give me any advice as to how I might update my htaccess statements to this effect, or even point out any errors in what I have written so far, I would greatly appreciate it.
Welcome to WebmasterWorld!
Your code (and life) would probably be much simpler if you would choose either mod_access or mod_rewrite to accomplish what you need. As it is, you have mixed the two methods, which unnecessarily complicates things.
Please post specific questions, rather than asking for a code rewrite. We can help you get your code working, but can't write it for you -- See our charter [webmasterworld.com].
As to your question about ANDing conditions, this is easily done by omitting [OR] in RewriteCond directives -- the default multiple-RewriteCond behavior is AND.
Jim
I am looking to learn how I would put AND/OR SetEnvIf statements together, e.g. what I was thinking of was:
SetEnvIf ... "..." valid-link
SetEnvIf ... "..."
SetEnvIf ... "..." valid-link
SetEnvIf ... "..." valid-link
<FilesMatch "\.*$">
order allow,deny
allow from env=valid-link
</FilesMatch>
to give (line1 OR (line2 AND line3) OR line4), but that doesn't sound right for writing an AND, and I can't find an example of ANDing online.
I hadn't thought of using SetEnvIfs for the second part, but I think I'll wait until I can find a full list of commands before I try that.
You can use SetEnvIfs if you must, by employing negative logic, but mod_rewrite is more straightforward.
NOT ( (NOT A) OR (NOT B) ) is equivalent to A AND B
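A minimal sketch of that negative logic with SetEnvIf, assuming the goal is the same as in the mod_rewrite case: refuse the WGET user-agent when it is referred from example.com. The variable names not_a and not_b are made up for illustration, and the patterns are placeholders:

```apache
# Assume neither condition holds: set both "not" flags on every request.
SetEnvIf Request_URI "^" not_a=1 not_b=1
# A: the referrer starts with http://example.com, so clear not_a.
SetEnvIf Referer "^http://example\.com" !not_a
# B: the user-agent starts with WGET, so clear not_b.
SetEnvIf User-Agent "^WGET" !not_b

# Allow when (NOT A) OR (NOT B); everything else is denied.
Order Allow,Deny
Allow from env=not_a
Allow from env=not_b
```

With Order Allow,Deny, a request that matches no Allow line is denied, so access is refused only when both flags have been cleared, that is, when A AND B are both true.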
For an example using mod_rewrite, the following will return a 403-Forbidden response to the WGET user-agent referred from example.com, and only if a subdirectory is requested:
RewriteCond %{HTTP_REFERER} ^http://example.com
RewriteCond %{HTTP_USER_AGENT} ^WGET
RewriteRule ^.+/ - [F]
The references cited in our charter will lead you to the Apache documentation, where you will find documentation for all Apache directives.
Jim
Can you have several sets of RewriteConds followed by their own RewriteRule?
If you were to have a statement such as:
RewriteCond %{HTTP_REFERER} ^http://example1.com [OR]
RewriteCond %{HTTP_REFERER} ^http://example2.com
RewriteCond %{HTTP_USER_AGENT} ^WGET [OR]
RewriteCond %{HTTP_REFERER} ^http://example3.com
RewriteRule ^.+/ - [F]
Would that give you (line1 OR (line2 AND line3) OR line4)?
If that is the case, then to have (line1 AND (line2 OR line3 OR line4)), would that have to be written as several separate sets of statements? As you can tell, I'm trying to get a feel for how the logic works :)
RewriteCond %{HTTP_REFERER} ^http://example1.com [OR]
RewriteCond %{HTTP_REFERER} ^http://example2.com
RewriteCond %{HTTP_USER_AGENT} ^WGET [OR]
RewriteCond %{HTTP_REFERER} ^http://example3.com
RewriteRule ^.+/ - [F]
Would that give you (line1 OR (line2 AND line3) OR line4)?
No, I believe it would give you ((line1 OR line2) AND (line3 OR line4)). The [OR] flag is referred to as a "local OR": its scope is limited to the line it's on and the line that follows, so it binds more tightly than the implicit AND between conditions.
Frankly, I avoid such constructs, and I'd recommend testing it to find out. For purposes of clarity and ease of maintenance, it's often better to avoid large combinatorial logic fur-balls by breaking up the code using multiple RewriteRules. Until your .htaccess file exceeds 15kB or so, you won't notice any performance issues (unless you get 500,000+ hits per day).
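As a sketch of that approach, the combination (line1 OR (line2 AND line3) OR line4) can be written as three independent rule sets. Each group of RewriteConds applies only to the RewriteRule that immediately follows it. The hostnames are the same placeholders used in the question:

```apache
# line1: referred from example1.com
RewriteCond %{HTTP_REFERER} ^http://example1\.com
RewriteRule ^.+/ - [F]

# line2 AND line3: referred from example2.com and using WGET
RewriteCond %{HTTP_REFERER} ^http://example2\.com
RewriteCond %{HTTP_USER_AGENT} ^WGET
RewriteRule ^.+/ - [F]

# line4: referred from example3.com
RewriteCond %{HTTP_REFERER} ^http://example3\.com
RewriteRule ^.+/ - [F]
```

This is longer than a single rule, but each block can be read and tested on its own.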
Note that you can also do a "local AND." This is tricky, and you must keep in mind that we are using string comparison here and account for anchoring issues, but this works, and would give you (line1 OR (line2a AND line2b) OR line3):
RewriteCond %{HTTP_REFERER} ^http://example1.com [OR]
RewriteCond %{HTTP_REFERER}<->%{HTTP_USER_AGENT} ^http://example2\.com[^<]+<->WGET [OR]
RewriteCond %{HTTP_REFERER} ^http://example3.com
RewriteRule ^.+/ - [F]
This is not a well-known technique, so I'd suggest you document it well if someone might come along after you in maintaining the site.
Jim
Hopefully my last question: what has confused me is that if I write the following, it stops hotlinking, but if I then use a website downloader, the images and zip files are downloaded anyway:
RewriteCond %{HTTP_REFERER} !mysite\.com [NC]
RewriteCond %{REQUEST_URI} !\.php [NC]
RewriteRule ^.*$ http://www.mysite.com/blocked.html [L,R]
I had thought that the referer wouldn't be my URL and would be the program or a forged browser header (causing it to be blocked), but I am now guessing that is not the case? Unless some form of caching of the called webpages is helping the downloader work.
Take a look at your raw server logs and review legitimate and unwelcome accesses; reviewing the referrer and user-agent strings in the log will make all of this much clearer.
One more issue: You have specified an external redirect in your RewriteRule. Understand that this requires handshaking with -- and the cooperation of -- the requesting user-agent. I'd suggest you use an internal rewrite instead.
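A sketch of the internal-rewrite version, assuming blocked.html sits in the document root alongside this .htaccess file (that location is an assumption). With no scheme or hostname in the substitution and no R flag, the server delivers the page itself instead of asking the client to fetch it:

```apache
RewriteCond %{HTTP_REFERER} !mysite\.com [NC]
RewriteCond %{REQUEST_URI} !\.php [NC]
# Do not rewrite the block page itself, to avoid a rewrite loop.
RewriteCond %{REQUEST_URI} !^/blocked\.html$
RewriteRule .* /blocked.html [L]
```

If no explanatory page is needed, `RewriteRule .* - [F]` returns a 403-Forbidden response directly.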
Jim
Ah, I hadn't realised the redirect was external. That is useful.
[edited by: jdMorgan at 7:51 pm (utc) on Oct. 13, 2004]
[edit reason] speling [/edit]
One source of confusion that may be applicable here: HTTP is a "stateless" protocol. A single server request has no "memory" of any request that has gone before, and cannot have any effect on those that come after, unless some other method (such as cookies) is used to pass state information back to the browser to be included with the next request. Each request to your server, whether for a page, an image on that page, a client-side script, or a CSS style sheet, is a completely separate request. So, if you block blank referrers, and the request comes from an ISP that caches images (such as AOL), then the request for the page will work, but the request for the images on that page won't -- because AOL's cache will strip the referrer, and your site will then block AOL's image requests. I cite AOL as a well-known example only; many ISPs and many corporations use the same kind of caching proxy to minimize their bandwidth utilization.
The bottom line is that blocking by referer works only well enough to discourage most hot-linkers, and cannot be made 100% reliable without causing problems for innocent, legitimate users. There are other methods, such as blocking IP addresses, blocking by behaviour (using scripts), and blocking by user-agent that can be used to supplement referrer blocking. Combining all of these methods provides fairly good access control against 'casual' hot-linkers.
If you need 100% access control, then you'll need to use cookies, or cookies and sessions, or password-protection.
So, I strongly suggest that you don't try to block blank referrers. It simply causes too many problems for legitimate users.
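One way to let blank referrers through, sketched against the rule posted earlier in the thread (mysite.com is the placeholder from that post): add a condition so the block applies only when a referrer is actually present.

```apache
# Skip the block when the Referer header is blank or missing.
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !mysite\.com [NC]
RewriteCond %{REQUEST_URI} !\.php [NC]
RewriteRule .* - [F]
```

The three conditions are ANDed, so a request is refused only if it carries a referrer, that referrer is not the site itself, and the target is not a php page.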
Jim