Problems With Rewrite

Trying to protect images from hotlinking while still allowing search engine caches to access them

old_expat

5:14 am on Aug 22, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello,

I am hoping that someone can help me out. Basically, I want to prevent sites other than major SEs from hot linking to my images.

Online communities seem to really like my maps, and every time one of those pages gets viewed, I get zapped with 15K - 60K of bandwidth. And they never include a link. There is also enough of it to have a minor effect on my stats.

I got the image protection rewrite portion of this script over on Webmaster General, but thought subsequent discussion might be better here.

I already had the HTTP_HOST rewrite in place, then added the HTTP_REFERER conditions.

I added the script to my .htaccess file and looked at my Google cache .. no images. So I figure it must be doing the first part of the job (preventing hot linking), but it's excluding at least 1 SE (Google).

Here is my script:

<*> - I fixed the broken pipes in this line when I pasted it into my file. I didn't see any anywhere else.

RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.mysite\.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*) [mysite.com...] [L,R=301]

RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !^http?://(www\.)?mysite\.com [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?google\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?yahoo\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?msn\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?wisenut\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?gigablast\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?ask\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?aol\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?alltheweb\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?hotbot\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?teoma\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?altavista\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?looksmart\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?lycos\.[^/]+ [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^\?]+)?netscape\.[^/]+ [NC]
<*>RewriteRule \.(jpe?g¦png¦gif)$ /myad.gif [NC,L]

Help really appreciated.

jdMorgan

1:07 pm on Aug 22, 2005 (gmt 0)

Just add:

RewriteCond %{HTTP_REFERER} !^http://216\.239\.(3[2-9]¦[45][0-9]¦6[0-3])\..*www\.mysite\.com

All that you need to do is to look up unexpectedly-blocked accesses (such as this Google cache problem) in your raw access log, identify the referrer, and if it's an IP address, then look up the entire range for that IP address in ARIN, and construct a RewriteCond to cover that case.

The bit at the end including your site's domain is used to be sure that the G cache is being used to view a cache of your site, and not someone else's (who hotlinked to your site).

Change the broken pipe "¦" characters above to solid pipes before use.

Jim
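
For readers following along, here is a hedged sketch of the condition above written with solid pipes, with comments showing how the third-octet alternation covers the address range. The domain `mysite.com` is the thread's placeholder, and the comments are an editorial illustration rather than part of the original post.

```apache
# Sketch only: the Google-cache exception with solid pipes.
# "mysite.com" is the placeholder domain used throughout the thread.
#
# Third-octet alternation, piece by piece:
#   3[2-9]     matches 32 through 39
#   [45][0-9]  matches 40 through 59
#   6[0-3]     matches 60 through 63
# Together they cover 216.239.32.x through 216.239.63.x.
#
# The trailing ".*www\.mysite\.com" requires the site's own hostname to
# appear later in the referrer, so only cached copies of this site pass.
RewriteCond %{HTTP_REFERER} !^http://216\.239\.(3[2-9]|[45][0-9]|6[0-3])\..*www\.mysite\.com
```

The line sits alongside the other negated `HTTP_REFERER` conditions, before the `RewriteRule` they guard.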

old_expat

4:29 pm on Aug 22, 2005 (gmt 0)

Hello JDM,

I put the code in and got the following error message

RewriteCond: bad argument line '%{HTTP_REFERER}!^http://216\\.239\\.(3[2-9]¦[45][0-9]¦6[0-3])\\..*www\\.mysite\\.com'\n

Sorry, I just don't understand it. Sorry to be so dense.

Thanks anyhow.

jdMorgan

4:33 pm on Aug 22, 2005 (gmt 0)

Well, I took that off a live server, so it should work -- with the caveat that you noticed the instructions about broken pipes in my post above. That line should be added before or after the other similar RewriteConds in your code.

Jim

old_expat

8:55 pm on Aug 22, 2005 (gmt 0)

Hi jd,

Added it as the last line .. broken pipes fixed. Maybe I'm just snake bit.:(

jdMorgan

9:06 pm on Aug 22, 2005 (gmt 0)

Do you have a space between "}" and "!" in that line? My post shows it, but the error message you posted does not.

Jim
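
Some background on why the space matters: `RewriteCond` expects its TestString and CondPattern as two separate whitespace-delimited arguments. The sketch below contrasts the two forms; it may be one way a "bad argument line" error arises, though it is not a confirmed diagnosis of the problem in this thread.

```apache
# Parses: "%{HTTP_REFERER}" is the TestString and "!^http://216\.239\."
# is the CondPattern, separated by a space.
RewriteCond %{HTTP_REFERER} !^http://216\.239\.

# Shown commented out because it is expected to fail: without the space,
# the whole string is read as a single argument and no CondPattern remains.
# RewriteCond %{HTTP_REFERER}!^http://216\.239\.
```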

jdMorgan

12:41 am on Aug 23, 2005 (gmt 0)

One more thing (this was just raised in another thread): you'll need to exclude your "special image" from being rewritten by adding another RewriteCond:

RewriteCond %{REQUEST_URI} !^/myad\.gif$

Otherwise, your code will loop.

Jim
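
As a hedged illustration of where that exclusion fits, the sketch below places it in a shortened version of the hotlink ruleset. It assumes the thread's placeholder names (`mysite.com`, `/myad.gif`), uses solid pipes, and swaps in `https?` so that both schemes match; these are editorial choices, not the poster's live configuration.

```apache
RewriteEngine on

# Only act on requests that actually send a referrer.
RewriteCond %{HTTP_REFERER} .
# Allow the site itself. "https?" accepts http:// and https://; the
# thread's "http?" makes the final "p" optional instead.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mysite\.com [NC]
# Allow one search engine as an example; add others the same way.
RewriteCond %{HTTP_REFERER} !^https?://([^/\?]+)?google\. [NC]
# Do not rewrite the replacement image itself, so a request for
# /myad.gif is never sent back through this rule.
RewriteCond %{REQUEST_URI} !^/myad\.gif$
RewriteRule \.(jpe?g|png|gif)$ /myad.gif [NC,L]
```

The exclusion matters because the substitute image also ends in `.gif`, so without it the rule's own output matches the rule's pattern.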

old_expat

5:46 am on Aug 25, 2005 (gmt 0)

"Do you have a space between "}" and "!" in that line? My post shows it, but the error message you posted does not."

yes, the space was there in the .htaccess file

I don't know what happened with the error message .. and showing all the extra '/' and '\' characters.

Maybe something weird .. I'm on a new server.

jdMorgan

8:11 pm on Aug 25, 2005 (gmt 0)

Let's try a simplified version of the second ruleset:

RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !^http?://(www\.)?mysite\.com [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?google\. [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?yahoo\. [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?msn\. [NC]
RewriteRule \.(jpe?g¦png¦gif)$ - [F]

This will simply return a 403-Forbidden in response to any hotlinking request, and result in a broken-image icon displayed by the browser.

Flush your browser cache before and while testing any changes to your access-control code.

Jim

old_expat

5:15 am on Aug 26, 2005 (gmt 0)

Hi jdMorgan,

With this one, the images didn't show on google.

RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.mysite\.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*) [mysite.com...] [L,R=301]

RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !^http?://(www\.)?mysite\.com [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?google\. [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?yahoo\. [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?msn\. [NC]
RewriteRule \.(jpe?g¦png¦gif)$ - [F]

This was showing in the URL window when the page cache was being displayed

[64.233.187.104...]

So I added the extra line and it didn't work again. Maybe I formatted the extra line wrong?

RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.mysite\.com [NC]
RewriteCond %{HTTP_HOST} !^$
RewriteRule ^(.*) [mysite.com...] [L,R=301]

RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !^http?://(www\.)?mysite\.com [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?google\. [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?64.233.187.104\. [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?yahoo\. [NC]
RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?msn\. [NC]
RewriteRule \.(jpe?g¦png¦gif)$ - [F]

And I changed the broken pipes.

jdMorgan

5:27 am on Aug 26, 2005 (gmt 0)

The pattern has to match the referrer -- maybe time to hit the books on regular expressions? (Link in our forum charter)

RewriteCond %{HTTP_REFERER} !^http?://64\.233\.187\.104 [NC]

But they may not use that IP to fetch the image, only to get to the server that fetches it. I think you may find a different IP address range in your server access log along with those image requests from the cache; something in the 216.239 block:

RewriteCond %{HTTP_REFERER} !^http://216\.239\.(3[2-9]¦[45][0-9]¦6[0-3])\.

Also, this is for page cache only. Googlebot-image spiders from a different range, too.

Jim
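
To make the "pattern has to match the referrer" point concrete, here is a hedged side-by-side of the two IP conditions from this exchange. It assumes the cache page's referrer began with `http://64.233.187.104/` (the address was truncated in the post, so the rest of the URL is unknown).

```apache
# Assumed referrer prefix: http://64.233.187.104/

# Shown commented out because it did not match: the trailing "\." demands
# a literal dot right after "104", but under the assumption above the next
# character is "/". The unescaped dots inside the address also match any
# character, which is looser than intended.
# RewriteCond %{HTTP_REFERER} !^http?://([^/\?]+)?64.233.187.104\. [NC]

# Matches: every dot is escaped and nothing is required after "104".
RewriteCond %{HTTP_REFERER} !^http?://64\.233\.187\.104 [NC]
```

Because the condition is negated with `!`, a pattern that fails to match leaves the condition true, and the request is still blocked.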

old_expat

11:24 am on Aug 26, 2005 (gmt 0)

*woof* *woof*

That works, Jim! I tested it on one of my other sites as well. I really *truly* do appreciate the way you hung in there with me.

"maybe time to hit the books on regular expressions? (Link in our forum charter) "

I will .. right after I move 12 websites to a new host .. probably sometime this weekend/Monday. I'm sure I'll get plenty of practice doing my 301 redirects plus these image protection rules. This experience has made me realize I'm not *even* ready to have a dedicated server.

As for Google images, I have my doubts whether that traffic is traffic that I want.

As time permits I will look up the appropriate IPs for other SE cache servers.

Again, Jim .. thanks big time!

dave

old_expat

11:29 am on Aug 26, 2005 (gmt 0)

"All that you need to do is to look up unexpectedly-blocked accesses (such as this Google cache problem) in your raw access log, identify the referrer, and if it's an IP address, then look up the entire range for that IP address in ARIN, and construct a RewriteCond to cover that case."

And I'm going to learn how to do this as well!
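
Looking up blocked referrers this way depends on the access log recording the Referer header. As a rough sketch, the standard "combined" format does so. These directives belong in the main server or virtual-host configuration, not in .htaccess, so on shared hosting they are usually under the host's control. The log path below is a hypothetical example.

```apache
# Sketch only: server or virtual-host context, not .htaccess.
# "combined" adds the Referer and User-Agent headers to each log entry.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

# Hypothetical log location; use whatever path the server already has.
CustomLog "logs/access_log" combined
```

With that in place, each blocked image request shows its status code next to the referrer that triggered it, which is the value a new RewriteCond pattern has to match.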