
Apache Web Server Forum

    
putting an end to hotlinking
techniques discussed earlier
dhaliwal

10+ Year Member



 
Msg#: 4541396 posted 1:28 pm on Feb 1, 2013 (gmt 0)

Hi Everyone

This is about hotlinking of images. The full discussion took place in the following thread, but I couldn't ask there because the thread is closed.

[webmasterworld.com...]

I have used the following code, but it doesn't work for me.
The image doesn't load if it's hotlinked.


# Mod_Rewrite below
RewriteEngine on
# If the referrer is NOT your site or allowed sites (below)
RewriteCond %{HTTP_REFERER} !^http://(www\.)?(yourownsite|otherexemptsite)\.com [NC]
# If the referrer is NOT empty (below)
RewriteCond %{HTTP_REFERER} !^$
# If the requested file name ends in .jpg, .jpeg, .gif, .png
# And the preceding conditions are true serve the .php file
# in place of the image.
RewriteRule \.(jpe?g|gif|png)$ http://www.example.com/the-php-file.php [NC,L]


the-php-file.php

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="window-target" content="_top">
<title>Redirected</title>

<script type="text/javascript">
// If this page is loaded inside a frame, break out of it; otherwise go to the home page.
if (top != self) top.location.replace(location);
else { location.replace('http://www.example.com/'); }
</script>
</head>
<body>
<a href="http://www.example.com/" target="_top">This is a hotlinked image. Click to visit the site and see the image.</a>
</body>
</html>

 

SevenCubed

WebmasterWorld Senior Member



 
Msg#: 4541396 posted 4:04 pm on Feb 1, 2013 (gmt 0)

I can't comment on the JavaScript aspect, but for the .htaccess (or wherever you apply it: httpd.conf, vhosts.conf) I've had this in place for a few years, and it works as long as the browser sends the referrer.

RewriteEngine on
RewriteCond %{HTTP_REFERER} .
RewriteCond %{HTTP_REFERER} !^http://(www\.)?example\.com [NC]
RewriteRule \.(jpe?g|gif|png)$ - [NC,F]

You also might want to consider changing your output message to something more user-friendly. A non-technical person might misread that "hot-linking" statement as something ominous and conclude they should not visit.

"This image is only available on the website of origin" would probably work.

topr8

WebmasterWorld Senior Member topr8 us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 4541396 posted 5:41 am on Feb 2, 2013 (gmt 0)

For code within a web page, when you call an image like this, e.g.
<img src='http://www.example.com/hotlinkimage.jpg'>

the browser is expecting a stream of data that represents the image, so that it can draw it on the page. If it gets JavaScript instead, that makes no sense to it, and it will just show a blank image.
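
One way to act on that, as a rough sketch rather than a tested recipe: rewrite hotlinked requests internally to a small substitute image, so the browser still receives image data. The file name /hotlink-notice.png is a made-up placeholder, and example.com stands in for your own domain.

```apache
RewriteEngine on
# Only act when a referrer is present...
RewriteCond %{HTTP_REFERER} .
# ...and it is not your own site (example.com is a placeholder)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com [NC]
# Leave the substitute image itself alone so it is never rewritten again
RewriteCond %{REQUEST_URI} !^/hotlink-notice\.png$
# Serve the placeholder image (hypothetical file) in place of the one requested
RewriteRule \.(jpe?g|gif|png)$ /hotlink-notice.png [NC,L]
```

Because the substitution is a local path, this is an internal rewrite and not a redirect, so the response to the <img> request is still an image.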

lucy24

WebmasterWorld Senior Member lucy24 us a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4541396 posted 6:23 am on Feb 2, 2013 (gmt 0)

Redirects won't work on images called via <img src... (I experimented recently.) They only work on <a href...

The quoted code

RewriteRule \.(jpe?g|gif|png)$ http://www.example.com/the-php-file.php [NC,L]

is a redirect even though it doesn't carry the [R] flag. When you give the complete protocol and hostname, it is equivalent to [R=302]. If you absolutely must give a domain name, you are in Proxy Pass-Through territory. But I really doubt that's what you need. You need less, not more.
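
For comparison, here is a sketch of the two forms. One caveat, as far as I read the mod_rewrite documentation: when the hostname in an absolute URL matches the current host, mod_rewrite strips the scheme and hostname and treats the rest as a local path, so it is a different hostname that produces the implicit redirect. The hostname other.example.net is a placeholder, and the two rules are alternatives: use one or the other, not both.

```apache
# Form 1: absolute URL on another host (other.example.net is a placeholder).
# Even without an [R] flag, the client receives an external redirect (302 by default).
RewriteRule \.(jpe?g|gif|png)$ http://other.example.net/the-php-file.php [NC,L]

# Form 2: local URL-path.
# This is an internal rewrite; the client never sees a redirect.
RewriteRule \.(jpe?g|gif|png)$ /the-php-file.php [NC,L]
```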

Whether you can rewrite an individual image to a page of any kind is a whole nother question. I suspect you can't do it anyway, so you will need to approach the issue from a different angle.

Catia



 
Msg#: 4541396 posted 7:37 am on Feb 3, 2013 (gmt 0)

According to my Apache logs, the new Google Images does not show a referrer. The field is just "-" meaning no data. I think this is a change. I'll have to go dig up some old logs, but I think it used to show up as Google/imgres or something like that. But given this fact, I don't see any way that these rewrite conditions are gonna block Google Images from hotlinking.

The only way I've been able to make it work is by blocking all blank referrers, but I think this can also block legitimate users, as some security schemes leave the referrer blank. Anybody have more details on that?

TheMadScientist over on the Google Images thread suggested that there was some way for mod_rewrite to parse the IP address and figure out that the request was sent from Google even though the referrer is blank. But, in my logs at least, the only IP that appears is from the computer doing the browsing... ie: the IP identifies me, not Google.

So unless mod_rewrite can access data that does not appear in the Apache logs (anybody know?) I can't figure out how you could possibly use htaccess to block google images specifically.

That's all I've got.

Catia



 
Msg#: 4541396 posted 7:53 am on Feb 3, 2013 (gmt 0)

p.s. I was sorta assuming that the point of this hotlink protection discussion (given the timing) is to protect from Google Images hotlinking the image. Sorry if I'm off topic here...

lucy24

WebmasterWorld Senior Member lucy24 us a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4541396 posted 9:27 am on Feb 3, 2013 (gmt 0)

The problem with the IP aspect is that when Image Search hotlinks an image, the IP that comes through isn't Google's own IP, it's the human user's. Google itself won't show up in your logs at all. Some people might even want to do the opposite: allow the search engine itself-- so it can cache the image-- but block unknown quantities who arrive without a referer.
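
A minimal sketch of that "opposite" approach, assuming the crawlers identify themselves in the User-Agent header. The crawler names listed are only illustrative, a User-Agent is trivially faked, and this will also refuse ordinary visitors whose browser or proxy strips the referer, so treat it as an experiment.

```apache
RewriteEngine on
# Request arrived with no referer at all
RewriteCond %{HTTP_REFERER} ^$
# ...and does not identify itself as a known crawler (illustrative list)
RewriteCond %{HTTP_USER_AGENT} !(Googlebot|bingbot) [NC]
# Refuse the image with 403 Forbidden
RewriteRule \.(jpe?g|gif|png)$ - [NC,F]
```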

"unless mod_rewrite can access data that does not appear in the Apache logs"

The RewriteCond options include loads of stuff that doesn't come through in logs, including just about any aspect of the request header:

[httpd.apache.org...] (if you're lucky enough to be on 2.4 instead of 2.2, just change one digit in your address bar)

%{HTTP:header}, where header can be any HTTP MIME-header name, can always be used to obtain the value of a header sent in the HTTP request. Example: %{HTTP:Proxy-Connection} is the value of the HTTP header ``Proxy-Connection:''.
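
As a hedged illustration of the %{HTTP:header} form: the sketch below tests the Accept request header and only sets an environment variable, so it blocks nothing. The choice of header, the pattern, and the variable name html_accept are all assumptions made for the example; whether Accept actually distinguishes hotlinked requests is untested.

```apache
RewriteEngine on
# %{HTTP:Accept} expands to the value of the request's Accept header.
# Hypothetical test: the Accept header starts with text/html.
RewriteCond %{HTTP:Accept} ^text/html [NC]
# No substitution ("-"); just set the (made-up) variable html_accept to 1
RewriteRule \.(jpe?g|gif|png)$ - [NC,E=html_accept:1]
```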


If you go step by step through Image Search, the first two bits to show up in your logs are:

67.122.211.163 - - [03/Feb/2013:01:01:00 -0800] "GET /paintings/rats/blowups/largesnocone.jpg HTTP/1.1" 200 376 "http://www.google.com/blank.html" "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.10.289 Version/12.01"

This is the preliminary stage, where you get a screenful of variously cached or hotlinked images. I deliberately searched for something obscure so it wouldn't be likely to be cached from an earlier search. I have no idea what the 376 represents: it obviously isn't the filesize, and in fact looks more like what you'd get in a HEAD request.

That step is followed by

67.122.211.163 - - [03/Feb/2013:01:01:54 -0800] "GET /paintings/rats/blowups/largesnocone.jpg HTTP/1.1" 200 2980 "http://www.google.com/imgres?imgurl=http://www.example.com/paintings/rats/blowups/largesnocone.jpg&imgrefurl=http://www.example.com/paintings/rats/snocone.html&h=378&w=504&sz=39 {et cetera}

(The 2980 is still not the real filesize, but it's the result of an intentional rewrite on my part.)

I don't know what the request header for an image file looks like; I can barely spell my way through headers for ordinary pages. But someone else will know :)
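
One way to find out, sketched under the assumption that you can edit the server or virtual-host configuration: write selected request headers for image requests to a separate log. The log path, the format nickname imgheaders, and the variable is_image are placeholders chosen for this example.

```apache
# Server or virtual-host config only; the logging directives are not allowed in .htaccess.
# Flag image requests so that only they are written to the extra log.
SetEnvIfNoCase Request_URI "\.(jpe?g|gif|png)$" is_image
# Usual fields plus the Referer, User-Agent and Accept request headers.
LogFormat "%h %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Accept}i\"" imgheaders
# logs/image_headers.log is a placeholder path, relative to ServerRoot.
CustomLog logs/image_headers.log imgheaders env=is_image
```

Any other request header can be added to the format string the same way, with %{Header-Name}i.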

helleborine

10+ Year Member



 
Msg#: 4541396 posted 2:45 pm on Feb 3, 2013 (gmt 0)

This is the referrer I get in my logs:

000.000.000.000 - - [02/Feb/2013:09:26:02 -0600] "GET /image.jpg HTTP/1.1" 302 221 "http://images.google.com/search?hl=en&site=&tbm=etc."

Hence

RewriteCond %{HTTP_REFERER} ^http://.*images\.google.*$ [NC]
RewriteRule .*\.(jpe?g|gif|bmp|png)$ http://mywebsite.com [R]


However, Google substitutes the missing image with a blown-up thumbnail.

Catia



 
Msg#: 4541396 posted 5:27 pm on Feb 3, 2013 (gmt 0)

Hmmmm... very interesting. Sooo, I tried with different browsers. I cleared the cache before each one. The zeros are just an example... there's a real IP there.

If I do a google image search on one of my sites I get the following when using Chrome:

00.000.00.00 - - [03/Feb/2013:12:03:13 -0500] "GET /Pictures/nature/PineNeedlesSnow2.JPG HTTP/1.1" 200 49008 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17"

and when clicking on view original, the only entry is for favicon:

00.000.00.00 - - [03/Feb/2013:12:16:03 -0500] "GET /favicon.ico HTTP/1.1" 404 466 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.57 Safari/537.17"

In Firefox I get this, and no entry when clicking on view original:

00.000.00.00 - - [03/Feb/2013:12:17:34 -0500] "GET /cooking/KitchenScene.JPG HTTP/1.1" 200 13623 "-" "Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20100101 Firefox/14.0.1"

BUT... If I use Internet Explorer I get this:

97.118.91.29 - - [03/Feb/2013:12:09:35 -0500] "GET /cooking/MacNCheeseFancy.jpg HTTP/1.1" 200 35403 "http://www.google.com/blank.html" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

followed by this if I click on view original image:

00.000.00.00 - - [03/Feb/2013:12:12:43 -0500] "GET /cooking/MacNCheeseFancy.jpg HTTP/1.1" 200 35403 "http://www.google.com/url?sa=i&rct=j&q=site%3Aexample.com&source=images&cd=&docid=90m4oaipR4GhwM&tbnid=nBS5yY36dxzeOM:&ved=0CAIQjBw&url=http%3A%2F%2Fwww.example.com%2Fcooking%2FMacNCheeseFancy.jpg&ei=0ZkOUbX0A8qDyAHPsYHgCw&bvm=bv.41867550,d.aWc&psig=AFQjCNF0jR1QLe4EDRiejSDG4B2VOt3nmA&ust=1359997750423096" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

Sooooo... it's looking like what shows up in the Apache logs is dependent on which browser you're using.

Which browsers are you guys using?

In any case, it's sounding like a comprehensive solution is gonna be a tad bit more involved - I'll try to plow through the Apache documentation, and see if I can figure out how to pull the relevant info from the header file... but it may take a while for me to translate all that technospeak into things my brain can comprehend!

BTW Lucy, the second number (in your case 376) is indeed supposed to be the number of bytes sent in the response, not counting headers, although that does sound a bit small for an image.

[edited by: Catia at 5:51 pm (utc) on Feb 3, 2013]

Catia



 
Msg#: 4541396 posted 5:35 pm on Feb 3, 2013 (gmt 0)

p.s. My server is running Apache 2.2.15 - don't know if that has any impact on all this or not

helleborine

10+ Year Member



 
Msg#: 4541396 posted 6:01 pm on Feb 3, 2013 (gmt 0)

Indeed, I get this referer:

http://images.google.com/
only for "Mobile/10B141 Safari/8536.25"

I can't seem to pick up Google Images for any other OS.

lucy24

WebmasterWorld Senior Member lucy24 us a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



 
Msg#: 4541396 posted 11:01 pm on Feb 3, 2013 (gmt 0)

Are you logged in while searching? "images.google.com" should be the equivalent of "www.google.com" which is what you get instead of a complete referer string if the searcher is logged in. But you only get this part when the user actually goes to your site.

When you do your experimenting, pause about ten seconds between each stage so they're clearly distinct in logs. And look for the point in your logs when the whole page loads up. (I did my testing in gallery pages where it's just images, page, css each time.)

Make sure you're searching for something that isn't #1 in the results, because that one may be pre-loaded and will skew what you see in logs.

For fine-tuned experimenting you'd have to set images to no caching at all, meaning that if the user requests the same picture three times in a row there will be three separate server requests. Assuming, ahem, a compliant browser ;) (In other words, do all experimenting in something other than MSIE or-- for different reasons-- Chrome. Firefox is probably a good default.) Obviously this is not something you want to keep enabled permanently.
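
A possible way to switch caching off for the duration of a test, assuming mod_headers is available. This is a sketch, and the extension list is only an example.

```apache
<IfModule mod_headers.c>
    # Match common image extensions, case-insensitively
    <FilesMatch "(?i)\.(jpe?g|gif|png)$">
        # Ask browsers and proxies not to store or reuse the response
        Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
    </FilesMatch>
</IfModule>
```

Remove the block again once the testing is done, since every image view then costs a full request.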

zerillos

5+ Year Member



 
Msg#: 4541396 posted 11:45 am on Jun 4, 2013 (gmt 0)

Here's an observation that might help. Visitors coming from google.com/blank.html and Bing's image search trigger a temporary block from my csf firewall.

I don't know the reason why, but every visitor coming from these two locations seems to be sending UDP packets that get them blocked by my firewall.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved