how to block image-hotlinking in new context (?)

methods, successful in past, not working now

         

stapel

9:14 pm on Jun 3, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just discovered a page which copied (that is, "scraped") my original content and is hotlinking my original images.

I've had hotlink blocking in my .htaccess file for some time:

################################################################## 
# to allow certain white-hat parties through the hotlink
# protection
##################################################################
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^111\.222\.333\.444.*$ [NC]
RewriteCond %{HTTP_REFERER} !^www.good_domain.com/.*$ [NC]
...etc...
RewriteRule .*\.(jpg|gif)$ /hotlink.png [R,NC]
##################################################################

I've tried adding a RewriteCond for the specific domain and IP address:

################################################################## 
# to block scraper using Amazon 'viewfoo' service
##################################################################

RewriteCond %{HTTP_REFERER} ^viewfoo\.com/.*$ [NC,OR]
RewriteCond %{HTTP_REFERER} ^54\.67\.49\.123.*$ [NC]
RewriteRule .*\.(jpg|gif)$ /hotlink.png [R,NC]

##################################################################

...but my images are still showing up on the scraper's page.

The scraper's calls to my own server for my images are contained, along with the scraped text, within the following tags:

<div class="box-content document_holde ajbox_content textEditor" id="box_count_0" style="stuff...">

[mytext mytext mytext mytext...]
<img src="http://www.example.com/images/filename1.gif">
[mytext mytext mytext mytext...]

<ins id="aswift_0_expand" style="stuff...">
<ins id="aswift_0_anchor" style="stuff..">
<iframe name="aswift_0" width="300" height="250" id="aswift_0" frameborder="0" marginwidth="0" marginheight="0" scrolling="no" vspace="0" hspace="0" allowfullscreen="true" style="left: 0px; top: 0px; position: absolute;" allowtransparency="true">
</iframe>
</ins>
</ins>

[mytext mytext mytext mytext...]
<img src="http://www.example.com/images/filename2.gif">
[mytext mytext mytext mytext...]

</div>

Is the "iframe" the problem? Or something else? Either way, how do I go about blocking this?

Thank you.

Eliz.

[edited by: Ocean10000 at 8:15 pm (utc) on Jun 4, 2015]
[edit reason] exemplified. [/edit]

whitespace

9:05 am on Jun 4, 2015 (gmt 0)

10+ Year Member Top Contributors Of The Month



You might need to post more code? Something not quite right, as your first block should be blocking. Are you using a CDN?

RewriteCond %{HTTP_REFERER} !^www.good_domain.com/.*$ [NC]


HTTP_REFERER holds the full absolute URL, e.g. "http://.......", so the anchored pattern above never matches and the negated condition is always true. Either match the full URL or remove the anchors ^ and $. For example:


RewriteCond %{HTTP_REFERER} !www\.good-domain\.com


RewriteCond %{HTTP_REFERER} ^54\.67\.49\.123.*$ [NC]


The HTTP_REFERER doesn't hold the IP address, so this will never match. It should be something like:

RewriteCond %{REMOTE_ADDR} =54.67.49.123


Only use the NC flag if you specifically need a case-insensitive match.
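
Putting those corrections together, a minimal sketch of the whitelist block might look like the following. It assumes mod_rewrite is enabled, that the .htaccess file sits in the document root, and that www.example.com stands in for the site's own hostname (as in the exemplified markup in the first post); good-domain.com is the placeholder used above. Once the whitelist patterns match full URLs, any referer not on the list is already redirected, so a separate rule for the scraper's domain should not be needed. Note also that REMOTE_ADDR is the address of whatever client requests the image, which for a hotlinked <img> is normally the visitor's browser rather than the scraper's server.

```apache
# Sketch only: hostnames are placeholders, not tested against a live server.
RewriteEngine On

# Let requests with an empty Referer through (direct visits, some proxies).
RewriteCond %{HTTP_REFERER} !^$
# Let the site's own pages through. The Referer is a full URL, so match the scheme too.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com([/:]|$) [NC]
# Let a whitelisted third party through.
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?good-domain\.com([/:]|$) [NC]
# Any other referer asking for a .jpg or .gif is redirected to the replacement image.
RewriteRule \.(jpg|gif)$ /hotlink.png [R=302,L,NC]
```

Because the replacement is a .png, the redirect target does not match the rule again, so no loop guard is needed in this form.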

not2easy

1:03 pm on Jun 4, 2015 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If your content is being shown in an iframe then .htaccess hotlink protection is bypassed because the images are being called from your own site. You can append headers that specify "SAMEORIGIN" by adding this to your .htaccess file:
 Header append X-FRAME-OPTIONS "SAMEORIGIN" 


An iframe breakout javascript is another alternative, but it would need to be added to the <head> of each page, so it is more useful for small static sites.
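
For reference, a slightly more defensive form of the header directive is sketched below. It assumes mod_headers may or may not be loaded; the <IfModule> wrapper and the use of "always set" instead of "append" are choices made for this sketch, not part of the suggestion above. The header only tells browsers not to render the response inside a frame on another origin; it has no effect on images requested through an <img> tag.

```apache
# Sketch: skipped silently if mod_headers is not loaded.
<IfModule mod_headers.c>
    # "set" overwrites the value rather than appending a second one;
    # "always" also applies the header to error responses.
    Header always set X-Frame-Options "SAMEORIGIN"
</IfModule>
```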

lucy24

4:13 pm on Jun 4, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Is the "iframe" the problem?

I don't see how it can be, here, because the hotlinked material isn't inside the iframe.

Under what circumstances would a referer ever come through as a numerical IP instead of a hostname?

stapel

9:02 pm on Jun 4, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



not2easy said:
If your content is being shown in an iframe then .htaccess hotlink protection is bypassed because the images are being called from your own site.

But that's the thing: it isn't the whole page that's framed. The textual content has been scraped and reposted on the remote site. Only the images are being requested directly from my server by the remote page. Granted, the call is coming from inside some sort of framing, but I'd thought the call should be blocked by the existing .htaccess coding.

whitespace said:
Something not quite right, as your first block should be blocking.

That's kinda my point: I don't understand what's different in this particular case. (And no, I'm not using a CDN.)

not2easy said:
You can append headers that specify "SAMEORIGIN" by adding this to your .htaccess file:

Header append X-FRAME-OPTIONS "SAMEORIGIN" 

Thanks. I've added the code to my .htaccess file. (I can't verify if it's working, though, because the scraper has decided to password-protect "his" content.)

not2easy said:
An iframe breakout javascript is another alternative....

If the entire page had been framed, yes, this (existing) coding would have ("should have"?) been helpful. But since it's "only" the images that are being called from within the frame, I think that the JavaScript never comes into play.

lucy24 said:
Under what circumstances would a referer ever come through as a numerical IP instead of a hostname?

Dunno. I was just trying to be thorough. ;-)

tangor

9:25 pm on Jun 4, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



At present, instead of attempting to block, create a "STOLEN IMAGE" for each of the compromised images, change the names of the images you wish to keep, update your content, then lock down ALL images with SAMEORIGIN and the other suggestions above.

The first step is to stop the theft, or at least make it unpleasant (some webmasters might make those "stolen images" pr0n of some kind). If you want to risk going to war, include "Stolen by WEBSITENAME". Pick your poison, but stop the theft. A text image is much smaller than an actual photo or artwork, so it keeps the bandwidth within reason.
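
One way to sketch this swap in .htaccess, assuming the file sits in the document root and mod_rewrite is enabled: filename1.gif and filename2.gif are the old names from the scraped markup in the first post, and /images/stolen.png is a hypothetical small text image. Because the site's own pages would use the new names after the rename, the old names need no referer check.

```apache
# Sketch only: file names are placeholders taken from the thread or invented here.
RewriteEngine On
# Old image names, now requested only by the scraped copy, are answered with
# the small replacement image via an internal rewrite (no redirect).
RewriteRule ^images/(filename1|filename2)\.gif$ /images/stolen.png [L]
```

The response is then a PNG served at a .gif URL. Browsers generally detect the image type from the content, but that is worth checking before relying on it.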