My site recently lost image traffic because its images were wrongly labeled explicit under Google's SafeSearch settings. In trying to first diagnose and then resolve this issue I stumbled upon a can of worms and am left with one major question.
Should I block major search engines from being able to hotlink my images? Before I could answer that I had to dig into how the big search engines handle images and hotlinking in their image search features and, thankfully, the big three do it very similarly. To start at the beginning, this is typical hotlink protection code within .htaccess:
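RewriteEngine On
# Allow requests that send no referer at all
RewriteCond %{HTTP_REFERER} !^$
# Allow the major search engines to hotlink
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?google\.com [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?bing\.com [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?yahoo\.com [NC]
# Allow your own site over http or https (replace example.com)
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?example\.com [NC]
# Any other referer gets 403 Forbidden for image files
RewriteRule \.(gif|jpe?g|png)$ - [NC,F]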
What that does is block any website except your own from displaying your images, while requests that send no referer and requests referred by Google, Bing and Yahoo are still served, so the search engines can continue displaying your images how they see fit.
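One caveat about the search engine patterns: they only match the bare and www hosts, so a referer such as images.google.com or google.co.uk would not be let through. If that matters, the Google condition could be loosened roughly as below. This is a sketch only; the TLD alternation is an assumption about which regional domains are worth covering, not an exhaustive list, and Bing and Yahoo would need the same treatment.

```apache
# Sketch only: allow any Google subdomain plus common regional domains.
# The TLD alternation is an assumption, not a complete list.
RewriteCond %{HTTP_REFERER} !^https?://([a-z0-9-]+\.)*google\.(com|co\.[a-z]{2}|com\.[a-z]{2}|[a-z]{2})(/|$) [NC]
```

The trailing (/|$) keeps a look-alike host such as google.evil.com from slipping through. The Referer header can still be forged by any client, so this remains a deterrent rather than real access control.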
How Google handles being blocked: They display your image anyway; however, instead of hotlinking directly to your image file they serve up a Google-hosted thumbnail of the image. This also disables the "view full size" link in Google Images: since Google attempts to display the image on their domain, a forbidden error is displayed instead. The link to your page works as it always did.
- Your images are still indexed just fine!
How is automated content affecting you? There are several article creation tools making the rounds with the typical "buy it now for just x amount, limited time, exclusive rights, yadda yadda" pitch, and in trying to diagnose my original problem I found countless automated sites aggregating textual content and now including image hotlinks within articles. It was eye-opening how many, in fact.
Since these automated content creators scrape the search engines for content, they gather your image URLs from image search and hotlink the images within automated content which is, inherently, 100% duplicate.
Why typical hotlink protection isn't enough anymore: Hotlink protection stops such sites from displaying your images, but it doesn't stop the inclusion of your image URL within the code of these sites, and the owners really don't care to remove it even if it's blocked. Google doesn't seem to care whether the image renders or not; they just record the source code and go on their way, which means your image is now associated with auto sites.
If the content on the auto-site is explicit in nature, your images can quickly become labeled explicit even if they never actually appear on that site at all. This I can confirm 100% from my recent image traffic issues.
So is it time to stop Google from hotlinking images too? I can confirm that your images will still be indexed and rank just fine if you stop Google from hotlinking them, so there seems to be no real reason to allow search engines to display a hotlinked image at all; it's dangerous to rankings. If Google is instead displaying a thumbnail because they are blocked from hotlinking, then the automated content creator scripts pull the Google thumbnail URL to hotlink instead of the real URL, which completely disassociates you from the problem.
Still, these are major search engines, so I am hesitant to remove the following from my .htaccess:
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?google\.com [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?bing\.com [NC]
RewriteCond %{HTTP_REFERER} !^http(s)?://(www\.)?yahoo\.com [NC]
But given the explosion of automated content on the net as it gets easier and easier to obtain, do you really have a choice?
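For reference, dropping those three lines would leave a rule set along these lines. This is a sketch rather than a tested configuration: example.com stands in for your own domain, and the RewriteEngine On line is only needed if it isn't already set earlier in the file.

```apache
RewriteEngine On
# Still serve requests that send no referer at all
RewriteCond %{HTTP_REFERER} !^$
# Only your own site may embed the images (example.com is a placeholder)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com [NC]
# Every other referer, search engines included, gets 403 Forbidden
RewriteRule \.(gif|jpe?g|png)$ - [NC,F]
```

Because the empty-referer condition stays in place, any fetch that sends no Referer header is still served, which fits the observation above that blocked images continue to be indexed.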