Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 47 message thread spans 2 pages.
Is it safe to block hotlinking when there is no referrer?
Sgt_Kickaxe




msg:4541029
 6:19 pm on Jan 31, 2013 (gmt 0)

Google's new image search layout no longer sends a referrer when hotlinking your image, as it used to. The result is that Google now shows your image hotlinked on their site instead of loading a cached copy from their own server.

This isn't ideal since scrapers generally grab the url of the image they are stealing and I'd much prefer they grab Google's cached url instead of my site's url.

Is it safe to go ahead and block images from displaying if there is no referrer? A blank referrer can happen in some situations such as when behind a company firewall.

I'm asking this from an SEO standpoint, not about code, but here is the htaccess that will be left if the blank referrer check is removed.

rewritecond %{HTTP_REFERER} !^http://(www\.)?example\.com [NC]
rewriterule \.(gif|jpe?g|png)$ - [NC,F]
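
For comparison, here is a rough sketch of the more common form that still contains the blank referrer check. It is only an illustration: the `https?` in the pattern is an addition that is not in the rules above (it assumes the site may also be served over HTTPS), and example.com is the same placeholder domain.

```apache
RewriteEngine On
# Let requests with an empty Referer through (privacy tools, firewalls, type-ins)
RewriteCond %{HTTP_REFERER} !^$
# Block anything whose Referer is not this site, over http or https
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com [NC]
RewriteRule \.(gif|jpe?g|png)$ - [NC,F]
```

Dropping the first RewriteCond gives the strict version asked about here, where a blank referrer gets a 403 like any other outside request.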

 

indyank




msg:4543684
 3:13 am on Feb 8, 2013 (gmt 0)

In your case it is very simple. Use hotlink protection.

matrix_jan




msg:4543705
 3:49 am on Feb 8, 2013 (gmt 0)

Not that simple. Google hotlinks through https. There is no referral data, which forces me to select who has the right to access directly to my images.

indyank




msg:4543865
 5:21 pm on Feb 8, 2013 (gmt 0)

"which forces me to select who has the right to access directly to my images."


that is exactly what I meant.

indyank




msg:4543866
 5:26 pm on Feb 8, 2013 (gmt 0)

As I understood it, you didn't want anyone to hotlink your images, including Google. So white-listing is the best option.

kleverkode




msg:4545778
 11:32 pm on Feb 14, 2013 (gmt 0)

@matrix_jan very true. There is an interesting solution of sorts on Stack Overflow [stackoverflow.com], but I have not used it myself. It looks like there is some kind of logic behind it; maybe people more expert on the topic than me can check it out.

I lost more than 60% of my traffic with the new search, it's really crazy.

lucy24




msg:4545792
 1:16 am on Feb 15, 2013 (gmt 0)

Ooh, that's kind of ingenious. It's based on going one step back from what most fixes use. You have to give g### image search a different file in the first place-- that is, a file with a different URL-- because that is the only sure way to tell if someone is looking at a search result. This part, of course, can't be done retroactively; you have to work it in slowly as they re-crawl your site.

There's one big risk though, and the post itself points it out:
"just like other solutions it's up to Google to inte[r]pret it as cloaking and ban at their will"

So it will only work as long as the googlebot doesn't realize that nobody else is getting redirected. It has to be a redirect, not a rewrite, because the whole point is to create a different URL.*
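
The idea as described here could be sketched roughly as below. This is not the Stack Overflow answer itself, only an illustration under assumptions: the rules sit in the document root .htaccess, the /search-copies/ directory is hypothetical, the image crawler is recognised by the Googlebot-Image token in its user agent, and the cloaking risk quoted above applies in full.

```apache
RewriteEngine On
# Do not redirect requests that are already for the alternate copies
RewriteCond %{REQUEST_URI} !^/search-copies/
# Only the image crawler gets sent to the alternate URL
RewriteCond %{HTTP_USER_AGENT} Googlebot-Image [NC]
# External redirect (not an internal rewrite), so the indexed URL differs
RewriteRule ^(.+\.(gif|jpe?g|png))$ /search-copies/$1 [NC,R=302,L]
```

With something like this in place, any rule that treats search results differently would only need to look at requests under the hypothetical /search-copies/ path.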

Now, why is this part of the discussion here rather than in the ongoing Google-Image-Search thread?


* You could theoretically do it as a rewrite at the original URL-- but only if you had some supplementary php business that cross-checked the filesize down to the last byte to determine whether you're dealing with the real file or the search engine's version.

JS_Harris




msg:4594993
 7:13 am on Jul 21, 2013 (gmt 0)

rewritecond %{HTTP_REFERER} !^http://(www\.)?example\.com [NC]
rewriterule \.(gif|jpe?g|png)$ - [NC,F]


That is basic hotlink protection at its simplest. It does not care if the referrer is blank or what your intentions are: if you're not on my site you're not seeing my image, period.

Effects - Google image bot passes your domain as referrer when loading images, and so your images will be indexed. *However*, since Google.com cannot display said image they display their own lower resolution cache copy instead. Clicking on the image still takes you to your site, and clicking on the 'visit site' link will still take you to your site, but clicking on the 'view original image' link will fail because of the 403 forbidden code above.

Benefits
- Google indexes images without displaying your actual image, no resources *borrowed* etc.
- Scrapers inadvertently steal/hotlink Google's cache copy instead of your original (cuts down on spam links too).
- Links to your site work, links directly to image urls do not.


In my opinion this is how Google should work anyway. Google should never hotlink full resolution originals for their services, it's unethical to begin with.

I hope that answers why you don't need to block Google's image bot via robots.txt if you don't want Google to hotlink your images (which makes it easy for others to find and hotlink too).

Note - the effect of hotlink protection is much more pronounced on Bing's image search because Bing links their images directly to your image url and not the page url the image is on. *shame on Bing*

lucy24




msg:4594999
 8:12 am on Jul 21, 2013 (gmt 0)

"Google image bot passes your domain as referrer when loading images"

Er, no it doesn't. Search engines request images "cold", without referer. Look at your logs.

"since Google.com cannot display said image they display their own lower resolution cache copy instead"

They do this anyway. When a human clicks on an item in Image Search, google immediately requests the image from your site-- but it isn't actually displayed until the searcher proceeds to the next step, "display the image". Meanwhile they see the cached version. The effect is to make the search engine-- and your own site!-- look good because the image is ready and waiting in the user's browser cache, and comes up with no delay.

"Bing links their images directly to your image url and not the page url the image is on"

So does the new google image search. It was unrolled, oh, at least a year ago. Are you returning after a long absence? Or using an older browser? (Not necessarily de facto old. Just "old" according to google's UA detection.)

Rosalind




msg:4595047
 2:42 pm on Jul 21, 2013 (gmt 0)

I surf with the referrer blocked. It makes logging in or completing some forms difficult, although I don't believe it affects images on many sites. So humans with blank referrers are going to be used to a small amount of inconvenience.

One solution may be to redirect the image to a message that says, "you're seeing this because your referrer is switched off". If you want to go further, put up a link that leads to a cookie which will make the images visible again. If it's one click, it might work. Just don't ask people to re-engage their referrer, as Pinterest does, because that's too much hassle and may not be possible for some users.
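
A minimal sketch of that idea, with several assumptions: /no-referrer.html, the /show-images link and the show_images cookie are all hypothetical names, example.com is a placeholder, and the rules live in the document root .htaccess.

```apache
RewriteEngine On
# One-click opt-in: set a cookie for a day (1440 minutes), then go to the home page
RewriteRule ^show-images$ / [CO=show_images:1:example.com:1440,R=302,L]

# Image requests with no Referer and no opt-in cookie go to the message page
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_COOKIE} !(^|;\s*)show_images=1(;|$)
RewriteRule \.(gif|jpe?g|png)$ /no-referrer.html [NC,R=302,L]
```

The message page would carry the link to /show-images. Anything else that requests images without a referrer, crawlers included, would be redirected as well, so this is a trade-off rather than a clean fix.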

JS_Harris




msg:4595065
 4:17 pm on Jul 21, 2013 (gmt 0)

Lucy, you're right that they load images 'cold', but they still manage to take a copy (I will need to dig deeper to find out how). I've had hotlink protection on for ages, exactly as in the opening post, and Google indexes all images anyway. They display their own cache copy in image search results, never my original images hotlinked.

I've checked in both new and old browsers, I don't see any difference. Google links directly to my page from two places (the image is linked, as is the visit page button) and to the image url once (view original image). With the hotlink protection mentioned here the 'view original image' link fails (403 due to no referrer), but the links to the page the image is on do not.

As long as Google indexes the images using their own cache copy but still links the images to my pages then I have no plans to stop using hotlink protection.

Let's take a step back
Many webmasters were hit with a 60-70% image traffic loss when the new Google image layout was rolled out; it was discussed here and written about on searchengineland as well. An excellent writeup with graphs was posted here: [pixabay.com...] The reason given for the loss was that full sized, full resolution images were shown on Google in a way that doesn't encourage visitors to actually leave Google image search. Your images used to be displayed on a new page with your website loaded in the background, but now the images are loaded right into the image search results page without your site in the background. It's infinitely easier to switch to other images now, with no benefit to webmasters; many images even have competing image thumbnails from other sites shown beside them.

Google countered the claims of traffic loss by stating that most webmasters were getting a 25% increase in image traffic due to now having two links lead to their pages instead of just one. The obvious discrepancy aside, the fact remains that it's easier to stay on Google image search to see full sized, full resolution images than it is to visit your site. Also, without your page being shown in the background there is nothing enticing the visitor to see where the image came from.

Solutions tried
- Watermark overlay images when loaded on Google image search with text such as 'click for full resolution'. This immediately increases your image CTR but traffic from this method drops steadily over the following months. Reason? Only google knows for certain but it's likely that your image metrics suffer and slowly lose rank, too many backpage buttons being pressed.

- Redirect 'view original image' link so that visitors land on the page the image is on and not on the image url. This is an involved process and it sends less image traffic to your site than watermark overlays do but the traffic does not decline in the same manner over time.

- Inline watermarking your images. Add a 20 pixel strip to the bottom of all images and, using css, float a 20 pixel div over this section so that visitors on your site do not see the watermark strip. The text in your watermark strip might read something like 'click to see this image on example.com' or something similar. Google image search will then display the full image with watermark strip but it doesn't impact your visitors at all, they never see it on your site. No official word on if this is considered showing visitors something different than you show search bots but it is effective and does get indexed.

- robots.txt block google's image bot? Surely giving yourself a 100% image traffic loss can't be the best solution.

In all of the above you can safely leave hotlink protection on, exactly as shown in the opening post, and it will not interfere with any of the current solutions. Why not? Image hotlink protection is a hurdle Google has decided to jump in order to stock their image search results. Since that's how they deal with image hotlink protection you can have it on and not lose search traffic.

My answer to the original question is that yes, it's safe to use hotlink image protection as it is a common practice in protecting images. Google ignores it and shows a copy of the image anyway.

lucy24




msg:4595099
 9:12 pm on Jul 21, 2013 (gmt 0)

"One solution may be to redirect the image to a message that says, 'you're seeing this because your referrer is switched off'."

I do this in some directories :) But the redirect just leads to a generic "sorry" page-- the same one I put up in the rare case where a page requires either a cookie or some specific referrer. Since the server hasn't been given any special instructions about the page (html extension) the net effect is that ordinary in-page image requests will get a blank space, as if the image file didn't exist. Only direct requests (type-ins, bookmarks, some e-mail) will end up on the "Sorry" page. I can tell how rare they are because hardly any requests are accompanied by requests for the page's freestanding CSS file.

"If you want to go further, put up a link that leads to a cookie which will make the images visible again."

If you've chosen to block referers, are you likely to let the site set a cookie? I tend to think of referer blocking, like UA obfuscation and proxy connections, as even further along the privacy scale.

"I've checked in both new and old browsers, I don't see any difference."

Google thinks Camino is an old browser, probably thanks to the "like Firefox 3.6" in the UA string, so I get the old-style image search with the full page along the side. Other browsers get the new-style search.

On my personal site, image requests with "blank.html" as referer are rewritten to an administrative gif set to expire immediately. That means the server doesn't have to send the whole 10-100K image file-- but if the user does ask to view the image in isolation, the browser will display the real thing.
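
Roughly, a setup of that kind might look like the sketch below. The file name /images/dot.gif is hypothetical, the referer test simply looks for a URL ending in /blank.html, and the Header line assumes mod_headers is available; my real rules differ in the details.

```apache
RewriteEngine On
# Never rewrite the placeholder itself, or the rule would loop
RewriteCond %{REQUEST_URI} !^/images/dot\.gif$
# Referer ends in /blank.html, as seen on image search preloads
RewriteCond %{HTTP_REFERER} /blank\.html$ [NC]
RewriteRule \.(gif|jpe?g|png)$ /images/dot.gif [NC,L]

<Files "dot.gif">
    # Expire at once so a later direct request fetches the real image
    Header set Cache-Control "no-cache, max-age=0"
</Files>
```

Because it is an internal rewrite rather than a redirect, the requested URL stays the same and only the bytes sent back change.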

There's one aspect of traffic loss I haven't yet seen answered. It may be unanswerable. When people come to your site via an old-style image search, do they stick around? Visit more pages, click on ads? If all they ever did was look at the one picture, you haven't lost anything. You may even have gained, because the server is doing less work for the same result.

"it's safe to use hotlink image protection as it is a common practice in protecting images. Google ignores it and shows a copy of the image anyway"

I don't understand this line. Server-based hotlink protection doesn't work on the honor system like robots.txt; it's absolute. Come in with the wrong referer, and all you'll get is the blaring "No Hotlinks" image. If your hotlink routine exempts referer-less requests, then search engines have no way of knowing that you block hotlinks.
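
To make that concrete, a routine that serves a replacement image and exempts referer-less requests might look something like this sketch. The file /images/nohotlink.png is hypothetical, example.com is a placeholder, and the `https?` is an assumption about how the site is served.

```apache
RewriteEngine On
# Never rewrite the replacement image itself, or the rule would loop
RewriteCond %{REQUEST_URI} !^/images/nohotlink\.png$
# Exempt requests that arrive with no Referer at all
RewriteCond %{HTTP_REFERER} !^$
# Anything else that is not from this site gets the replacement
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com [NC]
RewriteRule \.(gif|jpe?g|png)$ /images/nohotlink.png [NC,L]
```

A crawler that sends no referer passes the second condition's exemption and receives the real file, which is why it never learns that hotlinks are blocked.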

not2easy




msg:4595104
 9:29 pm on Jul 21, 2013 (gmt 0)

According to what I see in access logs, Bing Preview and Google Web Preview request not only the image, but all associated files from the page where the image is located - css, js and all. It was causing an extreme waste of bandwidth, invalid clicks and cookie stuffing, so for a while I just added |Preview| to the list of blocked User_agents. They got a 500 error, but that's better than disabling AdSense. A visit to developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag gave me
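<Files ~ "\.(png|jpe?g|gif)$">
# Ask search engines not to keep a cached copy of these images
Header set X-Robots-Tag "noarchive"
# One combined header: a second "Header set" would overwrite the first
Header set Cache-Control "no-cache, no-store"
</Files>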


This is because they are not hot-linking to the images, but using a cached version in the non-visitor's browser, as if BingPreview were the user's browser. When someone clicks on your image in the Image Search results page, the largest version crawled is downloaded to the searcher's browser cache, but not shown unless the viewer clicks on "View Original Image" (which used to take the viewer to your site) - then they get shown the largest, highest resolution image from your site - from their browser cache and still on the image search site's "lightbox" - with no way to get to your site although it is downloaded and shown behind the image. Exact details vary between the two Image Search services, but neither is more than a scraper as far as image owners' benefits are concerned.

lucy24




msg:4595118
 10:41 pm on Jul 21, 2013 (gmt 0)

But, but, but
:: splutter ::
Preview is a completely different process from image search. Google and Bing-- also Seznam's Screenshot Generator and probably a few others-- both give a thumbnail of the entire page.

If you let them, they will request absolutely everything including javascript. The more you think about it the less sense this makes, since most scripting comes down to either user interaction or feature detection-- neither of which has any meaning in a preview. The version of a page you see in Preview is what the robot would see if it were human, not what you would see in your own browser.

"with no way to get to your site although it is downloaded and shown behind the image"

Yes, that's the most annoying and inexplicable feature of the new Image Search. If you do go to the image, you end up on the image alone, on a page by itself, exactly as if you'd typed in the URL. And then you are stuck, with no way out except hitting the browser's Back button or closing the tab. Never mind the site: I can't understand how this benefits the search engine.

JS_Harris




msg:4595120
 11:04 pm on Jul 21, 2013 (gmt 0)

It doesn't benefit anyone but it does keep visitors on google when they backpage. There must be some data gathering mechanism in play or they wouldn't bother doing this to visitors. Google is always all about gathering data.

"I don't understand this line. Server-based hotlink protection doesn't work on the honor system like robots.txt; it's absolute."


And if the goal is to stop the image from being hotlinked, it works, but Google still takes a copy of the image and displays the copy in image search. My images are indexed just fine despite Google being unable to hotlink them. How? I'm not entirely sure; the logs don't tell, but the images are definitely indexed.

Try placing the code in the opening post temporarily into your .htaccess file and visit your images in Google image search: they will all still be there but will no longer be hotlinked. Scrapers, using the image gallery kiddie script of the moment, will end up grabbing and likely linking to Google's version instead of yours. The reduction in bad links and in being outranked by your own images is heavenly; Google will never outrank the original with an image hosted on (and hotlinked from) Google. Traffic generated by clicking on the image or on the 'visit site' link, however, still lands on your site.

lucy24




msg:4595182
 3:13 am on Jul 22, 2013 (gmt 0)

I don't understand what you mean by "google hotlink".

:: looking vaguely around for someone who can translate ::

JS_Harris




msg:4598347
 8:32 am on Aug 1, 2013 (gmt 0)

I'm not sure how to more clearly explain it than this:

- My images, and your images, and everyone's images are hotlinked by Google.com's image search engine. Your images, your servers and your bandwidth is used to display images to visitors on Google's website.

And I am not asking how to stop that from happening. I'm telling you that if you employ hotlink protection your images will STILL be in Google image search; however, they will be served from Google's copy on their own server instead. That solves the scraper problem, as the cache copies are lower resolution and the source url scraper bots get from Google's copy is Google's, not yours.

Back on topic: To make hotlink protection work against Google you need to remove the blank referrer condition in htaccess, because Google does not provide you with a referrer when loading your images for their visitors. Besides Google, who else might be affected by that?

lucy24




msg:4598377
 11:45 am on Aug 1, 2013 (gmt 0)

"My images, and your images, and everyone's images are hotlinked by Google.com's image search engine. Your images, your servers and your bandwidth is used to display images to visitors on Google's website."

No, they're not. The images that google loads up when someone clicks on a result in Image Search are not the images that are actually shown to the searcher. You can rewrite image requests with "/blank.html" referers to a single-pix gif with no visible effect at all for the human at the other end. And the bandwidth involved in serving up that dot has got to be less than the bandwidth of serving up a 403.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved