
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
Blocking Google Snippet
slipkid



 
Msg#: 4561257 posted 9:00 am on Apr 4, 2013 (gmt 0)

I am trying to block Google's snippet function in .htaccess and I'm having no success.

Here is a typical log entry:

66.249.84.5 - - [04/Apr/2013:01:56:56 -0400] "GET / HTTP/1.1" 200 5160 "-" "Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0 Google (+https://developers.google.com/+/web/snippet/)"

I have tried the following with no success:

SetEnvIf User-Agent ^\+https://www\.developers\.google\.com/\+/web/snippet/$ bad_bot

Can someone steer me in the right direction? Thanks.

 

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time



 
Msg#: 4561257 posted 8:38 pm on Apr 4, 2013 (gmt 0)

Why make it more complicated than it needs to be? If you're using mod_setenvif, a simple

BrowserMatch snippet bad_bot

should be enough.

Your pattern will always fail, because it has opening and closing anchors while the actual UA has more content both before and after.
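For anyone landing here later, a minimal sketch of how the unanchored match might be wired up to an actual block. The bad_bot variable name comes from the posts above. The access-control lines are an assumption about the rest of the .htaccess, which the thread never shows, and the right stanza depends on the Apache version.

```apache
# Tag any request whose User-Agent contains "snippet" anywhere in the string.
# No ^ or $ anchors, so text before and after the match is allowed.
BrowserMatch snippet bad_bot

# Apache 2.2 style (assumed here): allow everyone except tagged requests.
Order Allow,Deny
Allow from all
Deny from env=bad_bot

# Apache 2.4 style equivalent; use this instead of the three lines above.
# <RequireAll>
#     Require all granted
#     Require not env bad_bot
# </RequireAll>
```

Tagged requests should then get a 403 instead of the 200 shown in the log entry. BrowserMatch is case-sensitive; BrowserMatchNoCase is the case-insensitive variant.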

lucy24




 
Msg#: 4561257 posted 1:45 am on Apr 6, 2013 (gmt 0)

:: bump ::

Whoops! Now I need to go back and change the answer into a question. What is that
https://developers.google.com/+/web/snippet/
UA anyway? I've never set eyes on it until about a week ago, unless it was on a very long vacation. The web page leaves me none the wiser:

When a user shares a URL on Google+ or an app writes an app activity, Google+ attempts to fetch the content and create a snippet to provide a summary of the linked content.

Your web server will see a request with the user agent {et cetera}


Doesn't that make it sound like a preview? But all my logged snippets just pick up two things: the front page and the favicon. Thereby making mincemeat of my ordinary log-wrangling, which assumes that a page request followed by a favicon request is a human. The literal plus sign is a troublemaker too.
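On the plus sign: in a regular expression an unescaped + is a quantifier, so "/+/" would mean "one or more slashes, then a slash" rather than the three literal characters in the UA. A hedged sketch of a pattern that matches the documentation URL itself, reusing the bad_bot variable name from earlier in the thread:

```apache
# \+ matches a literal plus sign; \. matches a literal dot.
# No ^ or $ anchors, because the URL sits in the middle of the UA string.
# The logged UA has no "www." in front of developers.google.com.
SetEnvIf User-Agent "developers\.google\.com/\+/web/snippet" bad_bot
```

This only sets the variable. Whether to block on it or just use it to separate these hits from human traffic is a separate decision.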

Hm. Maybe this all belongs in a different thread.

Simon84



 
Msg#: 4561257 posted 1:00 pm on May 14, 2013 (gmt 0)

Why on earth would you want to block this bot? This is Google coming along to build a proper preview for a link to your site that has been shared! If they build a better link preview for your site, more people are likely to visit it, which means more page views, conversions, users, or whatever you're trying to achieve!

lucy24




 
Msg#: 4561257 posted 6:34 pm on May 14, 2013 (gmt 0)

The snippetbot is NOT a preview. Look at its behavior, not at its propaganda.

slipkid



 
Msg#: 4561257 posted 8:34 am on May 15, 2013 (gmt 0)

There is little that Goofy does these days that does not set off warning bells.

Here are two examples of hits to raw logs:

66.249.84.5 - - [15/May/2013:02:11:23 -0400] "GET / HTTP/1.1" 200 5160 "-" "Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0 Google (+https://developers.google.com/+/web/snippet/)"

66.249.84.5 - - [15/May/2013:02:11:23 -0400] "GET /favicon.ico HTTP/1.1" 200 648 "-" "Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0 Google (+https://developers.google.com/+/web/snippet/)"

Without a referrer, I have no way of knowing which of my web pages is being rendered with a snippet per the above UA.

As far as I know, users of my site may be reading the snippet and going elsewhere.

At the moment, snippets are not blocked -- I am willing to change my code if this feature of Goofy is actually causing a loss of genuine visitors.

