
Forum Moderators: Ocean10000 & incrediBILL


SnapPreviewBot

   
7:51 am on Jan 22, 2007 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



38.98.19.## - - [21/Jan/2007:03:54:12 -0500] "GET / HTTP/1.1" 200 3816 "http://www.apassion4jazz.net/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7 SnapPreviewBot"

It took every file referenced by index.html, including images, scripts, and CSS. No request for robots.txt. The last octet (D class) of the IP address changed with each request.
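For anyone who wants to check their own logs for the same behaviour, here is a minimal sketch in Python. It assumes the log is in Apache "combined" format, as the line above appears to be; the names `COMBINED`, `parse_line` and `fetched_robots_txt` are made up for illustration.

```python
import re

# Apache "combined" format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

def fetched_robots_txt(entries, agent_token):
    """True if any entry whose UA contains agent_token requested /robots.txt."""
    return any(
        e["path"] == "/robots.txt" and agent_token in e["agent"]
        for e in entries
    )

# The log line quoted above, with the last octet still masked as "##".
sample = (
    '38.98.19.## - - [21/Jan/2007:03:54:12 -0500] "GET / HTTP/1.1" 200 3816 '
    '"http://www.apassion4jazz.net/" "Mozilla/5.0 (X11; U; Linux i686; en-US; '
    'rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7 SnapPreviewBot"'
)
entry = parse_line(sample)
```

Feeding every parsed line of a log into `fetched_robots_txt(entries, "SnapPreviewBot")` would show whether that agent ever asked for robots.txt.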

11:02 pm on Jan 22, 2007 (gmt 0)

5+ Year Member



There are two earlier threads about Snapbot (I assume it is the same thing with a minor name change):

[webmasterworld.com...]
[webmasterworld.com...]

I have the whole 38.x.x.x range denied in .htaccess; nothing good ever comes from there.
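If you would rather test addresses in a script than in .htaccess, the same idea can be sketched in Python with the standard `ipaddress` module. This assumes the "38. class" means the whole 38.0.0.0/8 block; `BLOCKED` and `is_blocked` are illustrative names, not anything from the posts above.

```python
import ipaddress

# Assumption: the denied range is the entire 38.0.0.0/8 block.
BLOCKED = ipaddress.ip_network("38.0.0.0/8")

def is_blocked(ip):
    """True if ip parses as an address inside the blocked network."""
    try:
        return ipaddress.ip_address(ip) in BLOCKED
    except ValueError:
        # Masked or malformed addresses (e.g. "38.98.19.##") cannot be checked.
        return False
```

Note that addresses masked with "##", as in the log excerpt in this thread, do not parse and so are reported as not blocked.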

[edited by: Mokita at 11:06 pm (utc) on Jan. 22, 2007]

11:29 pm on Jan 22, 2007 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



keyplyr,

Bunch of old threads on this provider:

[google.com...]

12:50 am on Jan 23, 2007 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



My logs show the first time they used "SnapPreviewBot" in the UA was 01/05/2007.

1:01 am on Jan 23, 2007 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Thanks for the links to previous discussions. I always search the forums prior to posting and this is what I got: "Your search - SnapPreviewBot - did not match any documents."

6:55 am on Jan 23, 2007 (gmt 0)

WebmasterWorld Administrator incredibill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



That's because it's a NEW name in the user agent; see my post above yours.

2:30 am on Feb 8, 2007 (gmt 0)

10+ Year Member



Re SnapPreviewBot: I discovered that someone linking to my site from their WordPress blog had a plugin installed that pops up a window on link mouseover. The popup contains a preview of the target web page, generated on the fly by snap.com (presumably they cache the snapshot for a while).

I could see in my logs hits coming from 38.* IP addresses with the user agent "Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9". The page and all graphics and stylesheets were downloaded, including favicon.ico, but robots.txt wasn't requested.

So it doesn't seem to be a standard spider, in that it doesn't actively crawl a page until someone tries to view the preview image.

12:55 am on Feb 9, 2007 (gmt 0)

5+ Year Member



So it doesn't seem to be a standard spider, in that it doesn't actively crawl a page until someone tries to view the preview image.

Correct - but its brother, Snapbot, does actively crawl pages - that is how they are indexed by Snap in the first place.
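To tell the two apart in a log, a rough sketch in Python follows. It assumes the crawler's user agent contains the token "Snapbot", which is a guess based on the name used in this thread (no Snapbot UA string is quoted here); `classify_agent` and its labels are illustrative only.

```python
def classify_agent(user_agent):
    """Label a UA string as Snap's preview fetcher, its crawler, or neither."""
    ua = user_agent.lower()
    # Check the longer token first; the token may sit anywhere in the UA string.
    if "snappreviewbot" in ua:
        return "preview"   # fetches a page on demand, when a preview is viewed
    if "snapbot" in ua:    # assumed token for the crawler
        return "crawler"
    return "other"
```

Both SnapPreviewBot user agents quoted in this thread come back as "preview", even though the token appears in a different position in each.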

2:21 am on Feb 9, 2007 (gmt 0)

WebmasterWorld Senior Member keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I'm not happy they capture my content/images... but it appears to be a legit search service and toolbar. Haven't seen many referrals yet, however.

2:48 am on Feb 9, 2007 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Whoever named this "SnapPreviewBot" instead of "SnapPreviewAgent" made a bad mistake. They'll be explaining its non-compliance with robots.txt continuously because of the name.

Jim
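For reference, this is roughly what a robots.txt-compliant client would do before fetching, sketched with Python's standard `urllib.robotparser`. The rules below are a hypothetical robots.txt, not one from any site in this thread, and they only have an effect on clients that actually read the file, which SnapPreviewBot reportedly does not.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out SnapPreviewBot, allow everyone else.
rules = """\
User-agent: SnapPreviewBot
Disallow: /

User-agent: *
Disallow:
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# A compliant client asks before each fetch.
snap_allowed = parser.can_fetch("SnapPreviewBot", "/index.html")
other_allowed = parser.can_fetch("SomeOtherBot", "/index.html")
```

With these rules `snap_allowed` is False and `other_allowed` is True, but a client that never requests robots.txt never sees either answer.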

 
