Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
SnapPreviewBot
keyplyr




msg:3227320
 7:51 am on Jan 22, 2007 (gmt 0)

38.98.19.## - - [21/Jan/2007:03:54:12 -0500] "GET / HTTP/1.1" 200 3816 "http://www.apassion4jazz.net/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7 SnapPreviewBot"

Took every file referenced from index.html, including images, scripts, and CSS. No request for robots.txt. The last octet of the IP address (the "D class") changed on each request.
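
For anyone who wants to pull these hits out of their own logs, here is a rough Python sketch that parses a line like the one above and checks the user agent. It assumes the standard Apache "combined" log format; the names `LOG_PATTERN`, `parse_line` and `is_snap_preview` are made up for illustration and are not part of any tool mentioned in this thread.

```python
import re

# Assumed layout: Apache "combined" log format
# ip ident user [time] "request" status size "referer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_snap_preview(entry):
    """Case-insensitive check for the SnapPreviewBot token in the user agent."""
    return "snappreviewbot" in entry["agent"].lower()


# The log line quoted in the post above (IP masked as in the original).
sample = (
    '38.98.19.## - - [21/Jan/2007:03:54:12 -0500] "GET / HTTP/1.1" 200 3816 '
    '"http://www.apassion4jazz.net/" "Mozilla/5.0 (X11; U; Linux i686; en-US; '
    'rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7 SnapPreviewBot"'
)

entry = parse_line(sample)
if entry is not None and is_snap_preview(entry):
    print(entry["ip"], entry["request"], entry["status"])
```

Matching on the user agent alone only catches requests that announce themselves; anything that sends a different string will slip through.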

 

Mokita




msg:3228153
 11:02 pm on Jan 22, 2007 (gmt 0)

There are two earlier threads about Snapbot (I assume it is the same thing with a minor name change):

[webmasterworld.com...]
[webmasterworld.com...]

I have the whole 38.* range (38.0.0.0/8) denied in .htaccess - nothing good ever comes from there.
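
To illustrate what a blanket deny on that range covers, here is a small Python sketch of the same prefix test. It is not the .htaccess rule itself, only the equivalent check written with the standard `ipaddress` module; `BLOCKED` and `is_blocked` are hypothetical names, and 38.0.0.0/8 is my reading of "the whole of the 38. class".

```python
import ipaddress

# Assumption: "the whole 38. range" means every address starting with 38.
BLOCKED = ipaddress.ip_network("38.0.0.0/8")


def is_blocked(ip_text):
    """Return True if ip_text is a valid address inside the blocked range."""
    try:
        return ipaddress.ip_address(ip_text) in BLOCKED
    except ValueError:
        # Masked or malformed addresses (e.g. "38.98.19.##") are not matched.
        return False


for candidate in ("38.98.19.7", "39.0.0.1", "38.98.19.##"):
    print(candidate, is_blocked(candidate))
```

A range this wide will also turn away any ordinary visitors who happen to be on that provider, so it is a trade-off rather than a free win.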

[edited by: Mokita at 11:06 pm (utc) on Jan. 22, 2007]

wilderness




msg:3228184
 11:29 pm on Jan 22, 2007 (gmt 0)

keyplyr,

Bunch of old threads on this provider:

[google.com...]

incrediBILL




msg:3228246
 12:50 am on Jan 23, 2007 (gmt 0)

My logs show the first time they used "SnapPreviewBot" in the UA was 01/05/2007

keyplyr




msg:3228254
 1:01 am on Jan 23, 2007 (gmt 0)

Thanks for the links to previous discussions. I always search the forums prior to posting, and this is what I got: 'Your search - "SnapPreviewBot" - did not match any documents.'

incrediBILL




msg:3228411
 6:55 am on Jan 23, 2007 (gmt 0)

That's because it's a NEW name in the user agent; see my post above yours.

abates




msg:3245939
 2:30 am on Feb 8, 2007 (gmt 0)

Re SnapPreviewBot: I discovered that someone linking to my site on their WordPress blog had some sort of plugin installed which popped up a window on link mouseover. This popup window contained a preview of the target web page, generated by snap.com on the fly (presumably they cache the snapshot for a while).

I could see in my logs hits coming from 38.* IP addresses with the user agent "Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9". The page and all graphics and stylesheets were downloaded, including favicon.ico, but robots.txt wasn't requested.

So it doesn't seem to be a standard spider, in that it doesn't actively crawl a page until someone tries to view the preview image.
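
If you want to spot this pattern (page and assets fetched, robots.txt never requested) in your own logs, something along these lines might do as a starting point. It is only a sketch: it assumes you have already reduced each log entry to a (user agent, path) pair, and `agents_skipping_robots` and the sample hits are invented for the example, not taken from real logs.

```python
from collections import defaultdict


def agents_skipping_robots(hits):
    """Return a sorted list of user agents that never requested /robots.txt.

    hits is an iterable of (user_agent, path) pairs.
    """
    paths_by_agent = defaultdict(set)
    for agent, path in hits:
        paths_by_agent[agent].add(path)
    return sorted(
        agent
        for agent, paths in paths_by_agent.items()
        if "/robots.txt" not in paths
    )


# Hypothetical, hand-written sample data.
sample_hits = [
    ("SnapPreviewBot", "/"),
    ("SnapPreviewBot", "/style.css"),
    ("SnapPreviewBot", "/favicon.ico"),
    ("ExampleCrawler", "/robots.txt"),
    ("ExampleCrawler", "/"),
]

print(agents_skipping_robots(sample_hits))
```

Ordinary browsers never ask for robots.txt either, so the result needs to be read alongside the user agent string and the IP range.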

Mokita




msg:3247060
 12:55 am on Feb 9, 2007 (gmt 0)

So it doesn't seem to be a standard spider, in that it doesn't actively crawl a page until someone tries to view the preview image.

Correct - but its brother, Snapbot, does actively crawl pages - that is how they are indexed by Snap in the first place.

keyplyr




msg:3247110
 2:21 am on Feb 9, 2007 (gmt 0)

I'm not happy they capture my content/images... but it appears to be a legit search service and toolbar. Haven't seen many referrals yet, however.

jdMorgan




msg:3247126
 2:48 am on Feb 9, 2007 (gmt 0)

Whoever named this "SnapPreviewBot" instead of "SnapPreviewAgent" made a bad mistake. They'll be continuously explaining its non-compliance with robots.txt because of the name.

Jim

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved