| 11:02 pm on Jan 22, 2007 (gmt 0)|
There are two earlier threads about Snapbot (I assume it is the same thing with a minor name change):
I have the whole 38.* class A range denied in .htaccess - nothing good ever comes from there.
[edited by: Mokita at 11:06 pm (utc) on Jan. 22, 2007]
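For anyone who wants to apply the same range check in a script (say, when filtering logs) rather than in .htaccess, here is a minimal Python sketch. It assumes "the whole of the 38. class" means the 38.0.0.0/8 block; the function name `is_blocked` is made up for illustration.

```python
import ipaddress

# Assumption: the "38. class" from the post is the 38.0.0.0/8 block.
BLOCKED_NETWORK = ipaddress.ip_network("38.0.0.0/8")


def is_blocked(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside the blocked 38.* range."""
    try:
        return ipaddress.ip_address(remote_addr) in BLOCKED_NETWORK
    except ValueError:
        # Not a parseable IP address; treat it as not blocked.
        return False
```

Blocking a whole /8 is a blunt instrument, so a check like this is probably more useful for reviewing what would be caught before denying the range outright.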
| 11:29 pm on Jan 22, 2007 (gmt 0)|
Bunch of old threads on this provider:
| 12:50 am on Jan 23, 2007 (gmt 0)|
My logs show the first time they used "SnapPreviewBot" in the UA was 01/05/2007.
| 1:01 am on Jan 23, 2007 (gmt 0)|
Thanks for the links to the previous discussions. I always search the forums before posting, and this is what I got: "Your search - SnapPreviewBot - did not match any documents."
| 6:55 am on Jan 23, 2007 (gmt 0)|
That's because it's a NEW name in the user agent; see my post above yours.
| 2:30 am on Feb 8, 2007 (gmt 0)|
Re SnapPreviewBot: I discovered that someone linking to my site from their WordPress blog had a plugin installed that pops up a window on link mouseover. The popup contains a preview of the target web page, generated by snap.com on the fly (presumably they cache the snapshot for a while).
I could see in my logs hits coming from 38.* IP addresses with the user agent "Mozilla/5.0 (compatible; SnapPreviewBot; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9". The page and all graphics and stylesheets were downloaded, including favicon.ico, but robots.txt wasn't requested.
So it doesn't seem to be a standard spider as far as it doesn't actively crawl a page until someone tries to view the preview image.
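If you want to confirm the same pattern in your own logs, something like the sketch below might help. It assumes the Apache "combined" log format and matches only on the "SnapPreviewBot" token in the user agent; the function name `summarize_preview_hits` and the regex are my own and not anything published by Snap.

```python
import re

# Assumption: Apache "combined" log format; adjust the pattern for other formats.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_preview_hits(lines, token="SnapPreviewBot"):
    """Collect the paths requested by user agents containing `token`
    and report whether /robots.txt was among them."""
    paths = []
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and token in match.group("agent"):
            paths.append(match.group("path"))
    return {"paths": paths, "fetched_robots_txt": "/robots.txt" in paths}
```

Feed it the lines of an access log (for example `summarize_preview_hits(open("access.log"))`, where the file name is a placeholder) and look at `fetched_robots_txt` to see whether the agent ever asked for robots.txt.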
| 12:55 am on Feb 9, 2007 (gmt 0)|
|So it doesn't seem to be a standard spider as far as it doesn't actively crawl a page until someone tries to view the preview image. |
Correct - but its brother, Snapbot, does actively crawl pages - that is how they are indexed by Snap in the first place.
| 2:21 am on Feb 9, 2007 (gmt 0)|
I'm not happy they capture my content/images... but it appears to be a legit search service and toolbar. I haven't seen many referrals yet, however.
| 2:48 am on Feb 9, 2007 (gmt 0)|
Whoever named this "SnapPreviewBot" instead of "SnapPreviewAgent" made a bad mistake. They'll be explaining its non-compliance with robots.txt continuously because of the name.