Forum Moderators: Ocean10000 & incrediBILL & keyplyr

Amazon Image Cache/0.5 libwwwperl/5.808

     
3:57 am on Oct 27, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 13, 2010
posts:170
votes: 0


Is this truly "Amazon" scraping my images?

Page:
/thumb.php?img=images/widget.jpg&w=150&h=150

IP:
72.21.217.33

User Agent:
Amazon Image Cache/0.5 libwwwperl/5.808


If so, what are they doing? I'm not an Amazon client.
6:49 pm on Oct 27, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5459
votes: 3


some interesting reading in two of the serps [google.com]

FWIW, this UA (libwwwperl) is one that almost every beginner blacklists, and nothing here suggests that first impression is wrong.
9:26 pm on Oct 27, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 13, 2010
posts:170
votes: 0


wilderness,

That link only gave 3 results for me, and the most interesting looking one couldn't be reached.

I thought I had libwwwperl blacklisted, and it turns out I (sort of) did, but only the hyphenated spelling:
SetEnvIfNoCase User-Agent "libwww-perl/" bad_bot

I will be updating the entry for the new version!
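
For reference, a minimal sketch of an entry that would catch both spellings, assuming the goal is to match the hyphenated "libwww-perl" as well as the unhyphenated "libwwwperl" seen in this UA. The bad_bot variable is the one from the line above; the access-control lines are an assumption using Apache 2.2-style syntax (Apache 2.4 uses Require directives instead).

```apache
# Optional hyphen: matches "libwww-perl/..." and "libwwwperl/..." in any letter case
SetEnvIfNoCase User-Agent "libwww-?perl" bad_bot

# Assumed Apache 2.2-style access control: refuse requests flagged as bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

Leaving the version number out of the pattern means the entry should not need touching when the version changes.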
9:55 pm on Oct 27, 2012 (gmt 0)

Senior Member from GB 

WebmasterWorld Senior Member dstiles is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:May 14, 2008
posts:3145
votes: 4


Unless you have a particular need to let SOME Amazon IPs into your site, block ALL Amazon IP ranges. There are a few lists of them in this forum - go for the latest.
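
A sketch of what such a block could look like in htaccess, assuming Apache 2.2-style syntax. The single address is the one reported in the first post; the CIDR range is a documentation placeholder (192.0.2.0/24), not an Amazon range, and would need to be replaced with entries from a current list.

```apache
Order Allow,Deny
Allow from all
# Address seen in the log entry quoted in the first post
Deny from 72.21.217.33
# Placeholder range only (reserved for documentation); substitute real ranges
Deny from 192.0.2.0/24
```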
10:16 pm on Oct 27, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13448
votes: 390


:: detour to own htaccess ::

Heh. I've got "libwww-perl" commented-out because it's what the Link Checker uses. But it would probably make more sense to un-comment the block (in my case part of a BrowserMatch list) and just restore the # when I actually do check links, which is not often.
10:24 pm on Oct 27, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5459
votes: 3


"just restore the # when I actually do check links, which is not often."


Same procedure I use for Xenu.
3:23 am on Oct 28, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2001
posts:5459
votes: 3


"That link only gave 3 results for me, and the most interesting looking one couldn't be reached."


One of the SERPS provided an explanation of how to use the Amazon Cache with WordPress pages to offer links to product catalogs.

FWIW, there were six results in the search. No idea why you saw fewer.
5:34 am on Oct 28, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:13448
votes: 390


I got six results-- but the first two are from the same domain and came up "missing" in the browser, and no. 4 (the least promising of the batch, I was just being thorough) threw a "page load error".

You can get those first two in -- hahaha -- cached versions. But they are not very interesting or useful. The WordPress link seems to work.
"From the image information within the product data, an image for each product is fetched and staged. This is again implemented in perl making use of wget."

By amazing coincidence I've also got "wget" blocked. Or rather "Wget"; don't know why only that form.
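
Since BrowserMatch is case-sensitive, an entry written as "Wget" matches only that capitalization. A minimal sketch of a case-insensitive alternative, assuming the list sets a variable such as bad_bot (the name used earlier in the thread; the actual variable in this htaccess is not shown):

```apache
# Case-sensitive: matches "Wget" but not "wget"
BrowserMatch Wget bad_bot

# Case-insensitive: matches "Wget", "wget", "WGET"
BrowserMatchNoCase wget bad_bot
```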
1:34 am on Oct 31, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:July 26, 2006
posts:1619
votes: 0


Yeah, not liking the result about Product Image clouds at ALL!
I'd really hate to have to start watermarking images again. GRRR.
 
