


Amazon Image Cache/0.5 libwwwperl/5.808

     
3:57 am on Oct 27, 2012 (gmt 0)



Is this truly "Amazon" scraping my images?

Page:
/thumb.php?img=images/widget.jpg&w=150&h=150

IP:
72.21.217.33

User Agent:
Amazon Image Cache/0.5 libwwwperl/5.808


If so, what are they doing? I'm not an Amazon client.
6:49 pm on Oct 27, 2012 (gmt 0)

wilderness



There's some interesting reading in two of the SERPs [google.com].

FWIW, this UA (libwwwperl) is one that almost every beginner blacklists, and nothing in those results suggests that first impression is wrong.
9:26 pm on Oct 27, 2012 (gmt 0)



wilderness,

That link only gave 3 results for me, and the most interesting looking one couldn't be reached.

I thought I had libwwwperl blacklisted, and it turns out I (sort of) did:
SetEnvIfNoCase User-Agent "libwww-perl/" bad_bot

I will be updating the entry for the new version!
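
For anyone comparing the rule above against the logged UA: the log shows "libwwwperl" with no hyphen, while the pattern expects "libwww-perl/". A minimal sketch of a pattern that tolerates both spellings, assuming Apache 2.2 with mod_setenvif and mod_authz_host (the Order/Allow/Deny lines are an assumption, since the thread does not show how bad_bot is acted on):

```apache
# Sketch only: the optional hyphen ("-?") matches both "libwww-perl" and
# "libwwwperl"; dropping the trailing "/" avoids depending on the version part.
SetEnvIfNoCase User-Agent "libwww-?perl" bad_bot

# Apache 2.2-style access control: refuse any request flagged as bad_bot.
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

On Apache 2.4 the access-control part would use Require directives instead.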
9:55 pm on Oct 27, 2012 (gmt 0)

dstiles



Unless you have a particular need to let SOME Amazon IPs into your site, block ALL Amazon IP ranges. There are a few lists of them in this forum - go for the latest.
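
A rough sketch of what an IP-level block could look like, assuming Apache 2.2 with mod_authz_host. Only 72.21.217.33 comes from this thread; the 192.0.2.0/24 line is a documentation placeholder, not a real Amazon range, and would be replaced with ranges taken from a current list:

```apache
# Sketch only: deny by address instead of by User-Agent.
Order Allow,Deny
Allow from all

# The single address reported in the first post.
Deny from 72.21.217.33

# Placeholder (documentation range), NOT an actual Amazon range.
Deny from 192.0.2.0/24
```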
10:16 pm on Oct 27, 2012 (gmt 0)

lucy24



:: detour to own htaccess ::

Heh. I've got "libwww-perl" commented-out because it's what the Link Checker uses. But it would probably make more sense to un-comment the block (in my case part of a BrowserMatch list) and just restore the # when I actually do check links, which is not often.
10:24 pm on Oct 27, 2012 (gmt 0)

wilderness



just restore the # when I actually do check links, which is not often.


Same procedure I use for Xenu.
3:23 am on Oct 28, 2012 (gmt 0)

wilderness



That link only gave 3 results for me, and the most interesting looking one couldn't be reached.


One of the SERPs provided an explanation of how to use the Amazon Cache with WordPress pages to offer links to product catalogs.

FWIW, there were six results in the search. No idea why you saw fewer.
5:34 am on Oct 28, 2012 (gmt 0)

lucy24



I got six results-- but the first two are from the same domain and came up "missing" in the browser, and no. 4 (the least promising of the batch, I was just being thorough) threw a "page load error".

You can get those first two in -- hahaha -- cached versions. But they are not very interesting or useful. The WordPress link seems to work.
From the image information within the product data, an image for each product is fetched and staged. This is again implemented in perl making use of wget.

By amazing coincidence I've also got "wget" blocked. Or rather "Wget"; don't know why only that form.
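
On the "only that form" point: BrowserMatch is case-sensitive, so a "Wget" entry will not catch "wget". A minimal sketch of the case-insensitive variant, assuming mod_setenvif and borrowing the bad_bot variable name from the rule posted earlier in the thread (the actual variable in the list described here is not shown):

```apache
# Sketch only: BrowserMatchNoCase matches the User-Agent without regard to
# case, so "Wget", "wget" and "WGET" all set the variable.
BrowserMatchNoCase "wget" bad_bot
```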
1:34 am on Oct 31, 2012 (gmt 0)




Yeah, not liking the result about Product Image clouds at ALL!
I'd really hate to have to start watermarking images again. GRRR.
 
