Some interesting reading in two of the SERPs [google.com]
FWIW, this UA (libwww-perl) is one that almost every beginner blacklists, and nothing I've seen suggests that first impression is wrong.
That link only gave 3 results for me, and the most interesting looking one couldn't be reached.
I thought I had libwww-perl blacklisted, and it turns out I (sort of) did:
SetEnvIfNoCase User-Agent "libwww-perl/" bad_bot
I will be updating the entry for the new version!
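For anyone following along, here is a minimal sketch of how a flag like the one above usually gets turned into an actual block. It assumes Apache 2.2-style access control in .htaccess. The variable name bad_bot comes from the post; the looser pattern and the Order/Allow/Deny lines are assumptions about the rest of the setup, not something quoted from it.

```apache
# Flag any request whose User-Agent contains "libwww-perl" (case-insensitive).
# Assumption: the trailing "/" is dropped so the rule does not depend on
# what follows the name in the User-Agent string.
SetEnvIfNoCase User-Agent "libwww-perl" bad_bot

# Apache 2.2 syntax: allow everyone except requests flagged above.
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

On Apache 2.4 without mod_access_compat, the rough equivalent would be `Require all granted` together with `Require not env bad_bot` inside a `<RequireAll>` block.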
Unless you have particular need to let SOME amazon IPs into your site, block ALL amazon IP ranges. There are a few lists of them in this forum - go for the latest.
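A rough sketch of what a range block looks like in .htaccess, assuming Apache 2.2 syntax. The two ranges below are placeholders from the RFC 5737 documentation blocks, not real Amazon ranges; swap in the current ranges from one of the maintained lists.

```apache
# PLACEHOLDER ranges only (RFC 5737 documentation blocks), NOT Amazon's.
# Replace with the ranges from an up-to-date list.
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24
Deny from 198.51.100.0/24
```

On Apache 2.4 the same idea is written with `Require not ip 192.0.2.0/24` inside a `<RequireAll>` block.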
:: detour to own htaccess ::
Heh. I've got "libwww-perl" commented-out because it's what the Link Checker uses. But it would probably make more sense to un-comment the block (in my case part of a BrowserMatch list) and just restore the # when I actually do check links, which is not often.
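The toggle described above might look something like this. It is only a sketch: the variable name bad_bot is assumed, and it presumes a `Deny from env=bad_bot` (or equivalent) elsewhere in the file.

```apache
# Normal state: the line is active and libwww-perl requests get flagged.
BrowserMatch "libwww-perl" bad_bot

# While running the Link Checker, put the # back in front of the line above:
# BrowserMatch "libwww-perl" bad_bot
```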
|just restore the # when I actually do check links, which is not often. |
Same procedure I use for Xenu.
|That link only gave 3 results for me, and the most interesting looking one couldn't be reached. |
One of the SERPs provided an explanation of how to use the Amazon Cache with WordPress pages to offer links to product catalogs.
FWIW, there were six results in the search. No idea why you saw fewer.
I got six results, but the first two are from the same domain and came up "missing" in the browser, and no. 4 (the least promising of the batch, I was just being thorough) threw a "page load error".
You can get those first two in -- hahaha -- cached versions. But they are not very interesting or useful. The WordPress link seems to work.
|From the image information within the product data, an image for each product is fetched and staged. This is again implemented in perl making use of wget. |
By amazing coincidence I've also got "wget" blocked. Or rather "Wget"; don't know why only that form.
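For what it's worth, GNU Wget's default User-Agent starts with "Wget/", which may be why only the capitalized form is in the list. A sketch of the two options, assuming a BrowserMatch list and a variable named bad_bot (both hypothetical here); you would pick one or the other:

```apache
# Case-sensitive: matches "Wget" but not "wget".
BrowserMatch "Wget" bad_bot

# Case-insensitive alternative: matches "Wget", "wget", "WGET", etc.
BrowserMatchNoCase "wget" bad_bot
```

Either way, the UA string is trivial to change on the client side, so this only catches requests that leave the default in place.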
Yeah, not liking the result about Product Image clouds at ALL!
I'd really hate to have to start watermarking images again. GRRR.