Does amazon have a crawler or are all Amazon AWS IPs fair game? bigtoga
There is no valid reason for anyone to hit my site from an Amazon AWS box. I'd love to just block the whole of Amazon AWS/etc. straight at the firewall, but I'm worried that I'd also block an Amazon crawler/spider, which could hurt the sales I do on Amazon. Anyone have any suggestions/links for this sort of thing? I want to allow the actual Amazon company, if it has a crawler, to browse the site, but block the AWS customers who spin up a server and then scrape/spam with it.
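For anyone wanting to test an address before blocking it, here is a minimal sketch in Python. It assumes the layout of the IP range list that AWS publishes as ip-ranges.json (a "prefixes" list whose entries carry "ip_prefix", "region" and "service"). The SAMPLE data and the helper names aws_matches and is_ec2 are made up for illustration, and the addresses are documentation ranges, not real AWS ones. It only tells you whether an IP sits in a listed range; it says nothing about whether Amazon the retailer crawls from somewhere else.

```python
import ipaddress

# Made-up sample in the shape of AWS's published ip-ranges.json.
# The prefixes are documentation ranges, not real AWS addresses.
SAMPLE = {
    "prefixes": [
        {"ip_prefix": "203.0.113.0/24", "region": "us-east-1", "service": "AMAZON"},
        {"ip_prefix": "203.0.113.0/25", "region": "us-east-1", "service": "EC2"},
        {"ip_prefix": "198.51.100.0/24", "region": "eu-west-1", "service": "AMAZON"},
    ]
}


def aws_matches(ip, ranges):
    """Return (service, prefix) pairs from `ranges` whose prefix contains `ip`."""
    addr = ipaddress.ip_address(ip)
    hits = []
    for entry in ranges.get("prefixes", []):
        net = ipaddress.ip_network(entry["ip_prefix"])
        if addr in net:
            hits.append((entry["service"], entry["ip_prefix"]))
    return hits


def is_ec2(ip, ranges):
    """True if `ip` falls inside any prefix tagged with the EC2 service."""
    return any(service == "EC2" for service, _ in aws_matches(ip, ranges))


if __name__ == "__main__":
    for ip in ("203.0.113.10", "203.0.113.200", "192.0.2.1"):
        print(ip, aws_matches(ip, SAMPLE), is_ec2(ip, SAMPLE))
```

In practice you would load the real file with json.load instead of SAMPLE. Looking at the service tag rather than blocking every listed prefix is one way to target rented servers only, but treat that as a starting point and check your own logs first.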
Pfui has a long and dedicated thread [webmasterworld.com] bigtoga
Yes, I've seen that - thank you. I'm not sure though whether Amazon has its own crawler? wilderness
Nor am I. Unfortunately, their hosting customers have a proven record of abuse, as does Amazon AWS in accepting these customers. Perhaps the Amazon FAQ (NOT Amazon AWS) provides the answer. The easiest explanation is in your raw visitor logs: look at the image requests that reference your own Amazon pages. What are those IPs? Simply separate them from the Amazon AWS IPs. keyplyr
It used to be called "A1", but I haven't seen that UA for a while. Then there were versions of "AWSpider" (AWSpider 0.3.2.12 last hit my logs in 2011).
I've busted several Amazon "bots" scraping our site for images... makes me wonder if they are actually stealing product images from sites for their own use. wilderness
Bots don't exactly leave resumes ;) Harvesting, plagiarizing or simply indexing, who knows the why? The AWS customers hit us all; that's why the long threads exist. keyplyr
"The AWS customers hit us all, that's why the long threads exist." Yes, but we're discussing "Amazon" bots. wilderness
My bad, hope it's just a full moon ;)
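The raw-log check suggested earlier in the thread can be sketched roughly like this in Python. It assumes your server writes the common "combined" log format. The sample lines, the IPs and the helper name ips_by_agent are made up for illustration, and the UA text is only the string quoted in this thread. User agents are trivially spoofed, so a match is a hint about which IPs to look at, not proof of who owns them.

```python
import re
from collections import defaultdict

# Matches the "combined" access log format (an assumption about your server).
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<request>[^"]*)" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"'
)

# Made-up log lines using documentation IP ranges.
SAMPLE_LINES = [
    '203.0.113.10 - - [10/Oct/2011:13:55:36 +0000] "GET /img/a.jpg HTTP/1.1" 200 2326 "-" "AWSpider 0.3.2.12"',
    '198.51.100.7 - - [10/Oct/2011:13:56:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '203.0.113.11 - - [10/Oct/2011:13:57:12 +0000] "GET /img/b.jpg HTTP/1.1" 200 1400 "-" "AWSpider 0.3.2.12"',
]


def ips_by_agent(lines, needle):
    """Map each user agent containing `needle` (case-insensitive) to its sorted IPs."""
    found = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        if m and needle.lower() in m.group("ua").lower():
            found[m.group("ua")].add(m.group("ip"))
    return {ua: sorted(ips) for ua, ips in found.items()}


if __name__ == "__main__":
    for ua, ips in ips_by_agent(SAMPLE_LINES, "awspider").items():
        print(ua, ips)
```

Once you have the list of IPs per user agent, you can compare them against the AWS ranges and decide which ones, if any, deserve an exception at the firewall.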