It didn't request robots.txt. Not sure if it needs to. I don't even know why Amazon would be interested in any of my websites. I don't sell anything online.
There are some old threads on this.
One group participant suggested to me that "they need to get their reference material and provide links to same and it may as well be my pages" (or something very similar).
After reading his words a couple of times, I agreed and accepted his explanation as a benefit for driving traffic to my pages.
Don
What I still don't understand though is why they would use such a nondescript user agent? It gives the appearance of trying to be a snoop of sorts. At least that's how I see it. What about you?
[google.com...]
Here's the other member's comment, which I agree with:
[webmasterworld.com...]
We (participants in this forum) see some log entries (which most webmasters don't even read) that are dubious and somewhat unethical in how they appear or present themselves.
I guess it's the same for visitors.
A visitor starts out surfing the WWW, grabbing everything in sight as they become fascinated with the internet.
There's no rule book to advise this person on integrity in his actions and IF there was such an FAQ, he/she wouldn't pause to read it anyway.
When these types of visitors are denied access by some htaccess option, they haven't a clue why. Neither are they even aware that they have done anything wrong. They violated our code of ethics; however, they haven't committed a crime (at least not a felony or some other serious ethical breach that would interrupt their activities or lives).
My entire point is that these bots don't believe they are doing anything wrong or mischievous.
As much as we rant and rave, the pests and their methods will continue.
I'm for looking for (and accepting) new ways to make these methods work to our benefit.
Not sure how?
I really see no harm in Amazon, unless they take to deep-linking to images or framing my pages!
Rather, the exposure of my content on their pages may only increase my traffic.
Don
1.) Amazon owns A9.com and Alexa.com search engines. I don't have a site-related connection to anything Amazonian, and block A9 and Alexa -- yet I see these kinds of hits on a regular basis --
iad-fw-global.amazon.com - - [05/Jun/2006:19:14:30 -0700] "GET /siteinfo.xml HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [05/Jun/2006:19:22:53 -0700] "GET /siteinfo.xml HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [05/Jun/2006:19:23:01 -0700] "GET /siteinfo.xml HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [05/Jun/2006:19:38:50 -0700] "GET /siteinfo.xml HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [05/Jun/2006:19:39:10 -0700] "GET /siteinfo.xml HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [15/Jun/2006:12:21:50 -0700] "GET /siteinfo.xml HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [17/Jun/2006:15:04:45 -0700] "GET /favicon.ico HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [17/Jun/2006:16:16:28 -0700] "GET /favicon.ico HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [18/Jun/2006:09:06:10 -0700] "GET /favicon.ico HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [19/Jun/2006:06:45:40 -0700] "GET /favicon.ico HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [19/Jun/2006:08:27:50 -0700] "GET /favicon.ico HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
iad-fw-global.amazon.com - - [27/Jun/2006:20:22:56 -0700] "GET /siteinfo.xml HTTP/1.1" 403 815 "-" "Java/1.5.0_06"
(I block all Java user agents, thus the 403s.)
Nary a robots.txt request from "iad-fw-global.amazon.com" EVER. And I don't have, never have had "siteinfo.xml" -- so why keep looking for it? Practically a ping...
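For anyone who wants to run the same check on their own logs, here is a rough Python sketch that reports which hosts never asked for robots.txt. It assumes the log is in Apache combined format, like the entries above. The function names (`summarize`, `hosts_without_robots`) are made up for illustration and are not part of any tool.

```python
import re
from collections import defaultdict

# Apache "combined" log format, matching the entries quoted in this thread.
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def summarize(lines):
    """Map each host to its per-path request counts and a robots.txt flag."""
    summary = defaultdict(
        lambda: {"paths": defaultdict(int), "asked_robots": False}
    )
    for line in lines:
        m = LOG_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not in combined format
        entry = summary[m["host"]]
        entry["paths"][m["path"]] += 1
        if m["path"] == "/robots.txt":
            entry["asked_robots"] = True
    return summary

def hosts_without_robots(lines):
    """Hosts that made at least one request but never fetched /robots.txt."""
    return sorted(
        host for host, entry in summarize(lines).items()
        if not entry["asked_robots"]
    )
```

Feed it the lines of an access log (for example `hosts_without_robots(open(path))`, where `path` is wherever your server writes its log). The `paths` counts from `summarize` also show repeated requests for a file that does not exist, such as /siteinfo.xml here.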
2.) Coincidentally, apparently 'real' users look like the following. The first is a Firefox check for robots.txt (probably the Fastfox extension), then ~90 minutes later, the exact same IP (.xyz = obfuscated) but a different version of FF, browsing via a Google search:
207-171-180-xyz.amazon.com - - [20/Jun/2006:13:19:46 -0700] "GET /robots.txt HTTP/1.1" 200 11453 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4"
207-171-180-xyz.amazon.com - - [30/Jun/2006:14:52:11 -0700] "GET /dir1/file1.html HTTP/1.1" 200 65337 "http://www.google.com/search?hl=en&sa=X&oi=spell&resnum=0&ct=result&cd=1&q=keyword1+keyword2&spell=1" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2"
207-171-180-xyz.amazon.com - - [30/Jun/2006:14:52:11 -0700] "GET /dir2/file1.gif HTTP/1.1" 200 35736 "http://www.example.com/dir1/file1.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2"
207-171-180-xyz.amazon.com - - [30/Jun/2006:14:52:12 -0700] "GET /dir2/file2.gif HTTP/1.1" 200 43 "http://www.example.com/dir1/file1.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2"
207-171-180-xyz.amazon.com - - [30/Jun/2006:14:52:12 -0700] "GET /dir2/file3.gif HTTP/1.1" 200 9097 "http://www.example.com/dir1/file1.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2"
207-171-180-xyz.amazon.com - - [30/Jun/2006:14:52:12 -0700] "GET /dir2/file4.gif HTTP/1.1" 200 3627 "http://www.example.com/dir1/file1.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2"
207-171-180-xyz.amazon.com - - [30/Jun/2006:14:52:12 -0700] "GET /dir2/file5.gif HTTP/1.1" 200 43 "http://www.example.com/dir1/file1.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2"
207-171-180-xyz.amazon.com - - [30/Jun/2006:14:52:12 -0700] "GET /dir2/file6.gif HTTP/1.1" 200 6079 "http://www.example.com/dir1/file1.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2"
207-171-180-xyz.amazon.com - - [30/Jun/2006:14:52:13 -0700] "GET /favicon.ico HTTP/1.1" 200 2238 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2"
3.) Point is, not all Amazon hits are bots, and not all Amazon bots are bad, BUT the top set of hits ARE Amazon and they ARE bad.
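If you want to separate the two kinds of hits in your own logs, a crude way is to split on the User-Agent field. This is only a heuristic sketch: the prefix list is my own assumption (only "Java/" comes from the logs above), the function names are invented for illustration, and any agent string can be spoofed.

```python
# Agent prefixes that usually indicate an HTTP library rather than a browser.
# Only "Java/" appears in this thread; the others are assumed examples.
LIBRARY_AGENT_PREFIXES = ("Java/", "libwww-perl", "Python-urllib", "curl/", "Wget/")

def user_agent(line):
    """Return the last quoted field of a combined-format log line."""
    parts = line.rstrip().rsplit('"', 2)
    return parts[1] if len(parts) == 3 else ""

def looks_automated(agent):
    """True for empty agents or ones that start with a known library prefix."""
    return agent in ("", "-") or agent.startswith(LIBRARY_AGENT_PREFIXES)

def split_hits(lines):
    """Split log lines into (automated, browser_like) by user agent."""
    automated, browser_like = [], []
    for line in lines:
        if looks_automated(user_agent(line)):
            automated.append(line)
        else:
            browser_like.append(line)
    return automated, browser_like
```

A browser-like agent proves nothing by itself, so treat the second list as "worth a closer look" rather than "definitely human".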
P.S.
Tinyurl.com is great for super-wide, side-scrolling URLs.