IPs:
140.239.251.222 - Direct Hit Technologies
65.214.36.53 - Ask Jeeves, Inc.
UA: Mozilla/2.0 (compatible; Ask Jeeves)
They only get the contact and root pages.
I remember that's one of the Direct Hit spiders. However, it is sending a referer string (www.ask.com/), which I thought was strange. Has anybody else seen this?
The point I'm making is that the first two are sending a string in HTTP_REFERER, and the IPs seem to be new to my site as well. The only other spider I've seen that sends an HTTP_REFERER string is picsearch, and that referer is always '-'.
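For anyone who wants to flag this in their own logs, here is a minimal sketch of the check. It assumes a CGI-style environment dictionary, and it assumes that '-' is only a placeholder for an empty referer. The function name and the ua_marker parameter are made up for illustration.

```python
def spider_sent_referer(environ, ua_marker="Ask Jeeves"):
    """Return True if a request from the named spider carries a real referer.

    `environ` is assumed to be a CGI/WSGI-style dict of request variables.
    """
    user_agent = environ.get("HTTP_USER_AGENT", "")
    referer = environ.get("HTTP_REFERER", "").strip()
    # Assumption: "-" is a stand-in for "no referer", so treat it as empty.
    has_referer = referer not in ("", "-")
    return ua_marker in user_agent and has_referer
```

Called with the UA quoted above and a referer of www.ask.com/, this returns True. With a referer of '-' or no referer at all it returns False.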
Here are a few others that send specific HTTP_REFERER strings:
Robozilla (dmoz.org)
Netcraft (the "server software stats" people - try saying that after a few shandies, netcraft.com)
TulipChain (ostermiller.org/tulipchain)
PingaLink (pingalink.com)
There are also some as-yet-unidentified bots which frequently send referrers like "synd.looksmart.co.uk".
Most of the big search engines normally send a string in the HTTP_FROM field (including Google, Lycos, Inktomi, Altavista, Excite, AlltheWeb, NorthernLight).
That is, the bots that make themselves obvious do, in any case.
64.55.148.37-9
64.55.148.43-5
64.55.148.50-4
140.239.251.230
207.204.132.233-4
208.178.104.55
209.67.252.197
209.67.252.199
209.67.252.211-6
216.34.121.18-9
216.34.121.31-4
216.34.121.67
216.34.121.100
216.200.130.20,26,77-9,85-9,200-8,242,244-6,248-9
Not all in active use at present as far as I know.
HTH
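If you want to match hits against a list written in this shorthand, it can be expanded into full addresses. This is a rough sketch that assumes "37-9" means .37 through .39 (the digits after the dash replace the trailing digits of the start) and that commas separate alternatives for the last octet. The function name is hypothetical.

```python
def expand_last_octet(spec):
    """Expand shorthand like '64.55.148.37-9' into a list of full IPs."""
    prefix, _, tail = spec.rpartition(".")
    addresses = []
    for part in tail.split(","):
        start, dash, end = part.partition("-")
        if dash:
            # "77-9" is read as 77..79: the end replaces the last digits of the start.
            end = start[: len(start) - len(end)] + end
        else:
            # A single value such as "242" is a range of one.
            end = start
        for octet in range(int(start), int(end) + 1):
            addresses.append(f"{prefix}.{octet}")
    return addresses
```

For example, expand_last_octet("64.55.148.37-9") gives the three addresses ending in .37, .38 and .39, and a plain address such as "216.34.121.67" comes back as a one-item list.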
'Most of the big search engines normally send a string in the HTTP_FROM field (including Google, Lycos, Inktomi, Altavista, Excite, AlltheWeb, NorthernLight)'
I'm not tracking HTTP_FROM.
Exactly what string do they send? This could be useful, I'd like to log those differently from regular hits.
bob - typically, agents send either a URL or email address in the FROM header. According to the robot guidelines, it is intended as a means of contacting the operator, so I'd say an email address is probably more desirable (provided they bother to maintain it). For example, Excite has "spider@atext.com" and Google has both a URL and an email address.
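To log those hits differently, something along these lines might do. It is only a sketch: it assumes a CGI-style environment dictionary, treats any request with a non-empty HTTP_FROM as a robot, and uses crude string checks to guess whether the value holds an email address, a URL, or both. The function name and the labels it returns are invented for the example.

```python
def describe_from(environ):
    """Label a request by what it sends in the From header.

    `environ` is assumed to be a CGI/WSGI-style dict of request variables.
    """
    value = environ.get("HTTP_FROM", "").strip()
    if not value:
        # No From header: treat as a regular hit.
        return "regular"
    lowered = value.lower()
    # Crude heuristics, not real URL or address validation.
    has_url = "http://" in lowered or "www." in lowered
    has_email = "@" in value
    if has_url and has_email:
        return "robot:url+email"
    if has_url:
        return "robot:url"
    if has_email:
        return "robot:email"
    return "robot:other"
```

With HTTP_FROM set to "spider@atext.com" this returns "robot:email", and a request with no From header returns "regular", so the label can be used to pick a separate log file.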