Some IPs
208.139.196.241
195.166.232.153
172.138.54.66 -> AC8A3642.ipt.aol.com
216.232.242.23 -> bkgk47gpy1ql.bc.hsia.telus.net
209.214.144.134 -> host-209-214-144-134.msy.bellsouth.net
208.12.29.237 -> host-29-237.dsl-sea.seanet.com
172.167.154.111 -> ACA79A6F.ipt.aol.com
209.86.200.206 -> user-38ldi6e.dialup.mindspring.com
63.102.245.89
66.47.165.130 -> user-112v9c2.biz.mindspring.com
63.102.245.89
66.26.167.26
12.84.111.118 -> 118.chicago-08rh16rt.il.dial-access.att.net
66.26.167.26 -> ilm26-167-026.ec.rr.com
208.12.29.237 -> host-29-237.dsl-sea.seanet.com
Obviously, they are covering their tracks here. They are either going through dialups or proxies. Perhaps the only way to track these guys down would be to contact the ISPs.
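If contacting the ISPs is the route, it helps to see which providers turn up most often. Here is a rough Python sketch that tallies the resolved hostnames from a list like the one above by their last two labels. The names `SAMPLE`, `provider_of` and `tally` are made up for illustration, and the last-two-labels rule is an assumption that breaks on names such as x.net.au.

```python
from collections import Counter

# A few lines copied from the list above; unresolved IPs have no "->" part.
SAMPLE = """\
172.138.54.66 -> AC8A3642.ipt.aol.com
216.232.242.23 -> bkgk47gpy1ql.bc.hsia.telus.net
172.167.154.111 -> ACA79A6F.ipt.aol.com
209.86.200.206 -> user-38ldi6e.dialup.mindspring.com
63.102.245.89
66.47.165.130 -> user-112v9c2.biz.mindspring.com
"""


def provider_of(line):
    """Return the last two labels of the hostname, or 'unresolved'."""
    _ip, sep, host = line.partition("->")
    host = host.strip().lower()
    if not sep or not host:
        return "unresolved"
    # Naive assumption: fine for .com/.net names, wrong for x.net.au style names.
    return ".".join(host.split(".")[-2:])


def tally(text):
    """Count entries per provider for a block of 'IP -> hostname' lines."""
    return Counter(provider_of(line) for line in text.splitlines() if line.strip())


if __name__ == "__main__":
    for provider, hits in tally(SAMPLE).most_common():
        print(provider, hits)
```

This only groups what reverse DNS already returned. It says nothing about who is actually behind the requests.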
The referrer is:
[iaea.org...] -> there is a very good chance it is completely unrelated to the bot. My guess is that it is being used to track the buzz about the spider activity, hence the weird referrer (which probably means they will be reading this). It may also be there to circumvent referrer-based cloaking.
There is also a very good chance they are template sniffing.
UA -> Mozilla/3.0
uc.nombres.ttd.es (212.170.181.xxx) - Other Agent (Unknown Platform)
[iaea.org...]
01 May -- 03:34:06 -- -- Code 404 Not Found
unresolved (148.78.254.xxx) - Other Agent (Unknown Platform)
[iaea.org...]
20 Apr -- 22:04:20 -- -- /
qld.bigpond.net.au (61.9.208.xxx) - Other Agent (Unknown Platform)
[iaea.org...]
27 Mar -- 19:20:45 -- -- Code 404 Not Found
[iaea.org...]
27 Mar -- 19:55:41 -- -- Code 404 Not Found
qld.bigpond.net.au (61.9.208.xxx) - Other Agent (Unknown Platform)
[iaea.org...]
25 Mar -- 16:50:49 -- -- Code 404 Not Found
ipt.aol.com (172.173.82.xxx) - Other Agent (Unknown Platform)
[iaea.org...]
25 Mar -- 12:39:29 -- -- /
unresolved (209.58.116.xxx) - Other Agent (Unknown Platform)
[iaea.org...]
23 Mar -- 06:07:47 -- -- Code 404 Not Found
internetconnect.net (64.148.19.xxx) - Other Agent (Unknown Platform)
[iaea.org...]
21 Mar -- 13:03:20 -- 00:10 -- /
[iaea.org...]
21 Mar -- 13:03:30 -- 00:37 -- Code 404 Not Found
[iaea.org...]
21 Mar -- 13:04:07 -- 04:49 -- Code 404 Not Found
[iaea.org...]
21 Mar -- 13:08:56 -- 00:27 -- Code 404 Not Found
[iaea.org...]
21 Mar -- 13:09:23 -- -- Code 404 Not Found
[iaea.org...]
21 Mar -- 14:16:28 -- 00:15 -- /
[iaea.org...]
21 Mar -- 14:16:43 -- 00:30 -- Code 404 Not Found
[iaea.org...]
21 Mar -- 14:17:13 -- 04:26 -- Code 404 Not Found
[iaea.org...]
21 Mar -- 14:21:39 -- 00:32 -- Code 404 Not Found
[iaea.org...]
21 Mar -- 14:22:11 -- -- Code 404 Not Found
[iaea.org...]
21 Mar -- 14:55:25 -- 00:18 -- /
[iaea.org...]
21 Mar -- 14:55:43 -- 00:30 -- Code 404 Not Found
[iaea.org...]
21 Mar -- 14:56:13 -- 04:08 -- Code 404 Not Found
[iaea.org...]
21 Mar -- 15:00:21 -- 00:30 -- Code 404 Not Found
[iaea.org...]
21 Mar -- 15:00:51 -- -- Code 404 Not Found
dsl.gtei.net (4.40.145.xxx) - Other Agent (Unknown Platform)
[iaea.org...]
17 Mar -- 22:02:05 -- -- /
[iaea.org...]
17 Mar -- 22:37:24 -- -- /
There are two of them, one at 1061 and another at 1084.
Edited by: littleman
CEA.fr is, you guessed it, Commissariat à l'Energie Atomique. Sound familiar?
Question: is it possible to use wildcards in UAs in robots.txt? With so many variations of Larbin around, I would like to do:
User-agent: *larbin*
Disallow: /
Does that work? It validates. (I'm on IIS, so I cannot use .htaccess.)
Onya
Woz
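On the wildcard question, one way to get a feel for it is to feed both forms to a robots.txt parser and see what it does. The sketch below uses Python's standard-library `urllib.robotparser`, which, as far as I know, treats a lone `*` as "everyone" and otherwise does a case-insensitive substring match on the agent name. The helper `can_fetch` and the UA strings in `AGENTS` are made up for illustration. This shows how one parser reads the file, not how any particular Larbin build behaves, or whether it reads robots.txt at all.

```python
from urllib.robotparser import RobotFileParser


def can_fetch(robots_txt, user_agent, url="/"):
    """Parse robots.txt text and ask whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The form from the question, with asterisks around the name.
WILDCARD = "User-agent: *larbin*\nDisallow: /\n"
# The plain form, relying on the parser's own substring matching.
PLAIN = "User-agent: larbin\nDisallow: /\n"

# Hypothetical UA strings, invented for this example.
AGENTS = ["larbin_2.6.3", "Larbin/1.0", "Mozilla/3.0"]

if __name__ == "__main__":
    for ua in AGENTS:
        print(ua, "wildcard:", can_fetch(WILDCARD, ua), "plain:", can_fetch(PLAIN, ua))
```

With this parser the asterisks are taken literally, so the plain `User-agent: larbin` line is the one that catches the variants. Either way, robots.txt only helps against spiders that choose to obey it.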