137.208.51.20 Mozilla 7.1
- this one is spidering the entire site - takes several files at a time. not looked at robots.txt. check title bar on 137.208.51.20.
66.149.134.112 AnswerChase PROve 3.0
- browser or spider? only looked at root.
217.157.141.149 IE 5.5 Compatible Browser
- browser or spider? IP gives log in boxes. UA looks dodgy. has looked at root twice in a month without looking at anything else.
65.10.243.48 Java1.2.2
- IP shows Server Error. has only looked at root.
24.114.226.88 TulipChain/4.0 (http://ostermiller.org/tulipchain/) Java/1.3.1 (http://java.sun.com/) Windows_2000/5.1
- appears to be a browser. looks legitimate, but is it?
134.126.252.244 Xenu Link Sleuth 1.2a
- never seen this one before
130.225.20.8 Marvin v0.5
- no idea what this is
212.187.171.147 Crawler
- looked at robots.txt
63.173.190.16 Mozilla/4.7 (compatible; WhizBang)
- checked for robots.txt. also had OPTIONS showing in place of GET or HEAD. took every page of the site.
63.212.171.162 Sqworm/2.9.85-BETA (beta_release; 20011115-775; i686-pc-linux-gnu)
64.210.196.195 -
- checked robots.txt. no user agent.
195.141.85.142 search.ch V1.4.2 (spiderman@search.ch; [search.ch)...]
- checked robots.txt
That's a privately operated tool for ODP editors that helps them maintain their categories: checking for duplicates, finding broken links, etc. The operation is semi-automatic, i.e. there's always a human editor triggering the process. If you want that editor to take another close and critical look at your site, go ahead and block TulipChain from your site... ;)
130.225.20.8 Marvin v0.5
- no idea what this is
That's Northern Light's crawler. Be nice to him, he suffers from depression.
there are a few more to come as and when i get time to dig through log files. thanks to all for digging up the info about the ones i've found so far. is anyone maintaining an up to date spider list with the kind of information about them that you guys are posting here?
"Mozilla/4.0 (compatible; MSIE 5.0; Windows 95) TrueRobot; 1.5"
195.101.94.208
195.101.94.0 - 195.101.94.255
netname: FR-ECHO
descr: Socite ECHO
country: FR
admin-c: CR308-RIPE
tech-c: CR308-RIPE
status: ASSIGNED PA
notify: addr-reg@rain.fr
mnt-by: RAIN-TRANSPAC
changed: noc@rain.fr 19970514
source: RIPE
Additionally, I have some Road Runner IPs denied.
Today an RR user was denied and within five seconds came back under 216.144.64.?
It must have been an automatic retry by the server, as the user tried again a few minutes later and the retry to the 216 address wouldn't work.
found this in my logs. what exactly is this?
a spammer attempting to gain email addresses or what?
i note that there is a similar post here regarding this, and it's 1 ip off.
if anyone can give me an insight into this i'd appreciate it.
also, on another note.
our website has not yet gone live, yet it is inundated with hits from rr.com addresses. in the month of april it has had 900 hits (2 days!)
why and what do they want? spammers in action, or rr users scanning for boxen to exploit...?
This link explains Sqworm:
[webmasterworld.com...]
That IP, 63.212.171.171, is effectively the same as the one you specify (63.212.171.163); it is not unusual for the last block to vary.
That IP is part of Level 3;
Level 3 Communications, Inc. (NETBLK-LEVEL4-CIDR)
1450 Infinite Drive
Louisville, CO 80027
US
Netname: LEVEL4-CIDR
Netblock: 63.208.0.0 - 63.215.255.255
Maintainer: LVLT
RR.com = Road Runner.
Road Runner does have their own search engine.
However you might keep in mind that all of the following are associated :-(
About.com, Global Crossing, RR and Thunderstone.
If a particular IP from RR is hitting your pages and you do not want it to, just deny it access in your .htaccess file.
Use the first three blocks of that IP and end the third block with "." which will deny the entire 0-255 range of the fourth block.
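For anyone who has not written one of these before, here is a rough sketch of what such a deny rule might look like. It assumes the classic Order/Allow/Deny access directives and that your host lets .htaccess override access settings. The 192.0.2. prefix is only a placeholder from the documentation address range, not one of the IPs in this thread.

```apache
# Sketch only: classic Order/Allow/Deny access control in .htaccess.
# 192.0.2. is a placeholder prefix; substitute the first three blocks
# of the address you actually want to keep out.
Order Allow,Deny
Allow from all
# The trailing "." leaves the fourth block open, so every address
# from 192.0.2.0 through 192.0.2.255 is refused.
Deny from 192.0.2.
```

With Order Allow,Deny, a request that matches both an Allow and a Deny line is refused, so everyone gets in except the listed prefix. Newer Apache 2.4 setups generally express the same thing with Require directives (Require all granted plus Require not ip inside a RequireAll block), so check which version your host runs before copying this.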
According to:
[clearwaterbeachcam.com...]
SQworm is an:
AOL Search / Pacific Internet Exchange robot
Which makes sense because that corresponds with the AOL'er who was visiting me.
Thanks everyone for your help and excellent posts.