a few new ips and UAs

         

Crazy_Fool

9:15 pm on Jan 10, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



if anyone's interested, i've found a few new IPs and UAs in one log file this month:

137.208.51.20 Mozilla 7.1
- this one is spidering the entire site - takes several files at a time. not looked at robots.txt. check title bar on 137.208.51.20.

66.149.134.112 AnswerChase PROve 3.0
- browser or spider? only looked at root.

217.157.141.149 IE 5.5 Compatible Browser
- browser or spider? IP gives log in boxes. UA looks dodgy. has looked at root twice in a month without looking at anything else.

65.10.243.48 Java1.2.2
- IP shows Server Error. has only looked at root.

24.114.226.88 TulipChain/4.0 (http://ostermiller.org/tulipchain/) Java/1.3.1 (http://java.sun.com/) Windows_2000/5.1
- appears to be a browser. looks legitimate, but is it?

littleman

9:25 pm on Jan 10, 2002 (gmt 0)



Thanks Crazy_Fool.

Crazy_Fool

1:25 am on Jan 11, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



here's a few more i've had since about mid november. apologies if you've seen them before - i haven't had time to search here for more information on them.

134.126.252.244 Xenu Link Sleuth 1.2a
- never seen this one before

130.225.20.8 Marvin v0.5
- no idea what this is

212.187.171.147 Crawler
- looked at robots.txt

63.173.190.16 Mozilla/4.7 (compatible; WhizBang)
- checked for robots.txt. also had OPTIONS showing in place of GET or HEAD. took every page of the site.

63.212.171.162 Sqworm/2.9.85-BETA (beta_release; 20011115-775; i686-pc-linux-gnu)

64.210.196.195 -
- checked robots.txt. no user agent.

195.141.85.142 search.ch V1.4.2 (spiderman@search.ch; [search.ch)...]
- checked robots.txt

volatilegx

4:58 pm on Jan 11, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Crazy_Fool, I enjoyed checking on these. The only thing I really came up with was on that last one, search.ch, which is a Swiss search engine I didn't know about before.

mbauser2

7:04 pm on Jan 25, 2002 (gmt 0)

10+ Year Member



"Xenu Link Sleuth" is a freeware link validator. Here's the homepage URL: [home.snafu.de...]

Marcia

12:25 pm on Jan 26, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



80.60.35.143 - - [26/Jan/2002:04:11:56 -0500] "GET /robots.txt HTTP/1.0" 200 1380 "-" "appie 1.1 (www.walhello.com)"
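For anyone who wants to pull the IP, request, and user agent out of a line like the one above, here is a minimal Python sketch. It assumes the log is in the usual combined format (host, ident, user, time, request, status, bytes, referer, user agent); the names `LOG_PATTERN` and `parse_log_line` are made up for this example and are not from any tool mentioned in this thread.

```python
import re

# Assumed layout: host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Return a dict of named fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# The line posted above, split in two only to keep the source readable.
line = ('80.60.35.143 - - [26/Jan/2002:04:11:56 -0500] '
        '"GET /robots.txt HTTP/1.0" 200 1380 "-" "appie 1.1 (www.walhello.com)"')
entry = parse_log_line(line)
print(entry["ip"], entry["path"], entry["agent"])
```

Lines in a different log format will simply return None, so it is worth checking your server's log configuration before relying on this pattern.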

bird

2:28 pm on Jan 26, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



24.114.226.88 TulipChain/4.0
(http://ostermiller.org/tulipchain/) Java/1.3.1
(http://java.sun.com/) Windows_2000/5.1

That's a tool for ODP editors (privately operated), which helps them to maintain their categories, such as checking for duplicates, finding broken links, etc. The operation is semi-automatic, i.e. there's always a human editor triggering the process. If you want that editor to take another close and critical look at your site, go ahead and block TulipChain from your site... ;)

130.225.20.8 Marvin v0.5
- no idea what this is

That's Northernlight's crawler. Be nice to him, he suffers from depression.

Key_Master

4:52 pm on Jan 26, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



User agent "Gulliver" is Northernlight's crawler. Marvin v0.5 came from this Danish site [uni-c.dk].

bird

7:11 pm on Jan 26, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Awwww, I need a memory upgrade.

Crazy_Fool

1:28 am on Jan 27, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



and here's another ...
211.218.151.108 nabot_1.0

there are a few more to come as and when i get time to dig through log files. thanks to all for digging up the info about the ones i've found so far. is anyone maintaining an up to date spider list with the kind of information about them that you guys are posting here?

scareduck

9:23 pm on Jan 29, 2002 (gmt 0)

10+ Year Member



My humble opinion is that

Java[0-9.]+

is a desktop Java-based bot. It may or may not be well-behaved (i.e., adhere to the Robots Exclusion Protocol). This is actually pretty old (we've seen these for months now).
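A rough Python sketch of how that pattern could be used to flag such agents in a list of user-agent strings. Requiring the whole string to match (via `fullmatch`) is my assumption about the intent, and `is_bare_java_agent` is a made-up helper name.

```python
import re

# The pattern from the post above.
JAVA_AGENT = re.compile(r"Java[0-9.]+")

def is_bare_java_agent(user_agent):
    """True if the whole user agent is just 'Java' followed by a version number."""
    return JAVA_AGENT.fullmatch(user_agent.strip()) is not None

# User agents quoted earlier in this thread.
agents = [
    "Java1.2.2",
    "TulipChain/4.0 (http://ostermiller.org/tulipchain/) Java/1.3.1 "
    "(http://java.sun.com/) Windows_2000/5.1",
    "Mozilla 7.1",
]
for agent in agents:
    print(is_bare_java_agent(agent), agent)
```

Only the bare "Java1.2.2" agent is flagged; TulipChain identifies itself properly and merely mentions "Java/1.3.1", which the pattern does not match because of the slash.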

Crazy_Fool

2:08 pm on Jan 30, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



they keep on coming ....
212.123.67.70 Ideare+-+SignSite/1.2 - -

TallTroll

3:20 pm on Jan 30, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> 217.157.141.149 IE 5.5 Compatible Browser

I've got a trace for this one, the machine name I get back for it is port144.dsl-hhl.adsl.cybercity.dk

www.cybercity.dk looks as if it's an ISP marketing site, but I don't read Danish :)
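If you'd rather script this kind of lookup than run a trace by hand, here is a small Python sketch using the standard library's reverse DNS call. The function name `reverse_lookup` and the optional `resolver` argument are my own additions for illustration (the argument only exists so the function can be tried without network access).

```python
import socket

def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the host name that reverse DNS gives for ip, or None if there is none."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        # No PTR record, or the lookup failed.
        return None
```

Calling `reverse_lookup("217.157.141.149")` performs a live DNS query, so whatever it returns today may well differ from the machine name quoted above.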

snark

1:15 am on Jan 31, 2002 (gmt 0)

10+ Year Member



By the way, Northern Light is no longer going to be a public search engine -- you have to pay to read their search results. And they just got bought out by divine.

snark

Crazy_Fool

1:20 am on Jan 31, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hi snark, welcome to WebmasterWorld
you'll find people here are quick off the mark - the NL / Divine deal was mentioned a few days back:
[webmasterworld.com...]

snark

1:24 am on Jan 31, 2002 (gmt 0)

10+ Year Member



Hi Crazy_Fool!

Yeah -- about 3 seconds after I posted this, I saw the other postings in the other forum. (Sigh!) :)

Thanks for the welcome!

Cheers,
snark

wilderness

3:58 am on Jan 31, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Had a new one today

"Mozilla/4.0 (compatible; MSIE 5.0; Windows 95) TrueRobot; 1.5"
195.101.94.208

195.101.94.0 - 195.101.94.255
netname: FR-ECHO
descr: Socite ECHO
country: FR
admin-c: CR308-RIPE
tech-c: CR308-RIPE
status: ASSIGNED PA
notify: addr-reg@rain.fr
mnt-by: RAIN-TRANSPAC
changed: noc@rain.fr 19970514
source: RIPE

Additionally I also have some RoadRunner IPs denied.
Today a RR user was denied and within five seconds came back under 216.144.64.?
It must have been an automatic retry by the server, as the user tried again a few minutes later and the retry to the 216 wouldn't work.

mark

3:03 am on Apr 3, 2002 (gmt 0)



63.212.171.163 - - [02/Apr/2002:05:52:16 -0500] "GET / HTTP/1.0" 200 161 "-" "Sqworm/2.9.85-BETA (beta_release; 20011115-775; i686-pc-linux-gnu)"

found this in my logs. what exactly is this?
a spammer attempting to gain email addresses or what?
i note that there is a similar post here regarding this, and it's 1 ip off.
if anyone can give me an insight to this i'd appreciate it.
also, on another note.
our website has not yet gone live, yet it is inundated with hits from rr.com addresses. in the month of april it has had 900 hits (2 days!)
why and what do they want? spammers in action, or rr users scanning for boxen to exploit...?

wilderness

5:11 am on Apr 3, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Hey Mark,
Welcome.
You're asking more than one question here.

This link explains the Sqworm
[webmasterworld.com...]
That IP, 63.212.171.171, is in the same range as the one you specify (63.212.171.163); it is not unusual for the last block to vary.

That IP is part of Level 3;

Level 3 Communications, Inc. (NETBLK-LEVEL4-CIDR)
1450 Infinite Drive
Louisville, CO 80027
US
Netname: LEVEL4-CIDR
Netblock: 63.208.0.0 - 63.215.255.255
Maintainer: LVLT

RR.com= Road Runner.
Road Runner does have their own search engine.
However you might keep in mind that all of the following are associated :-(
About.com, Global Crossing, RR and Thunderstone.

If a particular IP from RR is hitting your pages and you don't want it to, just "deny access" in your htaccess file.
Use the first three blocks of that IP and end the third block with ".", which will deny the entire range of the fourth block (0 - 255).
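To see what a three-block prefix actually covers, here is a short Python sketch. It is not htaccess syntax, just an illustration of the matching idea; the helper names `is_denied` and `prefix_to_network` are hypothetical, and the Sqworm address from this thread is used only as sample data.

```python
import ipaddress

def is_denied(ip, denied_prefixes):
    """True if ip starts with any dotted prefix such as '63.212.171.'."""
    return any(ip.startswith(prefix) for prefix in denied_prefixes)

def prefix_to_network(prefix):
    """Convert a three-block prefix ending in '.' into the /24 network it covers."""
    return ipaddress.ip_network(prefix + "0/24")

denied = ["63.212.171."]
network = prefix_to_network(denied[0])

print(is_denied("63.212.171.163", denied))   # same first three blocks
print(is_denied("63.212.172.1", denied))     # different third block
print(network, network.num_addresses)
```

The trailing "." matters in a plain string comparison like this one: without it, a prefix such as "63.212.17" would also catch 63.212.170.x through 63.212.179.x.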

cdiggins

8:08 pm on Apr 10, 2002 (gmt 0)



Found out finally what the sqworm robot is after some concern.

According to:
[clearwaterbeachcam.com...]

SQworm is an:
AOL Search / Pacific Internet Exchange robot

Which makes sense because that corresponds with the AOL'er who was visiting me.

Thanks everyone for your help and excellent posts.

jdMorgan

11:35 pm on May 4, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I realize that this thread is played out, but I found it while searching for "nabot 1.0", hoping to get an ID on it.
I chased the IP address I had logged, plus the one listed above to the "Central Data Communication Office" of Korea Telecom, so it looks like a legitimate robot run by KT. It does look at robots.txt, but so far has only taken index.html. I'll update this post if it does NOT honor robots.txt.
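For checking whether a robot's requests stayed within robots.txt, something along these lines might help. It is only a sketch: the robots.txt content and the list of requested paths below are invented for the example (in practice the paths would come from your own logs), and `disallowed_fetches` is a made-up function name.

```python
from urllib.robotparser import RobotFileParser

def disallowed_fetches(robots_txt, user_agent, requested_paths):
    """Return the requested paths that robots.txt does not allow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in requested_paths
            if not parser.can_fetch(user_agent, path)]

# Hypothetical robots.txt and request list, for illustration only.
robots_txt = """User-agent: *
Disallow: /private/
"""
requested = ["/robots.txt", "/index.html", "/private/stats.html"]

violations = disallowed_fetches(robots_txt, "nabot_1.0", requested)
print(violations)
```

An empty list means every logged request was permitted by the rules as Python's parser reads them; it says nothing about whether the robot actually fetched robots.txt first.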