Unknown spider with unsavoury name

f*****bot larbin2.5.0@unspecified.mail

         

shelleycat

5:28 am on Aug 13, 2002 (gmt 0)


I've found a spider in my log files during the last week or so which has a rather nasty name (I edited it above because I'm not sure about posting it here). I've read a bit about the larbin part, which appears to be a downloadable programme, but I have no idea how to find out anything else. From my log files (again with a slight edit):

157.159.10.14 - - [06/Aug/2002:19:54:27 +1200] "GET /robots.txt HTTP/1.0" 404 645 "-" "f****bot (larbin2.5.0@unspecified.mail)"
157.159.10.14 - - [06/Aug/2002:19:56:25 +1200] "GET /blog/ HTTP/1.0" 200 24527 "-" "f****bot larbin2.5.0@unspecified.mail"
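For anyone wanting to pull the IP address and user-agent out of entries like these (for a whois lookup, say), here is a rough Python sketch. It assumes the server writes the common "combined" log format, which is what the two lines above look like; the names `LOG_LINE` and `parse_line` are made up for this example.

```python
import re

# Assumed layout: Apache-style "combined" log format, as in the lines above.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


# One of the (edited) entries from the post.
sample = (
    '157.159.10.14 - - [06/Aug/2002:19:54:27 +1200] '
    '"GET /robots.txt HTTP/1.0" 404 645 "-" '
    '"f****bot (larbin2.5.0@unspecified.mail)"'
)
fields = parse_line(sample)
print(fields["ip"], fields["status"], fields["agent"])
```

The `ip` field is what goes into a whois lookup, and the `agent` field is the string the spider announces itself with.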

I'm also not sure how to block it. It did look at robots.txt so I'm assuming I can put something in there? I can follow the basics of how to disallow things but don't know what to call this specific spider so it gets stopped and nothing else does.
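A rough sketch of what such a rule could look like, with a quick check using Python's standard `urllib.robotparser`. The name `examplebot` is only a stand-in for the spider's real (edited-out) name, and `is_allowed` is a helper invented for this example. Python's parser matches the robots.txt name as a case-insensitive substring of the user-agent; the spider itself may match differently, and robots.txt only helps if the spider chooses to obey it.

```python
from urllib.robotparser import RobotFileParser

# "examplebot" is a placeholder, not the spider's actual name.
ROBOTS_TXT = """\
User-agent: examplebot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, path="/"):
    """Check whether user_agent may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The named spider is shut out; everything else is still allowed.
print(is_allowed(ROBOTS_TXT, "examplebot (larbin2.5.0@unspecified.mail)", "/blog/"))
print(is_allowed(ROBOTS_TXT, "Googlebot/2.1", "/blog/"))
```

The empty `Disallow:` under `User-agent: *` means "nothing is off limits", so only the named spider is turned away.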

I also know I can use .htaccess, but when I tried uploading one following my hosting company's instructions it gave me an internal server error. I haven't had time to try again since, so I don't know what the problem was (I know about saving as unix etc).

The only reason I really have for disallowing it is that I don't see how anything professional can be using a name such as this, and I just don't really want them crawling around inside my domain.

So I'm looking for advice as to how I find out a bit more about this unsavoury annoyance and advice on how to tell them to go away.

Sinner_G

6:30 am on Aug 13, 2002 (gmt 0)


LOL, from looking at the IP, this seems to come from the French 'National Institute for Telecommunications', a government agency :).

shelleycat

11:02 am on Aug 13, 2002 (gmt 0)


I did a whois lookup on the IP and I got that too. But I wasn't sure if I was looking up the right thing, since I've never done this kind of whois lookup before. So, some bored government employee possibly? :)

I've looked at the robots.txt file for WebmasterWorld before but only noticed the robots2, robots3 and robots4 versions today. So I've made a few alterations to robots4.txt (which allows a list of specific spiders and disallows everything else) and uploaded that. Hopefully it will stop annoying bugs with nasty names creeping through my files.

Shelley
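For reference, the "allow a list of specific spiders and disallow everything else" approach can be sketched in Python roughly as follows. The contents of the actual robots4.txt are not shown in this thread, so the agent names below are only illustrative, and `build_allowlist_robots` is a name invented for the example.

```python
from urllib.robotparser import RobotFileParser


def build_allowlist_robots(allowed_agents):
    """Build robots.txt text that admits only the listed spiders."""
    # An empty Disallow means "nothing is off limits" for that spider.
    blocks = [f"User-agent: {agent}\nDisallow:\n" for agent in allowed_agents]
    # Every spider not named above falls through to this catch-all block.
    blocks.append("User-agent: *\nDisallow: /\n")
    return "\n".join(blocks)


# Illustrative list only; not taken from the real robots4.txt.
robots_txt = build_allowlist_robots(["googlebot", "slurp"])

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

listed_ok = parser.can_fetch("Googlebot/2.1", "/blog/")
# "examplebot" stands in for the edited-out spider name.
unlisted_ok = parser.can_fetch("examplebot (larbin2.5.0@unspecified.mail)", "/blog/")
print(robots_txt)
print(listed_ok, unlisted_ok)
```

The trade-off with this design is that any new, legitimate spider is also turned away until it is added to the list, and a spider that ignores robots.txt is not affected at all.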