nicebot

Not so nice...

         

bobothecat

12:31 pm on Jun 21, 2005 (gmt 0)



Doesn't request robots.txt

64.251.30.22 - - [21/Jun/2005:04:55:13 -0600] "GET / HTTP/1.1" 403 318 "-" "nicebot"
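
For anyone who wants to pull these entries out of a log automatically, here is a minimal Python sketch. It assumes the line above follows the common Apache "combined" log layout; the names LOG_RE, parse_line and fetched_robots are made up for illustration and do not come from any tool mentioned in this thread.

```python
import re

# Assumed layout: Apache "combined" format, inferred from the quoted line.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None

def fetched_robots(records, agent):
    """True if any parsed record from this user agent requested /robots.txt."""
    return any(
        r["agent"] == agent and r["path"] == "/robots.txt"
        for r in records
    )

line = ('64.251.30.22 - - [21/Jun/2005:04:55:13 -0600] '
        '"GET / HTTP/1.1" 403 318 "-" "nicebot"')
rec = parse_line(line)
print(rec["ip"], rec["status"], rec["agent"])
print(fetched_robots([rec], "nicebot"))
```

Feeding every line of a day's log through parse_line and then calling fetched_robots gives a quick yes/no on whether the bot ever asked for robots.txt.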

fiestagirl

10:51 pm on Jun 22, 2005 (gmt 0)

10+ Year Member



I've seen it coming from:
64.251.30.18-21 and 69.60.120.167-168
no robots.txt
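
A rough way to test whether a visitor falls inside the ranges reported above, using Python's standard ipaddress module. The ranges are copied from this post and treated as inclusive; REPORTED_RANGES and in_reported_ranges are hypothetical names, and later posts in the thread show the bot using addresses outside these ranges too.

```python
import ipaddress

# Ranges as reported in the post above, inclusive on both ends.
REPORTED_RANGES = [
    ("64.251.30.18", "64.251.30.21"),
    ("69.60.120.167", "69.60.120.168"),
]

def in_reported_ranges(ip, ranges=REPORTED_RANGES):
    """True if ip lies within any (low, high) pair in ranges."""
    addr = ipaddress.ip_address(ip)
    return any(
        ipaddress.ip_address(low) <= addr <= ipaddress.ip_address(high)
        for low, high in ranges
    )

print(in_reported_ranges("64.251.30.20"))
print(in_reported_ranges("64.251.30.22"))
```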

GaryK

7:19 pm on Jun 26, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Mine came from 69.60.120.171 on 20 June, requested robots.txt, obeyed it, but left after crawling only two pages.

Perhaps there's a user setting to control whether it reads/obeys robots.txt?

Does anyone have a URL with more information on this bot?

EDIT:

I hope it's alright to post these URLs. They are relevant to tracking down the source of this bot.

I found this reference to nicebot: [cszone.ru...] It referred to another domain name, www.cs.ab.ru, but I can't seem to find an entrance page for the site.

I've got a Russian friend of mine working on a translation for the page at the first URL. The machine translation from Babel Fish was truly horrible!

GaryK

9:58 pm on Jun 26, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sorry to double-post but it was too late to edit my previous message.

As I understand it from the translation of the Russian page, this is an online game of some sort that sends out bots with a user agent named nicebot.

Here's the proper translation from the link I posted [emphasis is mine]:

Once again [name of company] releases a new version of NiceBot 1.3. The most important changes in the new version: new commands in NiceWeapons.cfg, support of CS1.6, corrected some bugs with names of bots, the bug with "moveandshoot"-was corrected, improved navigation and "waypoints" on old cards and added new cards (look below), new console commands, new skill in NiceSkill.cfg, enemy search system is improved, new commands of management for Dedicated servers were added, the bug with use of "secondary attack" (the muffler, etc) was corrected. The bug with NiceBot.cfg and bug with lag of bots in the beginning of a round were corrected, " Follow Me " system is improved (now the bot will follow you even if you will disappear from a view)-new commands for editing of waypoints, Anti-terrorists now can protect the dropped out bomb. Terrorists can protect hostages + a number of fine glitches were corrected. Download is available at [some URL that doesn't work]

Do you all think I'm on the right track with this one?

Lord Majestic

10:10 pm on Jun 26, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's the wrong track. I am a native Russian speaker, so I can be sure about it: the page you linked to refers to a bot in Counter-Strike 1.6 (CS1.6). It's a very popular online game (a Half-Life mod, actually) featuring terrorists and counter-terrorists.

larryhatch

10:39 pm on Jun 26, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nicebot paid 4 visits in 24 hours.
Widely spaced in time, all calling the same page, a lengthy bibliography.

69.60.120.167 24JUN05 23:02:37 /SOURCE 54653 nicebot
64.251.30.23 25JUN05 01:17:24 /SOURCE 54653 nicebot
64.251.30.22 25JUN05 15:12:48 /SOURCE 54653 nicebot
69.60.120.166 25JUN05 19:07:45 /SOURCE 54653 nicebot

With low bandwidth demand, I didn't much notice.
No calls for robots.txt of course, just the above.

The 69.60.. IP #s trace back to: Infolink Information Services Inc. IIS-129:
2400 E Las Olas Blvd. Fort Lauderdale, FL
.. while the 64.251.. #s trace back to: ServerPronto
2400 E. Las Olas Blvd. Fort Lauderdale, FL
Note identical addresses.

In a separate message 2 days ago, I wrote about 'LinkSiphon'
trolling my site from a different town, but also in Florida.
LinkSiphon did a heavy crawl, half my html pages. -Larry

GaryK

4:30 am on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thank you, Lord Majestic. I am woefully ignorant when it comes to computer games. When I want a break from working (web app developer) the last thing I want to do is spend more time sitting at the computer. ;)

Larry, I noticed the same address similarity. I'm only 30 minutes away from the Las Olas address. Maybe it's worth a drive?

larryhatch

5:06 am on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi GaryK:

Before you waste gas, I think I found their website: [infolink.com...]

It looks like a high-end ISP, high-volume servers and connections etc.
More than likely, the "nicebot" user is one of their clients, and probably
a slippery operation. Note that they keep changing IP #s within infolink. -Larry

GaryK

5:46 am on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There's not much sense in visiting a data center. ;)

I've added nicebot to my browscap.ini and related files as a Website Stripper, which means I think it should be banned. If nothing else, IMO it should be banned for not including a URL for webmasters in the user agent.

I've also included nicebot in my httpd.ini file for ISAPI_Rewrite on Windows, which forcibly bans it with a permanent redirect.

I'd love to produce the *nix equivalent of my httpd.ini file, but I lack knowledge of the syntax.

kgun

12:23 pm on Jun 27, 2005 (gmt 0)



I'm on thin ice here. Are you talking about configuring the web server? Is that the best way to block spiders and advanced site crawlers? Is it possible?

Do you have a good reference, book or internet site?

KBleivik
Make it simple, as simple as possible, but no simpler.

volatilegx

1:39 pm on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



kgun,

Here is a very old post that is still valid today. It gives an example of using Apache's mod_rewrite module to ban bad bots.

A Close to perfect .htaccess ban list [webmasterworld.com]
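
To illustrate the general idea of banning by user agent, here is a minimal Python sketch written as WSGI middleware. It is not the mod_rewrite rule set from the linked post, nor the httpd.ini rules mentioned earlier; it only shows the same principle at the application level. The names BLOCKED_AGENT_SUBSTRINGS, block_bad_bots and demo_app are hypothetical, and the ban list contains only the one bot discussed in this thread.

```python
# Hypothetical ban list: lowercase substrings matched against the User-Agent.
BLOCKED_AGENT_SUBSTRINGS = ("nicebot",)

def block_bad_bots(app, blocked=BLOCKED_AGENT_SUBSTRINGS):
    """Wrap a WSGI app so requests from listed user agents get a 403."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(fragment in agent for fragment in blocked):
            body = b"Forbidden"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else passes through to the wrapped application.
        return app(environ, start_response)
    return middleware

def demo_app(environ, start_response):
    """Stand-in application used only for this example."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

application = block_bad_bots(demo_app)
```

Matching on the user agent only stops bots that identify themselves honestly; one that changes its string would need to be blocked by IP instead.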

kgun

4:27 pm on Jun 27, 2005 (gmt 0)



volatilegx

Thank you very much. I noted your site and linked to it some time ago. I think you have a great resource. Tell me if you want the link deleted.

KBleivik
Make it simple, as simple as possible, but no simpler.

markwelch

7:23 pm on Jun 30, 2005 (gmt 0)

10+ Year Member



Nicebot popped in on my site today, and of course there was no check for robots.txt -- note that these three requests came from three different IPs.

2005-06-30 09:08:01 W3SVC459944359 GET /movies/Jealousy.htm - 80 - 69.60.120.166 nicebot - - www.example.com 200 * * *

2005-06-30 10:34:37 W3SVC459944359 GET /movies/Miranda.htm - 80 - 69.60.120.173 nicebot - - www.example.com 200 * * *

2005-06-30 19:15:50 W3SVC459944359 GET /movies/Forbidden_Cargo.htm - 80 - 64.251.30.23 nicebot - - www.example.com 200 * * *

[edited by: volatilegx at 7:34 pm (utc) on July 1, 2005]
[edit reason] removed specifics [/edit]
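
A small Python sketch for counting how many distinct IPs a given user agent used in IIS-style log lines like the three quoted above. The field order in FIELDS is an assumption inferred from those lines; a real IIS log declares its order in a "#Fields:" header, which the post does not show. The names parse_w3c and ips_by_agent are made up for illustration.

```python
from collections import defaultdict

# Assumed field order, inferred from the quoted lines (not from a #Fields header).
FIELDS = [
    "date", "time", "sitename", "method", "uri", "query", "port",
    "username", "client_ip", "agent", "cookie", "referer", "host", "status",
]

def parse_w3c(line):
    """Split a space-delimited log line into a dict, or None if too short."""
    parts = line.split()
    if len(parts) < len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))  # trailing extra fields are ignored

def ips_by_agent(lines):
    """Map each user agent to the set of client IPs it was seen from."""
    seen = defaultdict(set)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        rec = parse_w3c(line)
        if rec is not None:
            seen[rec["agent"]].add(rec["client_ip"])
    return seen

lines = [
    "2005-06-30 09:08:01 W3SVC459944359 GET /movies/Jealousy.htm - 80 - 69.60.120.166 nicebot - - www.example.com 200 * * *",
    "2005-06-30 10:34:37 W3SVC459944359 GET /movies/Miranda.htm - 80 - 69.60.120.173 nicebot - - www.example.com 200 * * *",
    "2005-06-30 19:15:50 W3SVC459944359 GET /movies/Forbidden_Cargo.htm - 80 - 64.251.30.23 nicebot - - www.example.com 200 * * *",
]
result = ips_by_agent(lines)
print(sorted(result["nicebot"]))
```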