NokodoBot

doesn't obey robots.txt and prides itself on the fact

         

volatilegx

8:02 pm on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



IP: 67.18.222.18
UA "NokodoBot/1.2 (+http://nokodo.com/bot.htm)"


Why isn't NokodoBot obeying our robots.txt file?

Nokodo's mission is to build a searchable index not relying on meta tags or meta descriptions only. Many webmasters (ab)use the robots.txt file to block and redirect (.htaccess) direct search engines to so called "optimized pages" in order to get "better results". We believe scanning actual pages delivers better results.

Just my opinion, but you can't defeat cloaking or optimizing techniques by ignoring robots.txt. Laughable!

Staffa

9:41 pm on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Great find, volatilegx.
Ever heard of such nonsense?
I haven't seen them yet, but they're blocked already, saving them the trouble of ignoring my robots.txt ;o)

wilderness

11:43 pm on Nov 5, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have nearly every range of this provider denied as a result of numerous and/or excessive unidentified crawling.
In addition, the service provider offers co-location, which some RIPE users are purchasing.

jdMorgan

12:11 am on Nov 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From the international bad-bots forum:
Why is NokodoBot considered a malicious user-agent?

Many Webmasters' mission is to build a user-friendly, well-organized site, with well-defined and non-confusing entry points, rather than having all pages of the site appear in search engine indexes. Many malicious robots (ab)use their privileges to spider sites on the Web while disregarding robots.txt, and these must be banned by using server-side code to detect and control them by user-agent, by IP address, by proxy characteristics, and by behaviour.

Some robot authors are not aware of the concept of good citizenship implied by the Standard for Robots Exclusion, and must learn by being widely blocked as a result of gaining a bad reputation on forums like this one.


Jim
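For anyone who wants to see the user-agent and IP checks described above in concrete form, here is a minimal sketch in Python. The function name `should_block` and the two lists are hypothetical; the only real values are the user-agent string and the IP address reported in the first post. The proxy and behaviour checks are not covered.

```python
# Values reported in the first post of this thread; the names below are
# illustrative assumptions, not part of any server API.
BLOCKED_UA_SUBSTRINGS = ["nokodobot"]
BLOCKED_IPS = {"67.18.222.18"}


def should_block(user_agent: str, remote_addr: str) -> bool:
    """Return True if the request matches a blocked user-agent or IP address."""
    ua = user_agent.lower()
    # Case-insensitive substring match on the user-agent header
    if any(token in ua for token in BLOCKED_UA_SUBSTRINGS):
        return True
    # Exact match on the remote address
    return remote_addr in BLOCKED_IPS


print(should_block("NokodoBot/1.2 (+http://nokodo.com/bot.htm)", "203.0.113.5"))
print(should_block("Mozilla/4.0 (compatible)", "67.18.222.18"))
print(should_block("Mozilla/4.0 (compatible)", "203.0.113.5"))
```

Checking both fields matters because a robot that ignores robots.txt can change its user-agent just as easily, so the IP test is the fallback.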

Sanenet

12:12 am on Nov 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmmm... Am I the only person to spot the logical flaws in their arguments?

Almost makes you want to set up a few cloaked pages for them, just so you can write them an email pointing out the gaps in their arguments :)

balam

1:04 am on Nov 6, 2004 (gmt 0)

10+ Year Member



67.18.222.20 = crawler.nokodo.com

wilderness

2:13 am on Nov 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



from ARIN
Search results for: 67.18.222.20
OrgName: ThePlanet.com Internet Services, Inc.
NetRange: 67.18.0.0 - 67.19.255.255
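If you would rather deny the range in CIDR form than as a start/end pair, the NetRange above can be collapsed with Python's standard `ipaddress` module. This is only a sketch; the variable names are illustrative, and the two crawler addresses are the ones reported earlier in this thread.

```python
import ipaddress

# NetRange reported by ARIN above
first = ipaddress.ip_address("67.18.0.0")
last = ipaddress.ip_address("67.19.255.255")

# Collapse the start/end pair into the smallest list of CIDR blocks
cidr_blocks = list(ipaddress.summarize_address_range(first, last))
print(cidr_blocks)

# Confirm that the crawler addresses seen in this thread fall inside the range
crawler_ips = ["67.18.222.18", "67.18.222.20"]
covered = {
    ip: any(ipaddress.ip_address(ip) in block for block in cidr_blocks)
    for ip in crawler_ips
}
print(covered)
```

Keep in mind that a block this wide denies the whole provider, not just the one crawler.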

bcolflesh

2:52 am on Nov 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



OrgName: ThePlanet.com Internet Services, Inc.

I've yet to see anything good come from that "service".

JAB Creations

4:38 pm on Nov 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hey wait a sec... those are the servers my site is being hosted on! I've noticed a lot of stuff coming from them but never get a reply. Tell me more about these guys because I'm definitely interested!

wilderness

6:02 pm on Nov 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



12/01/03
RewriteCond %{REMOTE_ADDR} ^69\.56\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])\. [OR]

69.56.150.214 - - [30/Nov/2003:22:44:21 -0800] "GET / HTTP/1.1" 200 9398 "-" "qmhswdqw0vfhpgxmbywyyipnh"

2/12/04
SetEnvIf User-Agent ^VSE keep_out
deny from 69.93.

69.93.52.218 - - [12/Feb/2004:13:17:45 -0800] "GET /robots.txt HTTP/1.1" 200 2365 "-" "VSE/1.0 (vsecrawler@hotmail.com)"

8/01/04
RewriteCond %{REMOTE_ADDR} ^67\.18\.2(4[0-9]|5[0-5])\. [OR]

67.18.251.186 - - [01/Aug/2004:06:19:24 -0700] "GET /robots.txt HTTP/1.1" 206 2448 "-" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"

8/10/04
RewriteCond %{REMOTE_ADDR} ^12\.156\.[0-7]\. [OR]
RewriteCond %{REMOTE_ADDR} ^12\.160\.16[0-7]\. [OR]
RewriteCond %{REMOTE_ADDR} ^216\.185\.(9[6-9]|1[01][0-9]|12[0-7])\. [OR]
RewriteCond %{REMOTE_ADDR} ^69\.41\.2(2[4-9]|[34][0-9]|5[0-5])\. [OR]
RewriteCond %{REMOTE_ADDR} ^64\.5\.(3[2-9]|[45][0-9]|6[0-3])\. [OR]
RewriteCond %{REMOTE_ADDR} ^69\.56\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])\. [OR]
RewriteCond %{REMOTE_ADDR} ^69\.93\. [OR]
RewriteCond %{REMOTE_ADDR} ^70\.(8[45])\. [OR]
RewriteCond %{REMOTE_ADDR} ^216\.234\.2(2[4-9]|[34][0-9]|5[0-5])\. [OR]
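Octet-range patterns like these are easy to get subtly wrong, so it can be worth checking one offline before deploying it. The sketch below uses Python's `re` module to list which third octets the 69.56 pattern accepts; the pattern is written with a plain `|` for alternation, and the helper name `matched_third_octets` is hypothetical. Python's regex syntax is not identical to Apache's, but for a simple pattern like this one the two should agree.

```python
import re

# Pattern taken from the 69.56 RewriteCond above, with "|" as the alternation character
pattern = re.compile(r"^69\.56\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])\.")


def matched_third_octets(regex, prefix):
    """Return every third-octet value 0-255 that the pattern accepts."""
    return [n for n in range(256) if regex.search(f"{prefix}{n}.1")]


octets = matched_third_octets(pattern, "69.56.")
print(min(octets), max(octets), len(octets))

# The address from the 30/Nov/2003 log entry should be caught by the pattern
print(bool(pattern.search("69.56.150.214")))
```

The same helper can be pointed at any of the other conditions by swapping in its pattern and address prefix.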

8/11/04
Mozilla/3.01 (compatible; NPT 0.0 beta) 64.251.27.129 16/07 06:43

Infolink Information Services Inc. INFOLINK-BLK-100 (NET-64-251-0-0-1) 64.251.0.0 - 64.251.31.255
ServerPronto INMM-64-251-27-0 (NET-64-251-27-0-1) 64.251.27.0 - 64.251.27.255
another pest same source
69.56.130.138.