Why isn't NokodoBot obeying our robots.txt file?
Nokodo's mission is to build a searchable index not relying on meta tags or meta descriptions only. Many webmasters (ab)use the robots.txt file to block and redirect (.htaccess) direct search engines to so called "optimized pages" in order to get "better results". We believe scanning actual pages delivers better results.
Just my opinion, but you can't defeat cloaking or optimizing techniques by ignoring robots.txt. Laughable!
Why is NokodoBot considered a malicious user-agent?
Many Webmasters' mission is to build a user-friendly, well-organized site, with well-defined and non-confusing entry points, rather than having all pages of the site appear in search engine indexes. Many malicious robots (ab)use their privileges to spider sites on the Web while disregarding robots.txt, and these must be banned by using server-side code to detect and control them by user-agent, by IP address, by proxy characteristics, and by behaviour.
Some robot authors are not aware of the concept of good citizenship implied by the Standard for Robots Exclusion, and must learn by being widely blocked as a result of gaining a bad reputation on forums like this one.
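For anyone wanting a starting point, here is a minimal sketch of the user-agent and IP-address checks described above. It assumes Apache with mod_setenvif and the older Order/Allow/Deny access directives (on Apache 2.4 these need mod_access_compat), and that .htaccess overrides are enabled. The variable name keep_out, the "NokodoBot" match string, and the 192.0.2. address block are placeholders for illustration, not values confirmed in this thread.

```apache
# Flag any request whose User-Agent contains "NokodoBot" (case-insensitive).
# The match string is an assumption; check your own logs for the real one.
SetEnvIfNoCase User-Agent "NokodoBot" keep_out

# Allow everyone, then deny flagged requests and one placeholder address block.
Order Allow,Deny
Allow from all
Deny from env=keep_out
# 192.0.2. is a documentation-only range standing in for a real offender.
Deny from 192.0.2.
```

With Order Allow,Deny a request that matches both an Allow and a Deny line is refused, so the flagged user-agent is blocked even though "Allow from all" also matches it.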
69.56.150.214 - - [30/Nov/2003:22:44:21 -0800] "GET / HTTP/1.1" 200 9398 "-" "qmhswdqw0vfhpgxmbywyyipnh"
2/12/04
SetEnvIf User-Agent ^VSE keep_out
deny from 69.93.
69.93.52.218 - - [12/Feb/2004:13:17:45 -0800] "GET /robots.txt HTTP/1.1" 200 2365 "-" "VSE/1.0 (vsecrawler@hotmail.com)"
8/01/04
RewriteCond %{REMOTE_ADDR} ^67\.18\.2(4[0-9]|5[0-5])\. [OR]
67.18.251.186 - - [01/Aug/2004:06:19:24 -0700] "GET /robots.txt HTTP/1.1" 206 2448 "-" "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"
8/10/04
RewriteCond %{REMOTE_ADDR} ^12\.156\.[0-7]\. [OR]
RewriteCond %{REMOTE_ADDR} ^12\.160\.16[0-7]\. [OR]
RewriteCond %{REMOTE_ADDR} ^216\.185\.(9[6-9]|1[01][0-9]|12[0-7])\. [OR]
RewriteCond %{REMOTE_ADDR} ^69\.41\.2(2[4-9]|[34][0-9]|5[0-5])\. [OR]
RewriteCond %{REMOTE_ADDR} ^64\.5\.(3[2-9]|[45][0-9]|6[0-3])\. [OR]
RewriteCond %{REMOTE_ADDR} ^69\.56\.(12[8-9]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])\. [OR]
RewriteCond %{REMOTE_ADDR} ^69\.93\. [OR]
RewriteCond %{REMOTE_ADDR} ^70\.(8[45])\. [OR]
RewriteCond %{REMOTE_ADDR} ^216\.234\.2(2[4-9]|[34][0-9]|5[0-5])\. [OR]
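The conditions above are only an excerpt: every line ends in [OR], so the closing condition and the RewriteRule are not shown. As a rough sketch of how such a chain is usually finished, assuming mod_rewrite in a site-root .htaccess, something like the following would work. The two address patterns are documentation-only placeholders, and leaving robots.txt readable is a design choice assumed here, not something stated in this thread.

```apache
RewriteEngine On
# Every condition except the last carries [OR]; the last one omits it.
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\. [OR]
RewriteCond %{REMOTE_ADDR} ^198\.51\.100\.
# Return 403 Forbidden for everything except robots.txt,
# so a blocked robot can still fetch its Disallow instructions.
RewriteRule !^robots\.txt$ - [F]
```

In a per-directory (.htaccess) context the RewriteRule pattern is matched against the path without its leading slash, which is why the pattern reads ^robots\.txt$ rather than ^/robots\.txt$.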
8/11/04
Mozilla/3.01 (compatible; NPT 0.0 beta) 64.251.27.129 16/07 06:43
Infolink Information Services Inc. INFOLINK-BLK-100 (NET-64-251-0-0-1)
64.251.0.0 - 64.251.31.255
ServerPronto INMM-64-251-27-0 (NET-64-251-27-0-1)
64.251.27.0 - 64.251.27.255
Another pest from the same source:
69.56.130.138