carfac - 3:37 pm on Sep 7, 2002 (gmt 0)
Looking over last night's logs, I found a Google bot (a WAP bot, it looks like) that ignored my robots.txt.

I have a "spider trap" on my site: a file that is excluded in robots.txt, so only crawlers that ignore robots.txt will run it. When it runs, it logs their IP and bans them. So, look at this:

216.239.33.5 - - [07/Sep/2002:06:18:24 -0600] "GET / HTTP/1.0" 200 12614 "-" "SIE-C3I/3.0 UP/4.1.16m (Google WAP Proxy/1.0)"
216.239.33.5 - - [07/Sep/2002:06:19:12 -0600] "GET /secret_spider-trap.cgi HTTP/1.0" 200 152 "-" "SIE-C3I/3.0 UP/4.1.16m (Google WAP Proxy/1.0)"

That is all it got, but it was enough to ban him! Should I unban this IP, contact Google, anything like that?

dave
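
For readers who want to see how a trap like this can flag a client from the access log, here is a minimal sketch in Python. It is not the poster's actual script, which is not shown in the thread. The trap path is taken from the log line above; the function names `trap_hits` and `deny_rules` are hypothetical, and emitting Apache-style `Deny from` lines is only an assumption about how the ban might be applied.

```python
import re

# Trap path as it appears in the log line quoted in the post.
TRAP_PATH = "/secret_spider-trap.cgi"

# Matches the start of a common/combined log format line:
# client IP, identity, user, [timestamp], "METHOD path PROTOCOL"
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*"')


def trap_hits(log_lines, trap_path=TRAP_PATH):
    """Return the set of client IPs that requested the trap path."""
    banned = set()
    for line in log_lines:
        m = LOG_RE.match(line)
        # Ignore any query string when comparing the requested path.
        if m and m.group(3).split("?", 1)[0] == trap_path:
            banned.add(m.group(1))
    return banned


def deny_rules(ips):
    """Format IPs as Apache-style 'Deny from' lines (assumed ban mechanism)."""
    return [f"Deny from {ip}" for ip in sorted(ips)]


if __name__ == "__main__":
    sample = [
        '216.239.33.5 - - [07/Sep/2002:06:18:24 -0600] "GET / HTTP/1.0" 200 12614 '
        '"-" "SIE-C3I/3.0 UP/4.1.16m (Google WAP Proxy/1.0)"',
        '216.239.33.5 - - [07/Sep/2002:06:19:12 -0600] "GET /secret_spider-trap.cgi HTTP/1.0" 200 152 '
        '"-" "SIE-C3I/3.0 UP/4.1.16m (Google WAP Proxy/1.0)"',
    ]
    for rule in deny_rules(trap_hits(sample)):
        print(rule)
```

Note that a trap like this cannot tell a crawler from a proxy: a request relayed on behalf of a phone user would be flagged the same way, which is worth keeping in mind before banning the IP permanently.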