TMCrawler

reads robots.txt, and that's it...

         

bobothecat

7:22 pm on Sep 29, 2006 (gmt 0)



So either it understands 403s, or it'll come back later. The IP hails from Taiwan:

59.125.116.4 - - [29/Sep/2006:13:17:41 -0600] "GET /robots.txt HTTP/1.0" 403 292 "-" "TMCrawler"

GaryK

4:42 am on Sep 30, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've set up a special watch for user agents that only hit robots.txt. I thought I was spotting a trend where a UA fetches only robots.txt, then returns days, weeks or months later without bothering to read robots.txt again, and usually ignores its rules. Turns out there does appear to be a trend. When I've got enough data to present a convincing case I'll publish it, but I recommend you keep an eye on these types of visits. :)
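For anyone who wants to try a similar watch, here is a rough sketch of one way it could be done. It is not GaryK's actual setup, which isn't described in the thread. It assumes an Apache-style combined access log like the line bobothecat posted, and the names `LOG_RE`, `robots_only_agents` and `SAMPLE` are made up for illustration. The ExampleBot lines in the sample are hypothetical; only the TMCrawler line comes from this thread.

```python
import re
from collections import defaultdict

# Assumed layout (combined log format):
# host ident user [time] "METHOD path protocol" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def robots_only_agents(lines):
    """Map each user agent whose every request was for /robots.txt to its IPs."""
    paths = defaultdict(set)  # user agent -> distinct paths requested
    ips = defaultdict(set)    # user agent -> distinct client IPs
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that don't fit the assumed format
        paths[m["ua"]].add(m["path"])
        ips[m["ua"]].add(m["ip"])
    return {ua: ips[ua] for ua, seen in paths.items() if seen == {"/robots.txt"}}


SAMPLE = [
    # Real line from the first post in this thread.
    '59.125.116.4 - - [29/Sep/2006:13:17:41 -0600] '
    '"GET /robots.txt HTTP/1.0" 403 292 "-" "TMCrawler"',
    # Hypothetical lines: a bot that reads robots.txt and then fetches a page.
    '192.0.2.10 - - [29/Sep/2006:13:20:02 -0600] '
    '"GET /robots.txt HTTP/1.1" 200 68 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [29/Sep/2006:13:20:05 -0600] '
    '"GET /index.html HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
]

flagged = robots_only_agents(SAMPLE)
```

Run against a full log, you would pass the open file instead of `SAMPLE`. Keeping the flagged agents and IPs between runs lets you spot the same UA coming back later and crawling without re-reading robots.txt, which is the pattern described above.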