133.11.36.25 - - [30/Apr/2003:11:58:40 -0700] "GET / HTTP/1.1" 200 20114 "-" "Zao/0.1 (http*//www.kototoi.org/zao/)"
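For anyone who wants to pull such hits out of a log automatically, here is a minimal Python sketch. It assumes the line above is in Apache's "combined" log format; the names `COMBINED` and `parse_line` are made up for illustration.

```python
import re

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


# The log line quoted in the post, copied verbatim.
line = ('133.11.36.25 - - [30/Apr/2003:11:58:40 -0700] "GET / HTTP/1.1" '
        '200 20114 "-" "Zao/0.1 (http*//www.kototoi.org/zao/)"')

entry = parse_line(line)
print(entry["host"], entry["status"], entry["agent"])
```

Filtering on `entry["agent"]` is then enough to list every request a given bot made.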
Mentioned in this blocked thread [webmasterworld.com] from Oct 29, 2002.
The site <Last updated on July 18, 2002> says its purpose is for studying how to 'collect documents' and how to 'extract information out of the collected documents'. I have to wonder how they define 'documents'.
I take it this bot hasn't been around for a while, or is it just that it hasn't visited me before?
My work is more along the lines of a directory, and I was wondering how they'd equate a bunch of links to their interpretation of what the bot's job is. Links don't seem to be the same as 'documents', unless I'm missing something.
Pendanticist.
Reverse DNS gave - hibari01.crawler.kototoi.org
The referring page was very good and informative. It will be interesting to see what happens with this.
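A reverse DNS check like the one mentioned above can be scripted. This is only a sketch: `reverse_lookup` and `hostname_in_domain` are hypothetical helper names, and the lookup relies on whatever PTR record the IP's owner has published, so it proves little on its own.

```python
import socket


def reverse_lookup(ip):
    """Return the PTR hostname for an IP address, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        return None


def hostname_in_domain(hostname, domain):
    """True if hostname is the domain itself or a subdomain of it."""
    hostname = hostname.lower().rstrip(".")
    domain = domain.lower().rstrip(".")
    return hostname == domain or hostname.endswith("." + domain)


# Check the hostname reported in the post against the bot's domain.
print(hostname_in_domain("hibari01.crawler.kototoi.org", "kototoi.org"))
```

Calling `reverse_lookup("133.11.36.25")` needs a live network, and the result today may differ from what was seen in 2003.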
133.11.36.34
- got hit yesterday, a very slow crawler, 10 to 20 minutes between fetching each page - no referrers, using GET and grabbing it all.
First file request was robots.txt, then directly off to a level 2 index page, to a level 3 page in same group, but not via link from former page, and then a level 3 page in another group. Main index page was not requested at all.
(The levels and groups follow Brett's site model. I've discovered that this site actually follows that model, although I didn't know the model until a few days ago.)
/claus
[webmasterworld.com...]
Specifically, the part about reducing the instances of 'Disallow: /' to one and simply listing the bots included in the disallow. Zao no longer seems to get it, so it's gone into the ban bin in .htaccess.
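To make the grouped layout concrete, here is a small Python sketch of a robots.txt record that lists several bots above a single 'Disallow: /', checked with the standard library's `urllib.robotparser`. `ExampleBot` and `SomeOtherBot` are made-up names, and the sketch only shows how one compliant parser reads such a record; it says nothing about how Zao itself parses it.

```python
from urllib.robotparser import RobotFileParser

# One record: several User-agent lines sharing a single Disallow rule.
# "ExampleBot" is a hypothetical second bot, added for illustration.
ROBOTS_TXT = """\
User-agent: Zao
User-agent: ExampleBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Both listed bots are covered by the one Disallow line.
zao_allowed = parser.can_fetch("Zao/0.1", "/")
example_allowed = parser.can_fetch("ExampleBot", "/index.html")

# A bot that is not listed has no matching record, so it is allowed.
other_allowed = parser.can_fetch("SomeOtherBot", "/")

print(zao_allowed, example_allowed, other_allowed)
```

A bot that ignores or misreads the grouped record will keep crawling regardless, which is why a server-side ban ends up being the fallback.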