ICC-Crawler

lucy24

5:51 pm on Aug 20, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Most recent mention I can find: June 2016 [webmasterworld.com]
IP: 202.180.34.abc
UA: ICC-Crawler/2.0 (Mozilla-compatible; ; http://ucri.nict.go.jp/en/icccrawler.html)
Headers: humanoid
robots.txt: YES
Request pattern--selected pages within a single directory--suggests a shopping list provided by an outside source. Several very short visits over a period of 1-2 days, each beginning with robots.txt, all from the identical IP.

Some details of the UA have changed over the years, but the distinctive semicolon pattern--which would get them blocked on some sites (looking at you, wilderness)--remains the same. And, unlike behavior reported in 2016, they DO ask for robots.txt. Since all their requests were for files in one directory, I can’t absolutely swear that they are compliant; I tend to doubt that the shopping list included other material from roboted-out directories which they decided against. But I generally haven’t had much trouble with Japanese robots.
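
The semicolon pattern described above can be checked mechanically. Here is a minimal sketch in Python, assuming the "distinctive pattern" means the empty field between two semicolons inside the UA's parenthesized comment. The helper name `has_empty_ua_field` is hypothetical, and the thread does not say how any site actually implements this rule.

```python
import re

# Assumption: the "semicolon pattern" is an empty field, i.e. two semicolons
# separated by nothing but optional whitespace, as in "(Mozilla-compatible; ; http://...)".
EMPTY_FIELD = re.compile(r";\s*;")


def has_empty_ua_field(user_agent: str) -> bool:
    """Return True if the User-Agent string contains an empty ';'-delimited field."""
    return EMPTY_FIELD.search(user_agent) is not None


# UA string quoted in the post above.
icc_ua = "ICC-Crawler/2.0 (Mozilla-compatible; ; http://ucri.nict.go.jp/en/icccrawler.html)"
# An ordinary browser-style comment, for contrast.
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

print(has_empty_ua_field(icc_ua))
print(has_empty_ua_field(browser_ua))
```

A rule this broad matches any UA with an empty field, not only ICC-Crawler, which is presumably why it works as a generic block on some sites.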

wilderness

3:30 pm on Aug 21, 2020 (gmt 0)


lucy,
It's of little importance, but I have only allowed 64 through 95 of the 202.180. range since 2003.
It has even less significance because the 64-95 range was expanded to 127, and then the IP was purchased by a larger NZ provider.
In addition, any-Crawler-UA gets a 403.
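
The two rules in this post can be sketched as a single access check. This is a rough illustration in Python, not the poster's actual configuration. It assumes "64 through 95" refers to the third octet (so the allowed block is 202.180.64.0/19), that the rest of 202.180.0.0/16 is refused, and that the UA test is a case-insensitive match on "crawler". The function name `status_for` is hypothetical.

```python
import ipaddress

# Assumption: "64 through 95 of the 202.180. range" means third octets 64..95,
# which is exactly the /19 starting at 202.180.64.0.
ALLOWED = ipaddress.ip_network("202.180.64.0/19")
PARENT = ipaddress.ip_network("202.180.0.0/16")


def status_for(ip: str, user_agent: str) -> int:
    """Return the HTTP status this sketch would give a request."""
    # Rule 1: any UA containing "Crawler" is refused (case-insensitive here by assumption).
    if "crawler" in user_agent.lower():
        return 403
    # Rule 2: within 202.180.0.0/16, only the 64-95 block is let through.
    addr = ipaddress.ip_address(ip)
    if addr in PARENT and addr not in ALLOWED:
        return 403
    return 200


print(status_for("202.180.34.1", "Mozilla/5.0"))      # third octet 34: outside the allowed block
print(status_for("202.180.70.1", "Mozilla/5.0"))      # third octet 70: inside the allowed block
print(status_for("202.180.70.1", "ICC-Crawler/2.0"))  # refused on the UA alone
```

Under these assumptions the reported visitor would be refused twice over: once for the "Crawler" UA and once because a third octet of 34 falls outside 64-95.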

tangor

9:55 am on Aug 22, 2020 (gmt 0)


Me too! ^^^^