Indy Library

211.101.236.29 Mozilla/3.0 (compatible; Indy Library)

     
3:53 am on Apr 6, 2001 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 1, 2001
posts:183
votes: 0


I think this may be from China. It almost always does a HEAD then a GET, and it does a very deep crawl. It does not grab robots.txt.

211.101.236.29 - - [04/Apr/2001:03:41:53 -0500] "HEAD /./doh.htm HTTP/1.0" 200 186 "-" "Mozilla/3.0 (compatible; Indy Library)"
211.101.236.29 - - [04/Apr/2001:03:41:56 -0500] "GET /./doh.htm HTTP/1.0" 200 3634 "-" "Mozilla/3.0 (compatible; Indy Library)"
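For anyone who wants to check their own logs for the same HEAD-then-GET pattern, here is a rough sketch in Python. It assumes the Apache "combined" log format shown in the two lines above; the names `LOG_RE`, `parse_line` and `head_then_get` are made up for this example, and matching on the "Indy Library" substring is only one way to pick out the bot.

```python
import re

# Regex for the Apache "combined" log format, as in the two lines quoted above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    m = LOG_RE.match(line)
    return m.groupdict() if m else None


def head_then_get(lines, agent_substring="Indy Library"):
    """Return (ip, path) pairs where a matching agent sent a HEAD and then,
    as its next request from the same IP, a GET for the same path."""
    hits = []
    last = {}  # ip -> (method, path) of that IP's previous matching request
    for line in lines:
        rec = parse_line(line)
        if rec is None or agent_substring not in rec["agent"]:
            continue
        if last.get(rec["ip"]) == ("HEAD", rec["path"]) and rec["method"] == "GET":
            hits.append((rec["ip"], rec["path"]))
        last[rec["ip"]] = (rec["method"], rec["path"])
    return hits


# The two log lines from the post above.
SAMPLE = [
    '211.101.236.29 - - [04/Apr/2001:03:41:53 -0500] "HEAD /./doh.htm HTTP/1.0" '
    '200 186 "-" "Mozilla/3.0 (compatible; Indy Library)"',
    '211.101.236.29 - - [04/Apr/2001:03:41:56 -0500] "GET /./doh.htm HTTP/1.0" '
    '200 3634 "-" "Mozilla/3.0 (compatible; Indy Library)"',
]

if __name__ == "__main__":
    print(head_then_get(SAMPLE))
```

To run it against a real log, pass an open file object instead of `SAMPLE`. Servers that log in a different format would need a different regex.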

4:56 am on Apr 6, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 17, 2000
posts:2924
votes: 0


It hit my servers for several thousand pages. It looks to me like it is crawling Excite listings.
This is them -> Capital Network, 8th/F China Resources Building, No.8 Jianguomenbei Avenue, Beijing, China [uk.gsmbox.com]

7:51 pm on Apr 6, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 17, 2000
posts:2924
votes: 0


Am I the only one who finds it interesting that the government of China is spidering web sites? Okay, so it isn't the Gov. of China outright, but it is a company partially owned by them.

9:06 pm on Apr 6, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Mar 22, 2001
posts:2450
votes: 0


They must be looking for that unforthcoming "apology".

3:56 am on Apr 7, 2001 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 1, 2001
posts:183
votes: 0


Thanks for the info, littleman. They hit our site again and grabbed just about everything. I also wonder what they might be looking for.

11:04 pm on Apr 10, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member littleman is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:June 17, 2000
posts:2924
votes: 0


Now others...

63.251.176.140 which is a Gateway IP???

The original bot from China is now using two IPs.
211.101.236.28-29

And there are these:
202.108.221.165 and 202.108.221.166
.166 is open on port 80 and is a Win32 machine running Apache.
inetnum 202.108.0.0 - 202.108.255.255
netname CHINANET-BJ
descr CHINANET Beijing province network
descr Data Communication Division
descr China Telecom
country CN
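If you want to test visitor IPs against the ranges mentioned in this post, a small sketch with Python's standard `ipaddress` module could look like the following. The `WATCHED` list and the `is_watched` helper are hypothetical names for this example; 202.108.0.0/16 is simply the inetnum range above written in CIDR form, and 211.101.236.28/31 covers the two addresses .28 and .29.

```python
import ipaddress

# Ranges taken from the post above, written as CIDR blocks.
# WATCHED is a hypothetical watch list for this example, not an official one.
WATCHED = [
    ipaddress.ip_network("202.108.0.0/16"),     # CHINANET-BJ inetnum range
    ipaddress.ip_network("211.101.236.28/31"),  # 211.101.236.28-29
]


def is_watched(ip_text):
    """Return True if the address falls inside any watched network."""
    addr = ipaddress.ip_address(ip_text)
    return any(addr in net for net in WATCHED)


if __name__ == "__main__":
    for ip in ("202.108.221.165", "211.101.236.29", "63.251.176.140"):
        print(ip, is_watched(ip))
```

Note that the /16 covers the whole CHINANET Beijing allocation, so it will match plenty of ordinary visitors as well as the bot. Listing the individual addresses would be the narrower choice.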

11:49 pm on Apr 10, 2001 (gmt 0)

Preferred Member

10+ Year Member

joined:Feb 21, 2001
posts:419
votes: 0


Ah, but that's like saying the British Government is spidering you if BT decide to start indexing pages :)
It is a bit strange though, why would they bother?

1:53 pm on Apr 11, 2001 (gmt 0)

Senior Member

WebmasterWorld Senior Member jeremy_goodrich is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Aug 4, 2000
posts:3468
votes: 0


Here's another one that hit me about a week back:

202.108.221.165

I've been wondering what's going on; they are very, very aggressive in getting at my stuff. It's not *that* interesting... oh well. Any takes on what to do about this one? I already made my decision on these, but I'm curious what others are going to do. I figured let 'em have it, they'll probably laugh anyway.

5:03 pm on Apr 11, 2001 (gmt 0)

Junior Member

10+ Year Member

joined:Dec 19, 2000
posts:193
votes: 0


They hit me too, at about 20 requests per minute (one every 2-3 seconds). They also got robots.txt, which they seemed to honor (no deep crawl).

ATM I'll let them crawl. They are, AFAIK, just another spider, and I might eventually show up in a Chinese portal/search engine.
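For comparing crawl rates like the "about 20 requests per minute" figure above, here is a minimal sketch that estimates the rate from Apache-style log timestamps. The function names are invented for this example, and it assumes timestamps in the `04/Apr/2001:03:41:53 -0500` form seen earlier in the thread and an English locale for the month names.

```python
from datetime import datetime


def parse_ts(text):
    """Parse an Apache log timestamp such as '04/Apr/2001:03:41:53 -0500'."""
    return datetime.strptime(text, "%d/%b/%Y:%H:%M:%S %z")


def requests_per_minute(timestamps):
    """Estimate the average request rate from a list of timestamp strings.

    Counts the intervals between the first and last request, so at least
    two requests spread over a non-zero time span are needed; otherwise
    None is returned.
    """
    times = sorted(parse_ts(t) for t in timestamps)
    if len(times) < 2:
        return None
    span = (times[-1] - times[0]).total_seconds()
    if span == 0:
        return None
    return (len(times) - 1) * 60.0 / span


if __name__ == "__main__":
    # Made-up timestamps three seconds apart, for illustration only.
    demo = [
        "04/Apr/2001:03:41:53 -0500",
        "04/Apr/2001:03:41:56 -0500",
        "04/Apr/2001:03:41:59 -0500",
    ]
    print(requests_per_minute(demo))
```

One request every 3 seconds works out to 20 per minute, which is the upper end of the "one every 2-3 seconds" estimate.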