
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

Indy Library Mozilla/3.0 (compatible; Indy Library)

 3:53 am on Apr 6, 2001 (gmt 0)

I think this may be from China. It almost always does a HEAD then a GET, and does a very deep crawl. It does not grab robots.txt.

- - [04/Apr/2001:03:41:53 -0500] "HEAD /./doh.htm HTTP/1.0" 200 186 "-" "Mozilla/3.0 (compatible; Indy Library)"
- - [04/Apr/2001:03:41:56 -0500] "GET /./doh.htm HTTP/1.0" 200 3634 "-" "Mozilla/3.0 (compatible; Indy Library)"
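For anyone who wants to pull this pattern out of their own logs, here is a minimal sketch that parses Apache combined-format lines and lists the paths where the same client sent a HEAD immediately followed by a GET. Assumptions: the log is in combined format, the host column is present (it is missing from the lines quoted above, so the sample uses the documentation address 192.0.2.1 as a placeholder), and the helper names `parse_line` and `head_then_get` are made up for illustration.

```python
import re

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    return m.groupdict() if m else None


def head_then_get(entries, agent_substring="Indy Library"):
    """Return paths where a matching agent sent HEAD and then GET for the same URL."""
    hits = []
    prev = None
    for e in entries:
        if e is None or agent_substring not in e["agent"]:
            continue
        if (
            prev is not None
            and prev["method"] == "HEAD"
            and e["method"] == "GET"
            and prev["path"] == e["path"]
            and prev["host"] == e["host"]
        ):
            hits.append(e["path"])
        prev = e
    return hits


# Sample lines based on the post above; 192.0.2.1 is a placeholder host.
SAMPLE = [
    '192.0.2.1 - - [04/Apr/2001:03:41:53 -0500] "HEAD /./doh.htm HTTP/1.0" '
    '200 186 "-" "Mozilla/3.0 (compatible; Indy Library)"',
    '192.0.2.1 - - [04/Apr/2001:03:41:56 -0500] "GET /./doh.htm HTTP/1.0" '
    '200 3634 "-" "Mozilla/3.0 (compatible; Indy Library)"',
]

if __name__ == "__main__":
    print(head_then_get([parse_line(line) for line in SAMPLE]))
```

Matching on the user-agent string alone is only a rough filter, since any client can send whatever agent string it likes.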



 4:56 am on Apr 6, 2001 (gmt 0)

It hit my servers for several thousand pages. It looks to me like it is crawling Excite listings.
This is them -> Capital Network, 8th/F China Resources Building, No.8 Jianguomenbei Avenue, Beijing, China [uk.gsmbox.com]


 7:51 pm on Apr 6, 2001 (gmt 0)

Am I the only one who finds it interesting that the government of China is spidering web sites? Okay, so it isn't the Gov. of China outright, but it is a company partially owned by them.


 9:06 pm on Apr 6, 2001 (gmt 0)

They must be looking for that unforthcoming "apology".


 3:56 am on Apr 7, 2001 (gmt 0)

Thanks for the info, littleman. They hit our site again and grabbed just about everything. I also wonder what they might be looking for.


 11:04 pm on Apr 10, 2001 (gmt 0)

Now others... which is a Gateway IP???

The original bot from China is now using two IPs.,165
And there are these. 166 is open on port 80, and running a win32 machine with apache.
inetnum -
descr CHINANET Beijing province network
descr Data Communication Division
descr China Telecom
country CN


 11:49 pm on Apr 10, 2001 (gmt 0)

Ah, but that's like saying the British Government is spidering you if BT decide to start indexing pages :)
It is a bit strange though. Why would they bother?

jeremy goodrich

 1:53 pm on Apr 11, 2001 (gmt 0)

here's another one that hit me about a week back:

I've been wondering what's going on; they are very, very aggressive in getting at my stuff. It's not *that* interesting... oh well. Any takes on what to do about this one? I already made my decision on these, but I'm curious what others are going to do. I figured let 'em have it, they'll probably laugh anyway.


 5:03 pm on Apr 11, 2001 (gmt 0)

They hit me too. About 20 requests per minute (roughly one every 3 seconds). They also got robots.txt, which they seemed to honor (no deep crawl).

Atm I'll let them crawl. They are, afaik, just another spider. And I might eventually show up in a Chinese portal/search engine.
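A rough way to check both observations (request rate and whether robots.txt was fetched) from your own logs is sketched below. Assumptions: the timestamps are in the usual Apache `[dd/Mon/yyyy:HH:MM:SS zone]` form, the sample timestamps and paths are made up for illustration and are not from this thread, and the function names are hypothetical.

```python
from datetime import datetime


def parse_time(stamp):
    """Parse an Apache-style timestamp such as '11/Apr/2001:17:03:00 +0000'."""
    return datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def requests_per_minute(stamps):
    """Average request rate over the span between the first and last hit.

    Returns None when there are fewer than two hits or no time elapsed.
    """
    times = sorted(parse_time(s) for s in stamps)
    if len(times) < 2:
        return None
    span = (times[-1] - times[0]).total_seconds()
    if span == 0:
        return None
    # Count intervals, not hits, so evenly spaced requests give an exact rate.
    return (len(times) - 1) * 60.0 / span


def fetched_robots(paths):
    """True if any requested path (ignoring a query string) is /robots.txt."""
    return any(p.split("?", 1)[0] == "/robots.txt" for p in paths)


# Made-up sample: one request every 3 seconds.
STAMPS = [
    "11/Apr/2001:17:03:00 +0000",
    "11/Apr/2001:17:03:03 +0000",
    "11/Apr/2001:17:03:06 +0000",
    "11/Apr/2001:17:03:09 +0000",
]
PATHS = ["/robots.txt", "/", "/index.htm", "/about.htm"]

if __name__ == "__main__":
    print(requests_per_minute(STAMPS), fetched_robots(PATHS))
```

Fetching robots.txt does not prove a spider obeys it; you would still have to compare the paths it requested against your Disallow rules.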


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved