Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
Indy Library
211.101.236.29 Mozilla/3.0 (compatible; Indy Library)
Son_House




msg:398959
 3:53 am on Apr 6, 2001 (gmt 0)

I think this may be from China. It almost always does a HEAD and then a GET, and it does a very deep crawl. It does not fetch robots.txt.

211.101.236.29 - - [04/Apr/2001:03:41:53 -0500] "HEAD /./doh.htm HTTP/1.0" 200 186 "-" "Mozilla/3.0 (compatible; Indy Library)"
211.101.236.29 - - [04/Apr/2001:03:41:56 -0500] "GET /./doh.htm HTTP/1.0" 200 3634 "-" "Mozilla/3.0 (compatible; Indy Library)"
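For anyone who wants to spot this HEAD-then-GET pattern in their own logs, here is a rough sketch. It assumes Apache combined log format, as in the two lines above; the names `parse_line` and `head_then_get` are made up for illustration, and matching on the "Indy Library" substring is just one possible filter.

```python
import re
from datetime import datetime

# Apache combined log format, as in the sample lines quoted in this post.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.strptime(rec["ts"], "%d/%b/%Y:%H:%M:%S %z")
    return rec

def head_then_get(lines, agent_substring="Indy Library"):
    """Return (ip, path, seconds) for each HEAD later followed by a GET
    of the same path from the same IP, for matching user agents only."""
    hits = []
    last_head = {}  # (ip, path) -> time of the most recent HEAD
    for line in lines:
        rec = parse_line(line)
        if rec is None or agent_substring not in rec["agent"]:
            continue
        key = (rec["ip"], rec["path"])
        if rec["method"] == "HEAD":
            last_head[key] = rec["time"]
        elif rec["method"] == "GET" and key in last_head:
            delta = (rec["time"] - last_head.pop(key)).total_seconds()
            hits.append((rec["ip"], rec["path"], delta))
    return hits

SAMPLE = [
    '211.101.236.29 - - [04/Apr/2001:03:41:53 -0500] "HEAD /./doh.htm HTTP/1.0" '
    '200 186 "-" "Mozilla/3.0 (compatible; Indy Library)"',
    '211.101.236.29 - - [04/Apr/2001:03:41:56 -0500] "GET /./doh.htm HTTP/1.0" '
    '200 3634 "-" "Mozilla/3.0 (compatible; Indy Library)"',
]

print(head_then_get(SAMPLE))  # [('211.101.236.29', '/./doh.htm', 3.0)]
```

On a real log you would pass an open file object instead of `SAMPLE`; lines that do not match the format are simply skipped.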

 

littleman




msg:398960
 4:56 am on Apr 6, 2001 (gmt 0)

It hit my servers for several thousand pages. It looks to me like it is crawling excite listings.
This is them -> Capital Network, 8th/F China Resources Building, No.8 Jianguomenbei Avenue, Beijing, China [uk.gsmbox.com]

littleman




msg:398961
 7:51 pm on Apr 6, 2001 (gmt 0)

Am I the only one who finds it interesting that the government of China is spidering web sites? Okay, so it isn't the Gov. of China outright, but it is a company partially owned by them.

volatilegx




msg:398962
 9:06 pm on Apr 6, 2001 (gmt 0)

they must be looking for that unforthcoming "apology"

Son_House




msg:398963
 3:56 am on Apr 7, 2001 (gmt 0)

Thanks for the info, littleman. They hit our site again and grabbed just about everything. I also wonder what they might be looking for.

littleman




msg:398964
 11:04 pm on Apr 10, 2001 (gmt 0)

Now others...

63.251.176.140 which is a Gateway IP???

The original bot from China is now using two IPs:
211.101.236.28-29

And there are these:
202.108.221.165-166
166 is open on port 80, and it is a Win32 machine running Apache.
inetnum 202.108.0.0 - 202.108.255.255
netname CHINANET-BJ
descr CHINANET Beijing province network
descr Data Communication Division
descr China Telecom
country CN
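If you want to check which of the addresses in this thread fall inside that whois block, something like the following should do it. It is only a sketch using Python's standard `ipaddress` module; the helper name `in_range` is made up, and the list of IPs is just the ones mentioned in this thread.

```python
import ipaddress

# The inetnum range from the whois record above.
first = ipaddress.ip_address("202.108.0.0")
last = ipaddress.ip_address("202.108.255.255")

# Convert the start/end pair into CIDR networks (here a single /16).
networks = list(ipaddress.summarize_address_range(first, last))

def in_range(ip):
    """True if ip falls inside the whois range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)

for ip in ["202.108.221.165", "202.108.221.166",
           "211.101.236.29", "63.251.176.140"]:
    print(ip, in_range(ip))
```

Only the two 202.108.221.x addresses belong to the CHINANET-BJ block; the 211.101.236.x and 63.251.176.140 addresses would need their own lookups.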

theperlyking




msg:398965
 11:49 pm on Apr 10, 2001 (gmt 0)

Ah, but that's like saying the British Government is spidering you if BT decide to start indexing pages :)
It is a bit strange though, why would they bother?

jeremy goodrich




msg:398966
 1:53 pm on Apr 11, 2001 (gmt 0)

Here's another one that hit me about a week back:

202.108.221.165

I've been wondering what's going on; they are very, very aggressive in getting at my stuff. It's not *that* interesting... oh well. Any takes on what to do about this one? I already made my decision on these, but I'm curious what others are going to do. I figured let 'em have it, they'll probably laugh anyway.

skirril




msg:398967
 5:03 pm on Apr 11, 2001 (gmt 0)

They hit me too. About 20 requests per minute (one every 2-3 seconds). They also got robots.txt, which they seemed to honor (no deep crawl).

For the moment I'll let them crawl. They are, AFAIK, just another spider, and I might eventually show up in a Chinese portal/search engine.
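To see what "honoring robots.txt" would mean for a given path, you can run your own rules through Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt content below is a made-up example (not anyone's real file), and `allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

def allowed(path, agent="Mozilla/3.0 (compatible; Indy Library)"):
    """True if a well-behaved crawler with this user agent may fetch path."""
    return rp.can_fetch(agent, path)

print(allowed("/doh.htm"))
print(allowed("/private/stats.htm"))
```

A spider that respects the file would skip anything where `allowed` returns False, so comparing that against what actually shows up in the log tells you whether the rules are being followed.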


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved