Search Engine Spider and User Agent Identification Forum

    
Inktomi Spiders with UserAgent Mozilla - Pseudo or Real?
fantomaster

Msg#: 47 posted 7:39 pm on Apr 20, 2000 (gmt 0)

sneak preview:
--------------
from the forthcoming issue (vol.1/iss.004) of
fantomNews
======================================================
Inktomi Spiders with UserAgent Mozilla -
Pseudo or Real?
------------------------------------------------------
(bro) In recent weeks, rumors abounded in the search
engine optimization industry concerning Inktomi
spiders' newly detected usage of ordinary web browser
UserAgents in the course of crawling submitted sites.

However, the authenticity of these spiders has been
disputed by some search engine watchers. We have
checked out the matter in some depth; read on to
learn what conclusions we have come to.

Here's some pertinent data unearthed and evaluated by
our fantomas spiderScouts(TM) department.

Client site log entries
-----------------------
(data abridged; entry dates and file names modified to
protect client's privacy and cloaking setup)
----------------------------------------------------

209.185.141.185 - - [18/Apr/2000:03:45:49 -0700] "GET
/file1.htm HTTP/1.0" 200 416 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"
209.185.141.185 - - [18/Apr/2000:03:15:49 -0700] "GET
/file2.htm HTTP/1.0" 200 395 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"
209.185.141.185 - - [18/Apr/2000:03:15:49 -0700] "GET
/file3.htm HTTP/1.0" 200 437 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"
209.185.141.185 - - [18/Apr/2000:03:15:49 -0700] "GET
/file4.htm HTTP/1.0" 200 479 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"
209.185.141.185 - - [18/Apr/2000:03:15:49 -0700] "GET
/file5.htm HTTP/1.0" 200 499 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"

209.185.141.185 - - [17/Apr/2000:04:18:02 -0700] "GET
/file6.htm HTTP/1.0" 200 464 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"
209.185.141.185 - - [17/Apr/2000:04:17:58 -0700] "GET
/file7.htm HTTP/1.0" 200 388 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"
209.185.141.185 - - [17/Apr/2000:04:17:58 -0700] "GET
/file8.htm HTTP/1.0" 200 445 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"
209.185.141.185 - - [17/Apr/2000:04:17:58 -0700] "GET
/file9.htm HTTP/1.0" 200 359 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"
209.185.141.185 - - [17/Apr/2000:04:17:58 -0700] "GET
/file10.htm HTTP/1.0" 200 436 "-" "Mozilla/4.72 [en]
(X11; U; NetBSD 1.4.2 i386; Nav)"

These pages had all been submitted to Inktomi licensee
Anzwers. Only these pages were called - a behavior
quite typical for a spider. A single exception aside,
all calls in each batch took place at exactly the same
time (measured to the second).

The pages are all listed with Anzwers, dated April 17
and April 18 respectively.
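The "same second" observation above can be checked mechanically. What follows is only a minimal sketch, not the tooling used for this article: it assumes the log is in Apache's combined format with each entry on a single line (the entries above are wrapped for display), and the names LOG_RE, parse_line and burst_counts are made up for illustration. The three sample lines are taken from the (modified) April 17 entries shown above.

```python
import re
from collections import Counter

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None

def burst_counts(lines):
    """Count requests per (host, user agent, timestamp to the second)."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec:
            counts[(rec["host"], rec["agent"], rec["time"])] += 1
    return counts

UA = "Mozilla/4.72 [en] (X11; U; NetBSD 1.4.2 i386; Nav)"
SAMPLE = [
    f'209.185.141.185 - - [17/Apr/2000:04:18:02 -0700] "GET /file6.htm HTTP/1.0" 200 464 "-" "{UA}"',
    f'209.185.141.185 - - [17/Apr/2000:04:17:58 -0700] "GET /file7.htm HTTP/1.0" 200 388 "-" "{UA}"',
    f'209.185.141.185 - - [17/Apr/2000:04:17:58 -0700] "GET /file8.htm HTTP/1.0" 200 445 "-" "{UA}"',
]

if __name__ == "__main__":
    # Several hits from one host and user agent within one second suggest a robot.
    for (host, agent, when), hits in burst_counts(SAMPLE).most_common():
        print(hits, host, when, agent)
```

A human visitor would rarely produce more than one page request per second, so any key with a count above 1 is worth a closer look.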

The spider was also detected when visiting our main
site at < [fantomaster.com...] > as can be seen from
the following logs:

2000-04-14, 00:18:12 -- j6000.inktomi.com --
209.185.141.185 -- Mozilla/4.72 [en] (X11; U; NetBSD
1.4.2 i386; Nav) -- -- file_a.html
2000-04-16, 00:23:01 -- j6000.inktomi.com --
209.185.141.185 -- Mozilla/4.72 [en] (X11; U; NetBSD
1.4.2 i386; Nav) -- -- file_b.html
2000-04-16, 00:23:01 -- j6000.inktomi.com --
209.185.141.185 -- Mozilla/4.72 [en] (X11; U; NetBSD
1.4.2 i386; Nav) -- -- file_c.html
2000-04-18, 00:44:27 -- j6000.inktomi.com --
209.185.141.185 -- Mozilla/4.72 [en] (X11; U; NetBSD
1.4.2 i386; Nav) -- -- file_c.html
2000-04-18, 00:44:27 -- j6000.inktomi.com --
209.185.141.185 -- Mozilla/4.72 [en] (X11; U; NetBSD
1.4.2 i386; Nav) -- -- file_d.html
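
A hostname in a log is only as trustworthy as the reverse DNS record behind it. One common cross-check (not necessarily the one used for this article) is a forward-confirmed reverse lookup: resolve the IP to a name, then resolve that name back and see whether the original IP is among the results. The sketch below is an assumption-laden illustration; the function name is_forward_confirmed and its parameters are made up here, and the two resolver arguments exist only so the logic can be exercised without a live DNS query.

```python
import socket

def is_forward_confirmed(ip, domain_suffix,
                         reverse=socket.gethostbyaddr,
                         forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a host under domain_suffix
    and that host resolves forward to the same ip again."""
    try:
        host = reverse(ip)[0]          # (hostname, aliases, addresses)
    except OSError:                    # no PTR record or lookup failure
        return False
    if not host.rstrip(".").lower().endswith(domain_suffix.lower()):
        return False
    try:
        addresses = forward(host)[2]   # (hostname, aliases, addresses)
    except OSError:
        return False
    return ip in addresses

if __name__ == "__main__":
    # Live lookup; the answer depends on whatever DNS returns today.
    print(is_forward_confirmed("209.185.141.185", ".inktomi.com"))
```

Passing the suffix with its leading dot (".inktomi.com") keeps look-alike domains that merely end in "inktomi.com" from passing the check.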

Log entries at < [fantomaster.com...] > for the
newly detected Inktomi IP range:

si520.inktomi.com - - [17/Apr/2000:23:08:34 -0700]
"GET /robots.txt HTTP/1.0" 200 410
si520.inktomi.com - - [17/Apr/2000:23:08:36 -0700]
"GET / HTTP/1.0" 200 17516

New entry in our fantomas spiderSpy(TM) botBase:
------------------------------------------------
#UA Slurp/si (slurp@inktomi.com; [inktomi.com...]
si520.inktomi.com
209.67.206.133

Checking RWhois shows this IP range as belonging to:

auth-area 209.67.0.0/16
class-name network
network-name 209.67.206.0
ip-network 209.67.206.0/24
organization Ken Lutz
address-1 1900 Norfolk St suite 310
address-2 San Mateo, CA 94003
created 69-DEC-31
updated-by dave@exodus.net

Here are two more examples of IPs which have been
featured in our botBase for some time:

The IP 209.185.141.185 (j6000.inktomi.com)
belongs to:
auth-area 209.185.0.0/16
class-name network
network-name 209.185.136.0
ip-network 209.185.136.0/21
organization Eric Hollander
address-1 1900 South Norfolk Street #310
address-2 San Mateo, CA 94403
created 69-DEC-31
updated-by dave@exodus.net

The IP 209.1.12.1 (wyatt-sc-vlan-12.inktomi.com)
belongs to:
auth-area 209.1.0.0/16
class-name network
network-name 209.1.12.0
ip-network 209.1.12.0/24
organization Inktomi Corporation
address-1 2168 SHattuck Ave. Suite 210
address-2 Berkeley, CA 94704
created 97-NOV-21
updated-by dave@exodus.net

Conclusion
----------
Inktomi's IP ranges were registered by different
entities.
However, the postal addresses (street, city) recorded
for the new range (209.67.206.0/24) and for the
established range 209.185.136.0/21 match: both give
1900 (South) Norfolk Street, suite 310, San Mateo.
It therefore stands to reason that the new Inktomi IP
(209.67.206.133) is part and parcel of a genuine
Inktomi IP range.
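
Once ranges like these are on file, matching a visiting IP against them is simple. Here is a minimal sketch using Python's standard ipaddress module; the table KNOWN_RANGES and the function find_range are hypothetical names for illustration (this is not how spiderSpy is implemented), and the ranges and registrant names are copied from the RWhois records quoted above.

```python
import ipaddress

# CIDR ranges and registrants as quoted in the RWhois output above.
KNOWN_RANGES = {
    "209.67.206.0/24": "Ken Lutz, San Mateo",
    "209.185.136.0/21": "Eric Hollander, San Mateo",
    "209.1.12.0/24": "Inktomi Corporation, Berkeley",
}

def find_range(ip, ranges=KNOWN_RANGES):
    """Return (cidr, registrant) for the first range containing ip, else None."""
    addr = ipaddress.ip_address(ip)
    for cidr, registrant in ranges.items():
        if addr in ipaddress.ip_network(cidr):
            return cidr, registrant
    return None

if __name__ == "__main__":
    for ip in ("209.67.206.133", "209.185.141.185", "209.1.12.1"):
        print(ip, "->", find_range(ip))
```

Note that 209.185.136.0/21 spans 209.185.136.0 through 209.185.143.255, which is why 209.185.141.185 falls inside it.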

Source:
fantomNews, vol. 1/issue 004 (forthcoming)
++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

VAL@Amsterdam

Msg#: 47 posted 2:19 am on May 3, 2000 (gmt 0)

Wow, this looks quite impressive. hmm...

How can one check up on the IP range owned by an SE?

BTW: to everyone reading this; I think "spiderspy" really is worth the price! I use it myself. It's great.
When will issue number three of fantomNews be released? Still waiting for Tutorial Part 3 of the Stealth, Cloaking, Phantom Technology: How it works - and how it doesn't. It has been a while now. I guess you haven't got the time anymore..

fantomaster

Msg#: 47 posted 12:28 am on May 5, 2000 (gmt 0)

Thanks for your kind words re spiderSpy!

Expect fantomNews #4 to be out sometime next week.
And it's not about "not having the time anymore" -
we *never* really had the time for it in the first place.
But seriously: we have never committed ourselves to a regular
publication schedule, and while the earlier issues were out
about once a month, we're reserving the right to publish whenever
it suits us best. We had some severe local hardware problems recently
(two systems went bust completely in only one week) and are still
coping with that (new systems are in place now, but you
may know how it is - it takes ages to get everything going
again the way it should), hence some delay.

VAL@Amsterdam

Msg#: 47 posted 11:09 pm on May 13, 2000 (gmt 0)

Yes, I know how it is. It happened to me 5 times in the last 6 months .... Ages....

Good luck


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved