
Search Engine Spider and User Agent Identification Forum

    
Who are these?
Hope




msg:398555
 12:47 pm on Oct 5, 2000 (gmt 0)

I have been seeing a lot of activity from these guys. Does anyone know who they are?

Dllbot
Excalibur Internet Spider V6.5.4
GenCrawler
GentleSpider
Kenjin Spider
oBot ((compatible;Win32))
xyro_(xcrawler@cosmos.inria.fr)

 

littleman




msg:398556
 10:26 pm on Oct 5, 2000 (gmt 0)

Hi Hope, I like your nick.
We had a similar thread here [webmasterworld.com] - some of the bots on your list are in there.
DIIbot -> [digital-integrity.com]
"Digital Integrity's patent-pending technology is the solution for
discovering and tracking all types of digital content. The power of
this technology is its ability to detect and track digital "content
segments" -- such as a Microsoft Word document or document
phrase, a PowerPoint slide, a JPEG file, a line of code, HTML, or
even MP3. But unlike a search engine, Digital Integrity's
technology can discover content segments of any length, in any
format."

I don't know what GenCrawler, GentleSpider, Kenjin Spider and oBot ((compatible;Win32)) are - they are probably personal applications. I seem to recall a Perl script bot that was called 'GentleSpider'. If you have the spiders' IPs we might be able to do some more tracking.
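
For anyone who would rather not read raw logs by eye, here is a minimal sketch of pulling the IPs for each unknown spider out of an access log. It assumes the server writes the Apache/NCSA "combined" format; the function name, the sample lines, and the sample IPs (documentation addresses) are hypothetical and only there to make the example runnable.

```python
import re
from collections import defaultdict

# Assumption: Apache/NCSA "combined" log format. Adjust the pattern for other layouts.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def ips_by_agent(lines, agents_of_interest):
    """Map each watched user-agent substring to the set of IPs that sent it."""
    found = defaultdict(set)
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent")
        for name in agents_of_interest:
            if name.lower() in agent.lower():
                found[name].add(match.group("ip"))
    return dict(found)

# Hypothetical sample lines; the IPs are documentation addresses, not real bot IPs.
sample = [
    '192.0.2.10 - - [05/Oct/2000:12:47:00 +0000] "GET /index.html HTTP/1.0" 200 1043 "-" "GentleSpider"',
    '192.0.2.11 - - [05/Oct/2000:12:48:10 +0000] "GET /about.html HTTP/1.0" 200 512 "-" "GenCrawler"',
    '192.0.2.10 - - [05/Oct/2000:12:49:30 +0000] "GET /links.html HTTP/1.0" 200 2210 "-" "GentleSpider"',
]

result = ips_by_agent(sample, ["GentleSpider", "GenCrawler", "Kenjin Spider"])
print(result)
```

In practice the `sample` list would be replaced by the lines of the real log file, and the resulting IPs could then be looked up with whois or reverse DNS.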

Hope




msg:398557
 11:03 pm on Oct 5, 2000 (gmt 0)

littleman,

I was really hoping you wouldn't tell me to find the IP. I really hate looking at raw logs. I have heard they can make you go blind. ;) I will take a look and do a search on the IP.


littleman




msg:398558
 11:25 pm on Oct 5, 2000 (gmt 0)

Odds are good oBot ((compatible;Win32)) is some type of desktop app. The thing about User_Agent is that anyone with a bit of skill could hit your site using any UA he/she wants - so the name itself doesn't mean much.
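
To show how little the name proves, here is a small sketch using Python's standard `urllib`. It builds a request that announces itself with one of the UA strings from the list above. The helper name `build_request` and the target `example.com` are placeholders, and the request is only built, not sent.

```python
import urllib.request

def build_request(url, user_agent):
    """Build an HTTP request that announces itself with an arbitrary User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

# example.com is a placeholder target; the UA string is copied from the list above.
req = build_request("http://example.com/", "Excalibur Internet Spider V6.5.4")

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))

# urllib.request.urlopen(req) would send the request with that header;
# it is left out here so the sketch runs without network access.
```

Since the header is just a string the client chooses, the IP address and the request pattern are usually better evidence than the UA.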

Hope




msg:398559
 11:49 pm on Oct 5, 2000 (gmt 0)

You really know how to make my evening, littleman. I was hoping they were all search engine spiders. It makes me nervous to think someone is spidering our entire site that many times. I wish I knew why. Oh well, guess I should go take a look at the logs to see what they were after and if they took a look at robots.txt.

Air




msg:398560
 2:56 am on Oct 6, 2000 (gmt 0)

That's a good idea, Hope. I'd be interested in what you find.

Hope




msg:398561
 1:10 pm on Oct 9, 2000 (gmt 0)

oBot is very polite. First thing it got was the robots.txt.

The IP for oBot is 195.127.173.165.

The other questionable bots listed above did not look for robots.txt.
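
The same robots.txt check can be scripted. The sketch below assumes "combined" format logs and reports, per user agent, whether it ever requested /robots.txt. The function name and the sample lines (including their IPs, timestamps, and paths) are hypothetical and do not come from the actual logs.

```python
import re

# Assumption: "combined" log format; only the request and user-agent fields are used.
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def robots_report(lines):
    """Return {agent: bool} saying whether each agent ever requested /robots.txt."""
    report = {}
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        agent = m.group("agent")
        asked = m.group("path").split("?")[0] == "/robots.txt"
        report[agent] = report.get(agent, False) or asked
    return report

# Hypothetical sample lines, made up for illustration only.
sample = [
    '192.0.2.20 - - [09/Oct/2000:13:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 120 "-" "oBot ((compatible;Win32))"',
    '192.0.2.20 - - [09/Oct/2000:13:00:05 +0000] "GET /index.html HTTP/1.0" 200 1043 "-" "oBot ((compatible;Win32))"',
    '192.0.2.21 - - [09/Oct/2000:13:05:00 +0000] "GET /index.html HTTP/1.0" 200 1043 "-" "GenCrawler"',
]

report = robots_report(sample)
for agent, polite in report.items():
    print(agent, "requested robots.txt" if polite else "never requested robots.txt")
```

A bot that never asks for robots.txt is not necessarily hostile, but it is a useful first filter when deciding which agents to look at more closely.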

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved