Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
207.230.106.188 DIIbot/1.1, www.findsame.com, robot@digital-
littleman

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 99 posted 8:56 pm on Aug 3, 2000 (gmt 0)

These guys have been snooping for a while. It looks like they are now using their snooping bot to build a search engine as well. It is an interesting concept. It looks like they are raiding Inktomi for URLs.
www.findsame.com
digital-integrity.com

 

Brett_Tabke

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 99 posted 9:13 pm on Aug 3, 2000 (gmt 0)

Is it crawler behavior, though? Or just single-page pulls?

littleman

WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member



 
Msg#: 99 posted 9:22 pm on Aug 3, 2000 (gmt 0)

Yeah, it looks that way. It seems to be *slowly* following links. It has also been pulling robots.txt for every root request.
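
For anyone who wants to check the same thing in their own logs, here is a rough Python sketch of one way to tell crawling from single-page pulls: count how often an agent fetches robots.txt against how many distinct pages it requests. It assumes Combined Log Format. The function name `summarize_agent` and the sample log lines are made up for illustration; they are not real DIIbot traffic.

```python
import re
from collections import Counter

# Combined Log Format:
# ip ident user [time] "METHOD path proto" status size "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def summarize_agent(lines, agent_substring):
    """Count robots.txt fetches and distinct page pulls for one user agent."""
    paths = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and agent_substring.lower() in m.group("agent").lower():
            paths[m.group("path")] += 1
    return {
        "total_requests": sum(paths.values()),
        "robots_txt_fetches": paths.get("/robots.txt", 0),
        "distinct_pages": len([p for p in paths if p != "/robots.txt"]),
    }


# Hypothetical sample lines, invented for this example only.
SAMPLE = [
    '207.230.106.188 - - [03/Aug/2000:20:01:10 +0000] "GET /robots.txt HTTP/1.0" 200 120 "-" "DIIbot/1.1"',
    '207.230.106.188 - - [03/Aug/2000:20:01:12 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "DIIbot/1.1"',
    '207.230.106.188 - - [03/Aug/2000:20:09:40 +0000] "GET /robots.txt HTTP/1.0" 200 120 "-" "DIIbot/1.1"',
    '207.230.106.188 - - [03/Aug/2000:20:09:43 +0000] "GET /about.html HTTP/1.0" 200 2048 "-" "DIIbot/1.1"',
    '192.0.2.10 - - [03/Aug/2000:20:10:00 +0000] "GET /index.html HTTP/1.0" 200 5120 "-" "Mozilla/4.0"',
]

if __name__ == "__main__":
    print(summarize_agent(SAMPLE, "DIIbot"))
```

Several distinct pages spread over time, each preceded by a robots.txt fetch, would match the slow link-following described above; a single path with no follow-up requests would look more like a one-off page pull.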


redzone

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 99 posted 2:53 am on Aug 4, 2000 (gmt 0)

Brett/Littleman,
I've noticed increased activity from them too. Have either of you seen stepped-up crawling from Matahari recently? They used to just hit and miss us, but over the last few days they have been hitting huge numbers of URLs.

PeteU

10+ Year Member



 
Msg#: 99 posted 5:26 am on Aug 4, 2000 (gmt 0)

Bandwidth wasters like these outfits earn a
deny from ip_range
entry in my access.conf files.
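
Before committing a range to a deny list, it can help to check which addresses it would actually catch. The Python sketch below does that with the standard `ipaddress` module. The /24 range is an assumption: the thread only gives the single address 207.230.106.188, and the names `DENIED_RANGES` and `is_denied` are hypothetical.

```python
import ipaddress

# Assumed range: only 207.230.106.188 appears in the thread title;
# the surrounding /24 is a guess made for illustration.
DENIED_RANGES = [ipaddress.ip_network("207.230.106.0/24")]


def is_denied(ip, denied=DENIED_RANGES):
    """Return True if the address falls inside any denied network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in denied)


if __name__ == "__main__":
    for candidate in ("207.230.106.188", "192.0.2.10"):
        print(candidate, is_denied(candidate))
```

Running the check over the client addresses in a recent log shows how much traffic a given range would block, which is worth knowing before a broad range goes into the server config.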

Brett_Tabke

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member



 
Msg#: 99 posted 12:17 pm on Aug 4, 2000 (gmt 0)

I had assumed the findsame hits were just random stuff.
I can only find a few hits from Digital Integrity.

Pete, access.conf, nice work if you can get it - the rest of us are stuck with .htaccess banning (slow).


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved