
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
I have not seen this one before.
miles

10+ Year Member



 
Msg#: 557 posted 4:41 pm on Apr 13, 2001 (gmt 0)

Does anyone know who this spider belongs to? Its IP is 209.247.40.106.

Thanks.

 

theperlyking

10+ Year Member



 
Msg#: 557 posted 4:43 pm on Apr 13, 2001 (gmt 0)

It shows as an Alexa IP; see [alexa.com...]

I've had this crawl me a few times, but I rarely get any visitors referred from it.
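
One common way to check who owns a crawler IP like the one above is a reverse DNS (PTR) lookup. The following is a minimal sketch using Python's standard `socket` module; the helper name `reverse_lookup` and its `resolver` parameter are illustrative assumptions, not something from the thread, and a PTR record alone does not prove ownership.

```python
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for an IP address, or None if the lookup fails.

    `resolver` is a hypothetical injection point so the function can be
    exercised without network access; by default it uses the system resolver.
    """
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


# Example call (needs network access; the result depends on current DNS data):
# print(reverse_lookup("209.247.40.106"))
```

If the lookup returns nothing, a WHOIS query on the address block is the usual fallback.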

miles

10+ Year Member



 
Msg#: 557 posted 4:50 pm on Apr 13, 2001 (gmt 0)

Thanks for the info.

BoneHeadicus

10+ Year Member



 
Msg#: 557 posted 4:54 pm on Apr 13, 2001 (gmt 0)

Standard issue robots.txt:

User-agent: ia_archiver
Disallow: /

Alexa and I broke up a while back cuz she doesn't produce. She wanted to keep tabs on me but there's No Such Agreement on my part.
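
To sanity-check a rule like the one above before deploying it, Python's standard `urllib.robotparser` can evaluate it locally. This is only a sketch: the helper `is_allowed` and the test path `/index.html` are made up for illustration, and it shows how a compliant parser reads the rule, not how any particular crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rule quoted in the post above.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# ia_archiver is blocked everywhere; agents with no matching rule are allowed.
print(is_allowed(ROBOTS_TXT, "ia_archiver", "/index.html"))
print(is_allowed(ROBOTS_TXT, "Googlebot", "/index.html"))
```

Note that robots.txt is advisory: it only keeps out crawlers that choose to honor it.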

msgraph

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 557 posted 5:08 pm on Apr 13, 2001 (gmt 0)

How do they find our sites? Do they pull them off a specific search engine? Or do they just follow links from wherever they find them into an endless abyss?

I'm thinking about banning them too, BH, but first I want to know whether they are selling their database to any of the major players. Failing that, I'd at least like to limit how many files they can grab. They are eating up way too much bandwidth.
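
Before deciding on a ban or a limit, it can help to measure how much a crawler is really pulling. Below is a rough sketch that totals bytes served per client IP from an access log. It assumes the server writes Common Log Format; the function name `bytes_by_client` and the sample log lines are invented for illustration and are not real traffic.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status bytes
CLF = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} (\d+|-)')

# Hypothetical sample lines, made up for this example.
SAMPLE_LOG = [
    '209.247.40.106 - - [13/Apr/2001:16:41:00 +0000] "GET /a.html HTTP/1.0" 200 5120',
    '209.247.40.106 - - [13/Apr/2001:16:41:05 +0000] "GET /b.html HTTP/1.0" 200 2048',
    '192.0.2.7 - - [13/Apr/2001:16:42:00 +0000] "GET / HTTP/1.0" 304 -',
]


def bytes_by_client(lines):
    """Sum the response sizes per client IP, skipping lines that do not parse."""
    totals = Counter()
    for line in lines:
        match = CLF.match(line)
        if not match:
            continue
        ip, size = match.groups()
        # A size of "-" means no body was sent.
        totals[ip] += 0 if size == "-" else int(size)
    return totals


# List clients from heaviest to lightest.
for ip, total in bytes_by_client(SAMPLE_LOG).most_common():
    print(ip, total)
```

Running this over a day's log shows whether one spider really dominates the transfer totals.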

theperlyking

10+ Year Member



 
Msg#: 557 posted 5:13 pm on Apr 13, 2001 (gmt 0)

Aren't Alexa the ones who "power" the What's Related feature in Netscape (and maybe IE)? If so, perhaps this is where they pick our sites up from.
Edit: yep, see [home.netscape.com...]


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved