
Home / Forums Index / Search Engines / Search Engine Spider and User Agent Identification
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
216.32.64.10?
mark roach




msg:403024
 2:25 pm on Sep 21, 2000 (gmt 0)

Any ideas where 216.32.64.10 is from?

It took about 400 pages from 200 domains yesterday.

Twenty hours before that, Architext took my robots.txt (and nothing else) on those domains. Coincidence?

 

mark roach




msg:403025
 3:42 pm on Sep 21, 2000 (gmt 0)

Just found it on another domain of mine. Some of the pages it is taking have question marks in the URLs.

Last month lycos.co.uk indexed some pages containing question marks in the URLs. The Direct Hit grabber came to this site for the first time ever yesterday and also took pages with question marks in the URLs. I think I need a robots.txt rule covering my cgi directory.
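For anyone wanting to check such a rule before relying on it, here is a minimal sketch using Python's standard urllib.robotparser. The directory name /cgi-bin/, the host www.example.com, and the agent name AnyBot are assumptions for illustration, not details from the posts above. Note that robots.txt is read from the site root, so the rule names the directory rather than living inside it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served from the site root.
# "/cgi-bin/" is an assumed directory name; substitute your own.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def build_parser(text):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A script URL with a question mark, like the ones the spiders were taking.
cgi_ok = parser.can_fetch(
    "AnyBot", "http://www.example.com/cgi-bin/search.cgi?q=widgets"
)

# An ordinary page outside the excluded directory.
home_ok = parser.can_fetch("AnyBot", "http://www.example.com/index.html")

print("cgi script fetchable:", cgi_ok)
print("home page fetchable:", home_ok)
```

This only shows what a compliant spider should do with the rule. A spider that ignores robots.txt will still take the pages.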

PeteU




msg:403026
 8:09 pm on Sep 21, 2000 (gmt 0)

216.32.64.10 is [cyveillance.com...]
waste of bandwidth

NFFC




msg:403027
 8:23 pm on Sep 21, 2000 (gmt 0)

>We do honor robots.txt files by following rules designated for all user agents (the wildcard character '*').

Does this mean that they cannot be excluded, except by excluding all spiders?
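To make the concern concrete, here is a small sketch with Python's standard urllib.robotparser. It compares a spider that honors its own named group with one that, as the quoted statement suggests, reads only the '*' rules. The agent names ExampleBot and WildcardOnly and the host www.example.com are hypothetical. The wildcard-only spider is approximated by an agent name that matches no named group, which is an assumption about how such a spider behaves rather than anything confirmed in the quote.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named spider, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "http://www.example.com/page.html"

# A spider that looks up its own name obeys the "ExampleBot" group.
named_ok = parser.can_fetch("ExampleBot", url)

# A spider that only follows the '*' group never sees a named rule.
# Stand-in: an agent name that matches no named group (assumption).
wildcard_only_ok = parser.can_fetch("WildcardOnly", url)

print("named spider allowed:", named_ok)
print("wildcard-only spider allowed:", wildcard_only_ok)
```

If the quoted policy is taken at face value, a named Disallow would have no effect on such a spider, and the only robots.txt lever left is the '*' group, which applies to every other compliant spider as well.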

littleman




msg:403028
 8:23 pm on Sep 21, 2000 (gmt 0)

deja vu


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved