
Search Engine Spider and User Agent Identification Forum

    
Mr.Carlito
idiotgirl




msg:3691494
 7:51 am on Jul 6, 2008 (gmt 0)

Here's one I haven't seen before:
64.237.57.*** - - [05/Jul/2008:20:28:36 -0400] "GET / HTTP/1.1" 200 7643 "-" "Mozilla/5.0 (MrCarlito-0.1 http://www.mrcarlito.com/spider.html)"

Didn't check robots.txt. The reference page says:
MrCarlito-0.1 is an experimental spider that collects header & link information from web pages. The spider is written in PERL (Practical Extraction and Report Language), and uses the LWP::UserAgent Class. Currently this spider does not delve into websites, it simply obtains the headers & hostnames contained in your web page index.

IMHO - it would be more polite if Mr.Carlito bothered to check with robots.txt to see if he's welcome. I guess that's not Carlito's Way.

[edited by: incrediBILL at 8:12 pm (utc) on July 6, 2008]
[edit reason] fixed formatting and link [/edit]
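
For reference, here is a minimal sketch of the robots.txt check a polite spider would run before requesting a page. It is written in Python with the standard `urllib.robotparser` module, not the Perl/LWP code the spider's page describes. The robots.txt body and the `is_allowed` helper are hypothetical, and the sketch assumes the spider identifies itself to the parser by its product token (`MrCarlito-0.1`) rather than by the full `Mozilla/5.0 (...)` string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body. A real spider would fetch this from
# http://<host>/robots.txt before requesting any other URL on the host.
SAMPLE_ROBOTS_TXT = """\
User-agent: MrCarlito
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The product token is passed, not the whole User-Agent header value.
    print(is_allowed(SAMPLE_ROBOTS_TXT, "MrCarlito-0.1", "http://example.com/"))
    print(is_allowed(SAMPLE_ROBOTS_TXT, "SomeOtherBot", "http://example.com/"))
```

With this sample file the first call reports that the spider is not welcome, and the second shows an unrelated bot is still allowed on the index page.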

 

wilderness




msg:3691738
 8:43 pm on Jul 6, 2008 (gmt 0)

64.237.57.zzz - - [28/Oct/2007:17:36:18 -0500] "GET / HTTP/1.1" 301 313 "-" "Mozilla/5.0 (MrCarlito-0.1 [mrcarlito.com...]

I added the backbone's 32-63 Class C ranges to my deny list as a result.
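
A rough sketch of what that range covers, using Python's standard `ipaddress` module. It assumes "32-63 Class C" means third octets 32 through 63 under 64.237 (the prefix seen in the log lines), that is 64.237.32.0 through 64.237.63.255. The names `DENIED_NETWORKS` and `is_denied` are made up for illustration, and the final octet in the usage example is invented because the logs mask it.

```python
import ipaddress

# Assumption: the range is 64.237.32.0 - 64.237.63.255 (third octets 32..63).
FIRST = ipaddress.ip_address("64.237.32.0")
LAST = ipaddress.ip_address("64.237.63.255")

# summarize_address_range() yields the CIDR blocks that exactly cover the span.
DENIED_NETWORKS = list(ipaddress.summarize_address_range(FIRST, LAST))


def is_denied(ip: str) -> bool:
    """Return True if ip falls inside any denied network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DENIED_NETWORKS)


if __name__ == "__main__":
    print([str(net) for net in DENIED_NETWORKS])
    # Last octet is hypothetical; the posted log lines hide it.
    print(is_denied("64.237.57.1"))
    print(is_denied("64.237.64.1"))
```

Under that reading the 32 Class C blocks collapse to a single /19, so one deny entry covers the lot.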

Megaclinium




msg:3691912
 4:13 am on Jul 7, 2008 (gmt 0)

Is there a maximum number of addresses or ranges beyond which adding more to your IP deny list slows down the server?

wilderness




msg:3692165
 1:36 pm on Jul 7, 2008 (gmt 0)

I've not heard of anybody hitting limit walls when utilizing "simple" IP or UA denies.

My own file is some 1,700 lines, having been condensed multiple times.

What will slow requests down are processor-intensive rules.

Don
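
One possible way to condense an IP deny list, sketched in Python with the standard `ipaddress` module. This is only an illustration of merging adjacent or overlapping ranges. It is not necessarily how the 1,700-line file above was condensed, and the `condense` helper and the sample entries (documentation address space) are hypothetical.

```python
import ipaddress


def condense(entries):
    """Merge adjacent or overlapping networks into the fewest CIDR blocks."""
    # A bare address such as "198.51.100.17" is treated as a /32 network.
    networks = [ipaddress.ip_network(entry) for entry in entries]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


if __name__ == "__main__":
    # Hypothetical deny-list entries: two adjacent /25 blocks, plus a /24
    # that already contains the single address listed after it.
    raw = [
        "192.0.2.0/25",
        "192.0.2.128/25",
        "198.51.100.0/24",
        "198.51.100.17",
    ]
    print(condense(raw))
```

Fewer, wider entries keep the file short, which fits the point above that simple denies are cheap and the expensive part is the processor-intensive rules.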
