
Search Engine Spider and User Agent Identification Forum


 7:51 am on Jul 6, 2008 (gmt 0)

Here's one I haven't seen before:
64.237.57.*** - - [05/Jul/2008:20:28:36 -0400] "GET / HTTP/1.1" 200 7643 "-" "Mozilla/5.0 (MrCarlito-0.1 http://www.mrcarlito.com/spider.html)"

Didn't check robots.txt. The reference page says:
MrCarlito-0.1 is an experimental spider that collects header & link information from web pages. The spider is written in PERL (Practical Extraction and Report Language), and uses the LWP::UserAgent Class. Currently this spider does not delve into websites, it simply obtains the headers & hostnames contained in your web page index.

IMHO - it would be more polite if Mr.Carlito bothered to check with robots.txt to see if he's welcome. I guess that's not Carlito's Way.
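For anyone wondering what "checking with robots.txt" looks like on the spider's side, here is a minimal sketch. It is written in Python with the standard library's urllib.robotparser, not the Perl/LWP setup the reference page describes, and the robots.txt contents, the example.com URLs, and the is_allowed helper are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: MrCarlito
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; a live crawler would normally use
    # set_url() and read() to download robots.txt from the site instead.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite spider runs this check before requesting the page.
    print(is_allowed(ROBOTS_TXT, "MrCarlito-0.1", "http://example.com/"))
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://example.com/"))
```

A spider that skips this step never learns it has been disallowed, which is why a server-side deny ends up being the only option.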




 8:43 pm on Jul 6, 2008 (gmt 0)

64.237.57.zzz - - [28/Oct/2007:17:36:18 -0500] "GET / HTTP/1.1" 301 313 "-" "Mozilla/5.0 (MrCarlito-0.1 [mrcarlito.com...]

Added the backbone's 32-63 Class C ranges to my deny list as a result.
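A rough sketch of what that range covers, using Python's ipaddress module. The big assumption here is that "32-63 Class C" means 64.237.32.0 through 64.237.63.255, inferred from the first two octets in the log line; the last octet in the sample addresses is invented, since the logs mask it, and is_blocked is just an illustrative helper.

```python
import ipaddress

# Assumption: "32-63 Class C" means 64.237.32.0 through 64.237.63.255.
first = ipaddress.ip_address("64.237.32.0")
last = ipaddress.ip_address("64.237.63.255")

# Express the span as the smallest set of CIDR blocks that covers it.
blocked = list(ipaddress.summarize_address_range(first, last))


def is_blocked(ip: str) -> bool:
    """Return True if ip falls inside any of the blocked networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in blocked)


if __name__ == "__main__":
    print([str(net) for net in blocked])
    # The final octet below is a placeholder; the real one is masked in the logs.
    print(is_blocked("64.237.57.1"))
    print(is_blocked("64.237.64.1"))
```

Under that assumption the thirty-two /24 blocks fold into a single CIDR entry, so the whole range costs one line in a deny list.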


 4:13 am on Jul 7, 2008 (gmt 0)

Is there a maximum number of addresses or ranges beyond which adding more to your IP deny list slows down the server?


 1:36 pm on Jul 7, 2008 (gmt 0)

I've not heard of anybody hitting limit walls when utilizing "simple" IP or UA denies.

My own file is some 1,700 lines, having been condensed multiple times.

What will slow requests down are processor-intensive rules.
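On the condensing point, one possible way to shrink a deny list is to merge adjacent and overlapping ranges before they go into the file. This is only a sketch in Python and not necessarily how anyone here maintains their list; the condense helper is hypothetical and the addresses are documentation ranges, not real offenders.

```python
import ipaddress


def condense(entries):
    """Merge adjacent or overlapping addresses and networks into fewer CIDR blocks."""
    nets = [ipaddress.ip_network(entry) for entry in entries]
    return [str(net) for net in ipaddress.collapse_addresses(nets)]


# Hypothetical deny list built up over time (documentation ranges only).
deny = [
    "192.0.2.0/25",
    "192.0.2.128/25",   # adjacent to the entry above
    "198.51.100.7",     # single address already covered by the next entry
    "198.51.100.0/24",
    "203.0.113.4/31",
    "203.0.113.6/31",   # adjacent to the entry above
]

if __name__ == "__main__":
    for net in condense(deny):
        print(net)
```

Six entries come out as three, and every address that was denied before is still denied.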

