

Website Analytics - Tracking and Logging Forum

    
Frequent visits from user agent libwww-perl/5.805
Patrick Taylor

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3508143 posted 4:59 pm on Nov 18, 2007 (gmt 0)

Two pages on one of my websites have begun to receive frequent visits from bots with different host names but the user agent is always like:

libwww-perl/5.805
libwww-perl/5.64
libwww-perl/5.63

The requested URLs are typically:

ht*p://www.mysite.com/examplepage/index.php?page=ht*p://www.somesite.com/home/images/can?
ht*p://www.mysite.com//index.php?page=ht*p://somesite.nl/id.txt?

(my asterisks)

I've now redirected all such requests to remove index.php and the query string, but I'd be interested to know what's going on with these visits. Is it some sort of mischief?

It seems like a tracking system (or something).
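To get a feel for how many of these visits there are and how many hosts they come from, the access log can be tallied by host and user agent. The following is only a rough sketch: it assumes the server writes the Apache "combined" log format, and the names `LOG_LINE` and `tally_libwww_hits` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache "combined" log format, i.e.
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def tally_libwww_hits(lines):
    """Count requests per (host, user agent) where the agent is libwww-perl."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Lines that do not fit the assumed format are skipped silently.
        if match and match.group("agent").lower().startswith("libwww-perl/"):
            hits[(match.group("host"), match.group("agent"))] += 1
    return hits
```

Passing it an open log file (any iterable of lines works) returns a `Counter`, so `most_common()` would list the busiest host and version pairs first.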

 

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



 
Msg#: 3508143 posted 2:04 pm on Nov 19, 2007 (gmt 0)

LWP is a well-known perl library (a set of perl modules, plus lwp-request, which is a simple command-line user agent).
it could be just about anything - a homemade bot or custom browser, somebody running a script - it's a very generic tool.

Patrick Taylor

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3508143 posted 11:00 pm on Nov 19, 2007 (gmt 0)

Okay, but why would someone want to run a bot requesting those URLs from different hosts?

[edited by: Patrick_Taylor at 11:02 pm (utc) on Nov. 19, 2007]

ikkyu

5+ Year Member



 
Msg#: 3508143 posted 1:43 am on Nov 20, 2007 (gmt 0)

It could be a proxy server, which may or may not be malicious, but some are known to zap your pages in the SERPs if they aren't handled correctly. IIRC the proxied copies can be considered duplicate content by Google and end up getting the better ranking. There was a whole thread here concerning this problem; if this is the case for you, it's worth a read.

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



 
Msg#: 3508143 posted 1:44 am on Nov 20, 2007 (gmt 0)

does that look at all like a url you might serve?

Patrick Taylor

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 3508143 posted 11:29 am on Nov 20, 2007 (gmt 0)

There's this thread -> [webmasterworld.com...]

Quite helpful. I suppose the safest thing is to serve a 403 to a bot that requests a URL like:

ht*p://www.mysite.com//index.php?page=ht*p://somesite.nl/id.txt?

And they're doing it dozens of times a day.
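The common feature of these requests is a query-string value (here `page=`) that is itself a full remote URL, which looks like a probe to see whether index.php will include whatever it is handed. A minimal sketch of the 403 idea follows. It is written in Python purely for illustration (the site itself runs PHP), the function names are hypothetical, and real URLs would carry `http` rather than the asterisked `ht*p` used in this thread.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumption: a parameter value that begins with one of these is treated as a probe.
SUSPICIOUS_SCHEMES = ("http://", "https://", "ftp://")


def has_remote_url_param(request_url: str) -> bool:
    """Return True if any query-string value starts with a remote URL scheme."""
    query = urlsplit(request_url).query
    for _name, value in parse_qsl(query, keep_blank_values=True):
        if value.strip().lower().startswith(SUSPICIOUS_SCHEMES):
            return True
    return False


def status_for(request_url: str) -> int:
    """Pick a status code: 403 for probe-like requests, 200 otherwise."""
    return 403 if has_remote_url_param(request_url) else 200
```

Checking the parameter value rather than the user agent means the rule would still apply if the bot changed its libwww-perl string, though a site that legitimately passes URLs in its query strings would need a narrower test.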

[edited by: Patrick_Taylor at 11:31 am (utc) on Nov. 20, 2007]


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved