LWP::Simple

heads up

         

wilderness

5:57 pm on Jun 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



There are numerous threads in the archives.

I've not seen this UA in a while.
Single page. No robots.txt. No images.

69.10.141.84 - - [20/Jun/2003:19:52:38 -0700] "GET / HTTP/1.1" 200 9409 "-" "LWP::Simple/5.69"

http ://search.cpan.org/author/GAAS/libwww-perl/lib/LWP/Simple.pm
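
For anyone who wants to pull this UA out of their own logs, here is a minimal Python sketch. It assumes the Apache "combined" log format seen in the entry above. The names `parse_combined` and `is_lwp_simple` are made up for illustration and do not come from any library.

```python
import re

# Assumed: Apache "combined" log format, as in the entry quoted above.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_combined(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None

def is_lwp_simple(entry):
    """True if the user agent starts with LWP::Simple (case-insensitive)."""
    return entry["agent"].lower().startswith("lwp::simple")

line = ('69.10.141.84 - - [20/Jun/2003:19:52:38 -0700] '
        '"GET / HTTP/1.1" 200 9409 "-" "LWP::Simple/5.69"')
entry = parse_combined(line)
print(entry["ip"], entry["agent"], is_lwp_simple(entry))
```

The match is case-insensitive on purpose, since the agent string can show up with different capitalization in logs.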

rbs10025

10:09 pm on Jun 22, 2003 (gmt 0)

10+ Year Member



LWP::Simple is a Perl module which can be used for various activities. I've used it myself for a couple of purposes, one of the more complex being a CGI which took requests on one webserver, manipulated the input, and then used the LWP modules to query a second server for an answer to the user's query.

Since LWP::Simple is a module, whatever it's doing is up to whoever wrote the application it's been plugged into. It might be used for malicious purposes, such as feeding a spam database, and it might not.

There is a method in LWP to change the UA name to whatever you want. If you're seeing LWP::Simple in your logs, then the coder hasn't bothered to do so.

volatilegx

8:06 pm on Jun 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Actually LWP::Simple doesn't offer a method to change the User-Agent string.

You have to use LWP::UserAgent for that, a module that is a little more complicated and more powerful than LWP::Simple.

Dan

ThatAdamGuy

12:52 am on Jul 6, 2003 (gmt 0)

10+ Year Member



Hmm... interesting.

I've been getting a lot of errors recently for pages like:
blog.mysite.com/archives/000256.htm%20/

with the space and trailing slash mysteriously added on.

I've scoured my site to try to find errant links, but have been unable to do so.

The user agent with all these errors is: lwp::Simple/5.69

So does this mean I shouldn't be worried about the 404s? Or should I still try to track down what's wrong with my links?

volatilegx

1:54 am on Jul 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sounds to me like somebody's got a bot that's appending a space to the end of the link URL. I'd ban 'em. No legitimate search engine is using LWP::Simple as the user agent for their spider.
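
To separate these bot-mangled requests from genuinely broken links on the site, a rough Python sketch like the following could flag any requested path that has an encoded space stuck on the end, with or without a trailing slash. The function names are hypothetical, and the example path is modeled on the one quoted earlier in the thread.

```python
from urllib.parse import unquote

def has_stray_space(path):
    """True if the decoded path has whitespace just before an optional trailing slash."""
    trimmed = unquote(path).rstrip("/")
    return trimmed != trimmed.rstrip()

def repaired(path):
    """Best guess at the intended path: drop the stray whitespace and the slash after it.

    Note: for a flagged path the result is percent-decoded, so other escapes are decoded too.
    """
    if not has_stray_space(path):
        return path
    return unquote(path).rstrip("/").rstrip()

print(has_stray_space("/archives/000256.htm%20/"))
print(repaired("/archives/000256.htm%20/"))
```

If every 404 from that agent is flagged this way and the repaired path exists, the fault most likely lies in the bot's link extraction rather than in the site's own links.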

rduke

9:32 am on Jul 20, 2003 (gmt 0)



No legitimate search engine is using LWP::Simple as the user agent for their spider.

But "legitimate humans" can.
I wouldn't block that user agent. If someone uses Perl for "bad" purposes, he would probably also change the user agent to something else.

wilderness

10:28 am on Jul 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If someone uses Perl for "bad" purposes, he would probably also change the user agent to something else.

rduke
Welcome to Webmaster World.

I must confess this is a new criterion for NOT denying a UA ;)

The theme in this forum is identifying spiders. Most folks are aware that IPs, referrers and UAs can be modified, omitted and even faked, and yet spiders are identified.
Suggestions and assistance (off-topic) are shared to help others deny these non-compliant spiders.

Just the other day I had a visitor who downloaded a zip from a page they had not accessed. Although I didn't take any action at that time, I did make note of the IP and UA for future reference. Even though I only have a few ZIPs for downloading, I may add zip to my deep-linking rules.
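
As a rough illustration of what a deep-linking check on zips might look for in the logs, here is a small Python sketch. It assumes the rule is referrer-based (a zip request with no referrer, or a referrer from another host, counts as deep-linked), which is only one way to define it. The function name and `www.example.com` are placeholders.

```python
from urllib.parse import urlparse

def is_deep_linked_zip(path, referer, own_host):
    """True if a .zip was requested without a referrer from own_host."""
    # Ignore any query string when checking the file extension.
    if not path.lower().split("?", 1)[0].endswith(".zip"):
        return False
    if referer in ("", "-"):
        return True  # no referrer: the file was fetched directly
    return urlparse(referer).hostname != own_host

print(is_deep_linked_zip("/files/tool.zip", "-", "www.example.com"))
```

Referrers can be blank for ordinary visitors too, so a hit here is a reason to note the IP and UA rather than proof of a bot.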

volatilegx

5:25 pm on Jul 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<nitpick>

The theme in this forum is identifying spiders

Actually the theme is identifying search engine spiders. We only identify other spiders to infer that they do not belong to search engines, and thus are candidates for some kind of action like banning, etc.

</nitpick>

ncw164x

7:23 pm on Jul 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No legitimate search engine is using LWP::Simple as the user agent for their spider. :)

You're right, but it is used by directories to check the links of the sites they have listed.
LWP::Simple/5.69 is the latest version.