Forum Moderators: coopster & phranque

LWP::Simple get() doesn't work with secure pages, here's an alternative

         

csdude55

9:55 pm on Dec 4, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Back in the early to mid-2000s, I built a bunch of Perl scripts for clients that imported a PHP script using:

# where $home is predefined with the current website address
use LWP::Simple;
$foo = get("$home/foo.php");
print $foo;


I've only recently discovered that LWP::Simple doesn't support SSL scripts, so now that I'm forcing SSL on all pages those get() functions fail. They're not fatal errors, they just print nothing.

I believe that LWP::Simple has a method where you can ignore SSL, but I couldn't figure it out. And it would require a modification to every script, so it wouldn't be ideal for me, anyway.

I discovered that HTTP::Tiny works fine with SSL scripts, though, so I'm sharing my solutions with others.

If you only have a few scripts, then this modification works:

# remove LWP::Simple;
# use LWP::Simple;

# replace it with HTTP::Tiny
use HTTP::Tiny;

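# alternative get(): returns the body on success, undef on failure (like LWP::Simple)
sub get {
    my $response = HTTP::Tiny->new->get($_[0]);
    return $response->{success} ? $response->{content} : undef;
}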


But in my case, I have several accounts that use LWP::Simple, and it's not practical to modify all of them manually if I don't have to. So instead, I modified LWP::Simple directly. I don't know if any OS or cPanel scripts use LWP::Simple, but I would be surprised if they did. So I'm making an educated guess here that this will be OK.

On my server, the path to the module is:

/usr/local/share/perl5/LWP/Simple.pm

but your path may be different. Be sure to make a backup of Simple.pm, just in case.

This is the get() section of the LWP::Simple module:

sub get ($)
{
    my $response = $ua->get(shift);
    return $response->decoded_content if $response->is_success;
    return undef;
}


And this is my modification:

use HTTP::Tiny;

sub get ($) {
    my $response = HTTP::Tiny->new->get(shift);
    return $response->{content} if $response->{success};
    return undef;
}


The docs on HTTP::Tiny:
[metacpan.org...]

I suspect that this will be marginally slower than the original, but in my case they're all low traffic sites so it's OK (within reason, of course).

I welcome anyone else to post suggestions or warnings! For me this was a relatively quick fix on a problem that was hard to track down, though, so I'm hoping it can help others that might have the same problem.

Key_Master

10:52 pm on Dec 4, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm assuming you're using LWP to access local files or across domains you have server control over? I ran into the same problem earlier this year with a search engine crawler script to index content on my server. I simply added a one liner to the SSL redirect to allow the crawler IP to access the HTTP version of the site.


# SSL Redirect
RewriteCond %{HTTPS} !=on
RewriteCond %{REMOTE_ADDR} !^192\.168\.0\.1$
RewriteRule ^/?(.*) https://www.example.com/$1 [R=301,L]

csdude55

7:50 am on Dec 5, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You're right, I'm exclusively using this to access a file on the same domain as the Perl script. But since it had to be executed as PHP, I didn't know any other way than to load the output.

But that's a great and simple modification, too :-)

Brett_Tabke

3:21 pm on Dec 5, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I swapped out to wget and eliminated lwp from my scripts. Never looked back.
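
For anyone wanting to try the wget route from Perl, a minimal sketch might look like the following. It assumes wget is installed and on the PATH; the function name get_via_wget is made up for illustration and is not taken from the scripts mentioned above.

```perl
use strict;
use warnings;

# Fetch a URL by running wget; returns the body, or undef on failure.
# The list form of open() avoids passing the URL through a shell.
sub get_via_wget {
    my ($url) = @_;
    open(my $fh, '-|', 'wget', '-q', '-O', '-', $url) or return undef;
    local $/;                      # slurp mode: read all output at once
    my $content = <$fh>;
    close($fh) or return undef;    # close() is false if wget exited non-zero
    return $content;
}

# Only fetch when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my $content = get_via_wget($url);
    print defined $content ? $content : "wget failed for $url\n";
}
```

Like LWP::Simple's get(), this returns undef on failure, so existing callers that test for a defined result would not need to change.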

phranque

2:25 am on Dec 6, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> I've only recently discovered that LWP::Simple doesn't support SSL scripts, so now that I'm forcing SSL on all pages those get() functions fail. They're not fatal errors, they just print nothing.

LWP::Simple's get method returns the requested document or undef if it fails.
this means you didn't actually discover the cause of your empty document.
since this is specific to SSL, it is probably a secure connection issue and not a problem with the user agent.

> I'm exclusively using this to access a file on the same domain as the Perl script.

i would put money on this somehow being a problem with your secure certificates.


> I swapped out to wget and eliminated lwp from my scripts. Never looked back.

not sure wget would fix this problem.
LWP::Simple can export its LWP::UserAgent [metacpan.org] object ($ua), which would be useful in chasing this down.
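
A rough sketch of that idea: importing $ua from LWP::Simple gives access to the full response object, so the status line shows why a request failed instead of a silent undef. The helper name fetch_with_status is hypothetical; pass a URL on the command line to try it.

```perl
use strict;
use warnings;
use LWP::Simple qw($ua);   # $ua is the LWP::UserAgent that get() uses internally

# Fetch a URL and return (content, status_line); content is undef on failure.
sub fetch_with_status {
    my ($url) = @_;
    my $response = $ua->get($url);
    my $content  = $response->is_success ? $response->decoded_content : undef;
    return ($content, $response->status_line);
}

# Only fetch when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my ($content, $status) = fetch_with_status($url);
    print defined $content ? $content : "Request failed: $status\n";
}
```

If the failure is a certificate problem, the status line is where it should show up.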

Brett_Tabke

2:46 pm on Dec 6, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> not sure wget would fix this problem

That is why I switched - ssl works great with wget.

csdude55

10:44 pm on Dec 6, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



FWIW, @phranque, here's where I read that LWP::Simple didn't support HTTPS connections, and the recommendation to switch to HTTP::Tiny:

[perlmonks.org...]

It is worth noting that my server hasn't updated to HTTP/2 yet, though, so it's entirely possible that the certificate assigned by cPanel just isn't compatible. I honestly don't know enough about SSL certs to know.

phranque

12:42 am on Dec 7, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



that Perl Monks thread precisely describes the issue that I surmised above.

i.e. the problem is not that HTTPS is unsupported.
the problem is that the secure certificate/chain is not correctly configured and therefore the request is failing to get past the secure handshake.

the only advantage HTTP::Tiny (and perhaps wget) gives you is an easier method of bypassing the secure handshake.
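
One way to check this, sketched under the assumption that the installed HTTP::Tiny is an older release in which verify_SSL is off by default: turn certificate verification on explicitly and see whether the request then fails the same way it does under LWP. Pass the URL on the command line.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# Explicitly enable certificate verification.
my $http = HTTP::Tiny->new(verify_SSL => 1);

# Only fetch when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my $res = $http->get($url);
    if ($res->{success}) {
        print "OK: $res->{status}\n";
    }
    else {
        # For internal errors (including TLS failures) HTTP::Tiny reports
        # status 599 and puts the error text in {content}.
        print "Failed: $res->{status} $res->{reason}\n";
        print $res->{content}, "\n" if $res->{status} == 599;
    }
}
```

If this fails where the default constructor succeeded, the certificate setup is the likely culprit rather than the user agent.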

phranque

12:47 am on Dec 7, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



i would focus on noxxi's reply (Dec 31, 2017 at 20:46 UTC) in the PerlMonks thread you linked to above:
> Certificate validation is there for a reason and simply recommending to switch it off is essentially suggesting to abandon any security provided by https since man in the middle attacks are easy if proper certificate validation is not done. The real reason that the code fails is a broken setup of the target site. This can be worked around in a secure way by setting SSL_ca_file to the appropriate CA certificates or by using SSL_fingerprint.

csdude55

1:12 am on Dec 7, 2020 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> This can be worked around in a secure way by setting SSL_ca_file to the appropriate CA certificates or by using SSL_fingerprint.

But... how? I read his explanation on StackOverflow, too (link below), but it's Greek to me. And I don't think it says how to "set SSL_ca_file to the appropriate CA certificate" or to use SSL_fingerprint.

[stackoverflow.com...]

phranque

1:53 am on Dec 7, 2020 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



> And I don't think it says how to "set SSL_ca_file to the appropriate CA certificate" or to use SSL_fingerprint.

i saw a pretty good example of each option there.
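
For reference, a minimal sketch of what the SSL_ca_file option can look like with the $ua object exported by LWP::Simple. The bundle path below is a placeholder, not a real location on any particular server; it would need to point at a PEM file containing the CA (and any intermediate) certificates for the site. ssl_opts() hands SSL_* keys through to IO::Socket::SSL.

```perl
use strict;
use warnings;
use LWP::Simple qw($ua get);

# Placeholder path: replace with the PEM bundle for your site's CA chain.
my $ca_file = '/path/to/ca-bundle.pem';

# Keep hostname verification on and tell the TLS layer which CAs to trust.
$ua->ssl_opts(verify_hostname => 1);
$ua->ssl_opts(SSL_ca_file => $ca_file);

# Existing get() calls in the same script now use these settings.
if (my $url = shift @ARGV) {
    my $content = get($url);
    print defined $content ? $content : "get() returned undef\n";
}
```

SSL_fingerprint is set the same way through ssl_opts(); its value format is described in the IO::Socket::SSL documentation.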

have you tried starting with analyzing your ssl configuration?
for example, here is a sample SSL test for www.WebmasterWorld.com:
[ssllabs.com...]