
Perl Server Side CGI Scripting Forum

    
Download https web page with LWP UserAgent
cocrmc



 
Msg#: 4180681 posted 4:55 pm on Aug 2, 2010 (gmt 0)

I'm trying to get VoIP call detail logs (CDRs) from cp.voipstreet.com using LWP::UserAgent, but supplying the credentials to LWP doesn't work. I don't know whether that is because I don't know the realm name or because the site is programmed differently. Another approach that I think would work for me is to use LWP to post the username to the login page, then request the page I'm trying to download the CDRs from. At least that's how an interactive browser works: I log in, then paste my URL, and the browser prompts me to download the file. Please check out the code below and tell me where I'm going wrong, or give me some ideas for how I can retain session state between pages with LWP. Thanks!

#!/usr/bin/perl
#
############################################
# gets (should) cdr records from voipstreet
# note: requires https module installed in Perl
############################################
#
use strict;
use warnings;
use LWP;

my $browser = LWP::UserAgent->new;

$browser->credentials(
'cp.voipstreet.com:443',
'missing-realm-name-here-what-is-it',
'myusernamehere' => 'mypasswordhere'
);

my $url = "https://cp.voipstreet.com/vs/usage_new.php";
my $response = $browser->post( $url,
{'dowhat' => 'down',
'FromDate' => '2010-08-01',
'ToDate' => '2010-08-02',
'df' => '1',
'go' => '1',
});

print $response->content;

 

cocrmc



 
Msg#: 4180681 posted 5:34 pm on Aug 2, 2010 (gmt 0)

I've gotten a little closer now. I've enabled cookies for LWP and posted two requests to the server. The first request authenticates, the second one gets the page. The cookies retain auth between pages. The problem I'm having now is I don't want to download the page at the url in the second request. What I want to do is download an attachment to that page. When that page receives a request it returns a contenttype-attachment and on a normal browser I can just hit save but with perl/lwp it downloads the page. Below is my new script. I found an article here [webmasterworld.com...] but it is retrieving the filename. I want the actual attached file. Any ideas?

#!/usr/bin/perl
#
############################################
# gets CDR records from voipstreet
############################################
#
use strict;
use warnings;
use LWP;

my $browser = LWP::UserAgent->new;
$browser->cookie_jar( {} );

my $url = "https://cp.voipstreet.com/vs/index_proc.php";
my $response = $browser->post( $url,
{'UserName' => 'myusernamehere',
'Password' => 'mypasswordhere',
'ReturnURL' => 'index.php',
'B1' => 'Login Now',
});

my $url = "https://cp.voipstreet.com/vs/usage_new.php";
my $response = $browser->get( $url,
{'dowhat' => 'down',
'optcrit' => '',
'an' => '',
'gi' => '',
'di' => '',
'FromDate' => '2010-08-01',
'ToDate' => '2010-08-02',
'df' => '1',
'go' => '1',
});

janharders




 
Msg#: 4180681 posted 8:04 pm on Aug 2, 2010 (gmt 0)

When that page receives a request it returns a contenttype-attachment and on a normal browser I can just hit save but with perl/lwp it downloads the page.

Which page? What's in that page? Does it redirect to the download file? I personally find it clearer to build a request with HTTP::Request and then run it with
$browser->request( $request ); # will follow redirects
or
$browser->simple_request( $request ); # will not follow redirects
so you can see exactly what you are sending to the server.
Also, add a
print $response->as_string;
to see exactly what the server is sending to you. Perl does not decide whether or not to download things; it just gets things from the web, and what happens to those things is up to you. If you don't get the same content from a URL with Perl as with a browser, that's most likely the server, not your script (or rather: the server, because it does not like what your script sends).
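
A minimal sketch of that idea, for illustration only: it builds the login request as an HTTP::Request object and prints it before anything goes over the network. The URL and form field names are copied from the script posted above. Using the HTTP::Request::Common helper to construct the object is an assumption made here for brevity, not something suggested in the thread.

#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Login form fields as they appear in the script posted earlier in this thread.
my $request = POST(
    'https://cp.voipstreet.com/vs/index_proc.php',
    [
        UserName  => 'myusernamehere',
        Password  => 'mypasswordhere',
        ReturnURL => 'index.php',
        B1        => 'Login Now',
    ],
);

# Request line, headers and form-encoded body, exactly as they would be sent.
print $request->as_string;

# To send it, hand the object to an LWP::UserAgent instance:
#   my $response = $browser->request($request);         # follows redirects
#   my $response = $browser->simple_request($request);  # does not follow redirects
#   print $response->as_string;

Printing the request first makes it easy to compare, field by field, with what a browser sends for the same form.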

cocrmc



 
Msg#: 4180681 posted 9:46 pm on Aug 2, 2010 (gmt 0)

Sorry for the unclear explanation.

The index_proc.php page processes the login form. My first $browser->post is successful and the user/password is stored in the cookies.

Then I $browser->post the usage_new.php page and it returns results, except it's an HTML-formatted table. At the same time, in the response header is a .CSV attachment, which is all I want to download. If I use a browser to hit the same pages, the browser prompts me to download. Perl doesn't! I'm doing something wrong.

How do I get the download that is in the response header?

janharders




 
Msg#: 4180681 posted 12:13 am on Aug 3, 2010 (gmt 0)

Perl will never "prompt" you; it's not running interactively.

Add a

print $response->headers->as_string;

at the end to see what the server actually sends.

cocrmc



 
Msg#: 4180681 posted 2:26 am on Aug 3, 2010 (gmt 0)

Just to clarify when I said "prompt" I meant that firefox prompts me.

When I 'sniff' the headers I get a different response than when I tried what you suggested. Here are the headers by your method:

Date: Tue, 03 Aug 2010 02:21:12 GMT
Server: Apache/2.2.4 (Unix) mod_ssl/2.2.4 OpenSSL/0.9.8d PHP/5.2.9
Content-Language: en-us
Content-Type: text/html
Content-Type: text/html; charset=windows-1252
Client-Date: Tue, 03 Aug 2010 02:21:10 GMT
Client-Peer: 64.136.174.26:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=US/O=VeriSign, Inc./OU=VeriSign Trust Network/OU=Terms of use at https://www.verisign.com/rpa (c)09/CN=VeriSign Class 3 Secure Server CA - G2
Client-SSL-Cert-Subject: /C=US/ST=Pennsylvania/L=Pittsburgh/O=AD-BASE SYSTEMS INC/OU=Terms of use at www.verisign.com/rpa (c)05/CN=cp.voipstreet.com
Client-SSL-Cipher: DHE-RSA-AES256-SHA
Client-SSL-Warning: Peer certificate not verified
Client-Transfer-Encoding: chunked
Title: VoIPStreet - Account Usage
X-Powered-By: PHP/5.2.9

Here are the headers from sniffing them:
HTTP/1.1 200 OK
Date: Tue, 03 Aug 2010 02:15:33 GMT
Server: Apache/2.2.4 (Unix) mod_ssl/2.2.4 OpenSSL/0.9.8d PHP/5.2.9
X-Powered-By: PHP/5.2.9
Pragma: public
Expires: 0
Cache-Control: public
Content-Description: File Transfer
Content-Disposition: attachment; filename=usages1280801735.csv;
Content-Transfer-Encoding: binary
Content-Length: 146533
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: application/force-download

this is the url that I use in firefox:

https://cp.voipstreet.com/vs/usage_new.php?dowhat=down&optcrit=&an=&gi=&di=&FromDate=2010-07-01&ToDate=2010-07-31&df=1&go=1

cocrmc



 
Msg#: 4180681 posted 2:28 am on Aug 3, 2010 (gmt 0)

Maybe the server is treating it differently because they're two different browsers?

phranque




 
Msg#: 4180681 posted 4:26 am on Aug 3, 2010 (gmt 0)

welcome to WebmasterWorld [webmasterworld.com], cocrmc!

Client-SSL-Warning: Peer certificate not verified

i'm guessing the SSL negotiation is returning this response instead of the server responding with your expected attachment.
perhaps there is a way to disable the certificate verification...

janharders




 
Msg#: 4180681 posted 9:34 am on Aug 3, 2010 (gmt 0)


Client-SSL-Warning: Peer certificate not verified


i'm guessing the SSL negotiation is returning this response instead of the server responding with your expected attachment.


IIRC that's really just a warning and doesn't change the content.


cocrmc: the sniffed headers are what is returned to firefox, right?
Have you looked at the html-content it returns? Maybe the server is telling you that you're not logged in or forbidden from automatically using the site (hence the title it sends: "VoIPStreet - Account Usage").
Look at the requests you send to the server (since you're familiar with sniffing, do that) to make sure that Firefox isn't sending any extra cookies LWP missed. If that's not it: you can set your own user agent and headers to be sent; just copy the values from Firefox and see whether that makes the server return the data.
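
A rough sketch of setting those values on the LWP::UserAgent object. The user agent string and header values below are placeholders (assumptions for illustration); the real ones would be copied from the sniffed Firefox request.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

my $browser = LWP::UserAgent->new;
$browser->cookie_jar( {} );    # keep session cookies between requests

# Placeholder values: replace with what the sniffer shows Firefox sending.
$browser->agent('Mozilla/5.0 (placeholder; copy the real string from Firefox)');
$browser->default_header( 'Accept'          => 'text/html,application/xhtml+xml,*/*;q=0.8' );
$browser->default_header( 'Accept-Language' => 'en-us,en;q=0.5' );

# Show what will be attached to every request made with this object.
print "User-Agent: ", $browser->agent, "\n";
print $browser->default_headers->as_string;

Headers set this way are sent with every request from $browser, so the login and the download request both carry them.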

cocrmc



 
Msg#: 4180681 posted 8:22 pm on Aug 3, 2010 (gmt 0)

SOLUTION: Thank you all for the comments. It ended up being pretty simple. I was doing something wrong.

From my original code:
my $url = "https://cp.voipstreet.com/vs/usage_new.php";
my $response = $browser->get( $url,
{'dowhat' => 'down',
'optcrit' => '',
'an' => '',
'gi' => '',
'di' => '',
'FromDate' => '2010-08-01',
'ToDate' => '2010-08-02',
'df' => '1',
'go' => '1',
});

changed to:
my $url = "https://cp.voipstreet.com/vs/usage_new.php?dowhat=down&optcrit=&an=&gi=&di=&FromDate=2010-07-01&ToDate=2010-07-31&df=1&go=1";
my $response = $browser->get( $url );

Thank you again!
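
A note on why the change works, with a small sketch: in LWP::UserAgent, the arguments that follow the URL in get() are treated as request header fields, not as query parameters, so the parameters have to be part of the URL itself. One way to avoid typing the query string by hand is the URI module's query_form method. The parameter names and values below are taken from the Firefox URL posted earlier; using URI here is an alternative to the hand-written string, not what the original poster did.

#!/usr/bin/perl
use strict;
use warnings;
use URI;

my $uri = URI->new('https://cp.voipstreet.com/vs/usage_new.php');

# Passing a list (not a hash) keeps the parameters in this order.
$uri->query_form(
    dowhat   => 'down',
    optcrit  => '',
    an       => '',
    gi       => '',
    di       => '',
    FromDate => '2010-07-01',
    ToDate   => '2010-07-31',
    df       => 1,
    go       => 1,
);

# The complete URL, including the query string, for comparison with the browser's.
print "$uri\n";

# With a logged-in $browser (cookie jar enabled), the download would then be:
#   my $response = $browser->get($uri);
#   print $response->content;   # the CSV data, if the server accepts the request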
