Forum Moderators: coopster & phranque

Failing to get web page

Failing to get web page, but only for some pages

         

parcel2ship

11:38 am on Oct 12, 2008 (gmt 0)

10+ Year Member



I am trying to write a script that fetches a web page and then parses the content, but the 'get' fails for the page I need. It does work with other web pages. I am guessing that I am missing some additional parameters, but I am very much a newbie and am having trouble working out what I need. Any help would be much appreciated. The script is:
>>>>>>>>>>>>>>>>>>>>
#!c:\perl\bin\perl.exe -w
use strict;
use warnings;

use LWP::Simple;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
use HTML::LinkExtor; # allows you to extract the links off of an HTML page.

print "Content-type: text/html\n\n"; # this is just for testing
#
my $url = 'http://www.example.com/portal/pw/index.html';
my $content = get $url;
die "Couldn't get $url" unless defined $content;
>>>>>>>>>>>>>>

When I run the script I get 'Couldn't get http://www.example.com/portal/pw/index.html'. As I said above, if I use the same script to get a different web page (e.g. http://www.example.co.uk/index.html) it works OK.

Rgds

Denis

[edited by: jatar_k at 11:41 am (utc) on Oct. 12, 2008]
[edit reason] please use example.tld [/edit]

phranque

11:10 pm on Oct 12, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



welcome to WebmasterWorld [webmasterworld.com], Denis!

try using LWP::UserAgent methods instead:

use LWP::UserAgent;

print "Content-type: text/html\n\n"; # this is just for testing

my $ua = LWP::UserAgent->new;
my $response = $ua->get('http://www.example.com/portal/pw/index.html');

if ($response->is_success) {
    print $response->decoded_content;
} else {
    die $response->status_line;
}
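
When a fetch works for some sites and not others, one thing worth ruling out is the user agent's default settings. The following is only a sketch under assumptions: the helper name `build_agent`, the agent string, and the 30-second timeout are made up for illustration, and nothing in this thread confirms that the target site cares about any of them.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: build a user agent with explicit settings
# instead of relying on the LWP defaults.
sub build_agent {
    my $ua = LWP::UserAgent->new(
        agent   => 'Mozilla/5.0 (compatible; MyFetcher/0.1)',  # assumed string
        timeout => 30,                                         # seconds, assumed value
    );
    return $ua;
}
```

It would be used in place of `LWP::UserAgent->new`, for example `my $ua = build_agent(); my $response = $ua->get($url);`. If the result is the same with and without these settings, the cause lies elsewhere.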

parcel2ship

4:20 pm on Oct 13, 2008 (gmt 0)

10+ Year Member



Thanks for the welcome and the help.

Using the above I now get '500 Internal Server Error'. There is nothing in the server error log to indicate the type of error. It also still works for some sites but not others.

Any more guidance would be appreciated.

phranque

6:00 am on Oct 14, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



that was intended to be a code snippet, not a complete script.
for example, did you include the shebang line?
#!c:\perl\bin\perl.exe -w

otherwise there will be a message somewhere.
have you tried running the script from the command line?

parcel2ship

7:04 am on Oct 15, 2008 (gmt 0)

10+ Year Member



Yes, I did include the shebang line and have tried running it from the command line. The problem has something to do with which site I try to 'get'. If I try certain sites it works fine. If I try others (including the one I am really after) I get the error. Maybe certain sites need a variation on the get?

phranque

8:02 am on Oct 15, 2008 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



what is the value for $response after the get?

if it looks like a response object, it should have a code or status message available...
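
One way to look at that is to print a few fields of the response object. This is a minimal sketch, assuming `$response` is the HTTP::Response returned by `$ua->get`; the helper name `describe_response` is made up for illustration. LWP marks responses it generated itself (for instance when it could not connect to the host) with a `Client-Warning: Internal response` header, which helps tell a local failure apart from a 500 actually sent by the remote server.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: summarise an HTTP::Response as a few lines of text.
sub describe_response {
    my ($response) = @_;
    my @lines = ('Status: ' . $response->status_line);

    # Present when LWP built the response itself rather than the server
    my $warning = $response->header('Client-Warning');
    push @lines, "Client-Warning: $warning" if defined $warning;

    # Present when a handler died while the response was being processed
    my $died = $response->header('X-Died');
    push @lines, "X-Died: $died" if defined $died;

    return join("\n", @lines) . "\n";
}
```

Calling `print describe_response($response);` right after the get, when run from the command line, shows the status line and any of these headers that are set.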