|LWP doesn't get the file|
LWP works internally but not externally
| 6:32 pm on Jul 12, 2010 (gmt 0)|
I've been working with a PHP script that fetches content, but the content I need now is just too much for PHP, so I'm using Perl. Note: the PHP script does work for a simpler data file, so the server is connected to the Internet and can fetch the data.
The Perl script itself is simple: use LWP::Simple; $file = get('myURL'); It works when I use an internal URL but not an external one.
For example, if I need the file from yahoo/test.html and I enter that in get(), $file has nothing in it (yes, I do use the ".com"; I've left it out here so the forum's URL filter doesn't kick in on my example).
However, if I go to that file in a browser, view source, save that as test.html, and run the script against myDomain/test.html, it works just fine.
Anyone have any ideas why this is? The file I need isn't under a protected login and it's not a large file (it's an HTML file).
| 10:24 pm on Jul 12, 2010 (gmt 0)|
If it requires HTTP authentication, you'll need to pass that info. I don't think LWP::Simple will do that. Use LWP::UserAgent, which gives you more control. Look at the LWP Cookbook [search.cpan.org] for examples.
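A minimal sketch of the LWP::UserAgent approach, assuming a placeholder URL and made-up credentials (the host, realm, user name, and password are not from this thread, and fetch_page is just an illustrative helper). The main gain over LWP::Simple's get() is that a failed request reports why it failed:

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: fetch a URL and return (ok flag, body or error text).
sub fetch_page {
    my ($ua, $url) = @_;
    my $response = $ua->get($url);
    return (1, $response->decoded_content) if $response->is_success;
    # Unlike LWP::Simple's get(), the failure reason is available here.
    return (0, $response->status_line);
}

my $ua = LWP::UserAgent->new(timeout => 15);

# Only needed if the server asks for HTTP authentication. The host:port,
# realm, user name, and password below are placeholders.
$ua->credentials('www.example.com:80', 'Restricted Area', 'user', 'secret');

# Placeholder URL; substitute the external page you are trying to fetch.
my ($ok, $result) = fetch_page($ua, 'http://www.example.com/test.html');
print $ok ? "Got " . length($result) . " characters\n"
          : "Request failed: $result\n";
```

If the request fails, the status line (for example a 403 or a 500 with a connection error message) usually narrows down whether the problem is the remote site, the network, or the request itself.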
| 3:32 am on Jul 13, 2010 (gmt 0)|
Welcome to WebmasterWorld [webmasterworld.com], SsurebreC!
Have you tried testing the returned HTTP response code?
You need LWP::Simple functions other than get (or LWP::UserAgent methods) for this, because get only returns the content, or nothing at all on failure.
Is it possible that your user agent is being blocked?
Have you tried a wide variety of URLs?
Are your URLs fully qualified, including the scheme (http://...)?
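The checks above could be sketched with LWP::Simple alone, as a rough example. The URL, the output file name, and the agent string are placeholders made up for illustration; getstore returns the HTTP status code, and the exported $ua lets you change the agent string in case the default one is being rejected:

```perl
use strict;
use warnings;
use LWP::Simple qw($ua getstore is_success status_message);

# Made-up browser-like identifier, in case the default LWP::Simple
# agent string is being rejected by the remote site.
$ua->agent('Mozilla/5.0 (compatible; MyFetcher/1.0)');
$ua->timeout(15);

# Placeholder URL; note the explicit http:// scheme.
my $url  = 'http://www.example.com/test.html';
my $code = getstore($url, 'test.html');   # returns the HTTP status code

if (is_success($code)) {
    print "Saved test.html (status $code)\n";
}
else {
    print "Fetch failed: $code ", status_message($code), "\n";
}
```

Running it once with the default agent and once with the custom one is a quick way to see whether the agent string makes any difference.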
| 12:25 pm on Jul 13, 2010 (gmt 0)|
Check your firewall settings and make sure Perl is allowed. Sometimes a program is blocked and Windows won't complain when it tries to reach the outside world.
| 12:39 pm on Jul 13, 2010 (gmt 0)|
Thanks for the helpful suggestions. Here are some replies:
1) The site does not need auth at all, but the page isn't a flat file, it's dynamic. The site does NOT time out when showing content.
2) I didn't try the HTTP response code. Good idea, I'll try that.
3) My agent isn't being blocked - that site gets a ton of traffic from other people getting the same info.
4) I tried all the "sub" URLs (different parameters) and none of them work, even though the URLs are correctly formed (I've been visiting that site for almost 2 years).
I'll try the response code. Let's assume it's 200. What would my next step be from there?
Thanks again for the help and the warm welcome!
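For the 200 case, one possible next step is to look at what actually came back rather than only whether $file is empty. The sketch below is an assumption about how that might be done, with a placeholder URL and a hypothetical describe_response helper; it reports the status line, the number of redirects followed, the content type, and the size and start of the decoded body:

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: summarise status, redirect hops, type, and body size.
sub describe_response {
    my ($response) = @_;
    my $body = $response->decoded_content;
    $body = '' unless defined $body;
    return {
        status    => $response->status_line,
        redirects => scalar($response->redirects),   # count of hops
        type      => scalar($response->header('Content-Type')),
        length    => length($body),
        preview   => substr($body, 0, 200),
    };
}

my $ua = LWP::UserAgent->new(timeout => 15);

# Placeholder URL; substitute the dynamic page in question.
my $info = describe_response($ua->get('http://www.example.com/test.html'));

for my $field (qw(status redirects type length preview)) {
    my $value = defined $info->{$field} ? $info->{$field} : '(none)';
    printf "%-10s %s\n", $field, $value;
}
```

A 200 with a zero-length body, or a preview that shows a redirect or error page instead of the expected HTML, would suggest the site is serving the script something different from what a browser sees.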