read external webpage

read external webpage readfile file_get_contents

         

learnit

6:12 pm on May 5, 2005 (gmt 0)

10+ Year Member



I made the following script. It works in most cases, but not for the URL below. I tried several solutions, but I still don't get the content of that website.
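<?php
$url = "http://www.example.nl/";
// Fragment left over from the original, longer URL: aspxerrorpath=/Default.aspx

// Attempt 1: fetch the whole page as one string
$content = file_get_contents($url);
print($content);

// Attempt 2: stream the page straight to the output
readfile($url);

// Attempt 3: read the page into an array of lines
$lines = file($url);
print count($lines);
?>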

Who can help?

[edited by: jatar_k at 6:15 pm (utc) on May 5, 2005]
[edit reason] generalized url [/edit]

jatar_k

11:26 pm on May 6, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Welcome to WebmasterWorld, learnit.

If it works most of the time, what is different with this one aside from the URL?

Does it give you errors?
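
One way to answer that question is to capture the warning and the response headers instead of only printing the body. The following is a minimal sketch, not the script used in this thread: `fetch_with_diagnostics` is a hypothetical helper name, and the 10 second timeout is an arbitrary assumption.

```php
<?php
// Hypothetical helper (not from the thread): fetch a URL and keep the
// diagnostics that a bare file_get_contents() call would hide.
function fetch_with_diagnostics(string $url): array
{
    $context = stream_context_create([
        'http' => [
            'ignore_errors' => true, // return the body even on 4xx/5xx responses
            'timeout'       => 10,   // seconds; an arbitrary choice
        ],
    ]);

    // '@' suppresses the warning on screen; error_get_last() still records it.
    $stream = @fopen($url, 'r', false, $context);
    if ($stream === false) {
        $error = error_get_last();
        return [
            'status' => null,
            'body'   => null,
            'error'  => $error['message'] ?? 'unknown error',
        ];
    }

    // For http:// URLs, wrapper_data holds the raw response header lines.
    $meta    = stream_get_meta_data($stream);
    $headers = is_array($meta['wrapper_data'] ?? null) ? $meta['wrapper_data'] : [];
    $body    = stream_get_contents($stream);
    fclose($stream);

    return [
        // First status line received; a 3xx here means the server redirected.
        'status' => $headers[0] ?? null,
        'body'   => $body === false ? null : $body,
        'error'  => null,
    ];
}
```

Calling `print_r(fetch_with_diagnostics($url));` shows whether the request failed outright, or whether the server answered with a redirect or an error page.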

learnit

6:02 am on May 7, 2005 (gmt 0)

10+ Year Member



Take, for instance, www.example.com/services/analyze/: they can read and analyse the webpage. So if they can do it, it's possible. But how do they do it?

[edited by: jatar_k at 5:32 pm (utc) on May 7, 2005]
[edit reason] generalized url [/edit]

jatar_k

5:31 pm on May 7, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



So do you just mean it doesn't work from your site, then?

It could just be the PHP settings at your host.

From
[php.net...]

Tip: You can use a URL as a filename with this function if the fopen wrappers [php.net] have been enabled.
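
Whether the URL wrappers are enabled can be checked from a script. This is a small sketch under the assumption that the relevant setting is `allow_url_fopen`; `url_fopen_enabled` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: reports whether PHP may open URLs with
// file_get_contents(), readfile() and file().
function url_fopen_enabled(): bool
{
    // ini_get() returns the setting as a string such as "1", "0", "On" or "".
    $setting = ini_get('allow_url_fopen');

    // FILTER_VALIDATE_BOOLEAN maps "1", "on", "yes", "true" to true.
    return filter_var($setting, FILTER_VALIDATE_BOOLEAN)
        && in_array('http', stream_get_wrappers(), true);
}

echo url_fopen_enabled()
    ? "URL wrappers are enabled\n"
    : "URL wrappers are disabled; ask the host or use cURL instead\n";
```

If this prints that the wrappers are disabled, none of the three calls in the original script can open a remote page, whatever the URL is.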

learnit

6:06 pm on May 7, 2005 (gmt 0)

10+ Year Member



No, it works fine for most webpages. But even when I use cURL, I still get the same problem with www.vbo.nl: it returns an error message. Could it be because they use cookies and (I think) also sessions?

jatar_k

6:51 pm on May 7, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



That could be. It is always hard to say exactly, because you don't always know what is going on in the background.

I would also try using a different user agent in cURL; it could be something as simple as user agent redirection.

I tried blocking cookies on that site and it still worked, so I am not sure cookies are the answer.
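
The user agent suggestion could look roughly like this. It is a sketch only: both function names are hypothetical, the redirect limit and timeout are arbitrary assumptions, and the in-memory cookie engine is switched on purely as a precaution in case a redirect sets a session cookie.

```php
<?php
// Hypothetical helper: cURL options that make the request look like a browser.
function browser_like_curl_options(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects
        CURLOPT_MAXREDIRS      => 5,          // arbitrary limit against redirect loops
        CURLOPT_USERAGENT      => $userAgent, // the header the site may be checking
        CURLOPT_COOKIEFILE     => '',         // empty string: keep cookies in memory only
        CURLOPT_TIMEOUT        => 15,         // seconds; an arbitrary choice
    ];
}

// Hypothetical helper: returns the page body, or null if the transfer failed.
function fetch_as_browser(string $url, string $userAgent): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, browser_like_curl_options($userAgent));

    $body = curl_exec($ch);
    if ($body === false) {
        error_log('cURL error: ' . curl_error($ch));
        return null;
    }
    return $body;
}

// Usage (the user agent string is only an example; use what your browser sends):
// $html = fetch_as_browser($url, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');
```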

learnit

7:17 pm on May 7, 2005 (gmt 0)

10+ Year Member



What does your script look like, then?

In IE it all works fine. I also used ieHTTPHeaders v1.6 to see what is happening under the hood. I still can't get the content with cURL.

learnit

7:25 pm on May 7, 2005 (gmt 0)

10+ Year Member



The user agent part did the trick.

I guess it works now (I do get code, and also an error, but it's much more than before).

Want to see the code?

learnit

7:46 pm on May 7, 2005 (gmt 0)

10+ Year Member



www.alternate.nl doesn't work (any more) now, though. Strange.

jatar_k

10:20 pm on May 7, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Mostly I just hit my own site with a couple of browsers and use whatever user agent I show. It will never be foolproof, but if you have 5 or 10 in your arsenal then you are usually pretty close.
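
Keeping several user agents on hand could be sketched as a loop that tries each one until a page comes back. Everything named here is hypothetical: the two functions are illustrative, the fetcher is passed in as a callable so any transport (streams or cURL) can be plugged in, and "non-empty body" is a deliberately crude assumption about what counts as success.

```php
<?php
// Hypothetical fetcher: sends one request with the given user agent
// through PHP's HTTP stream wrapper; returns null on failure.
function fetch_via_stream(string $url, string $userAgent): ?string
{
    $context = stream_context_create([
        'http' => ['user_agent' => $userAgent, 'timeout' => 10],
    ]);
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

// Hypothetical helper: tries each user agent in order and returns the first
// one that yields a non-empty body, or null if none of them does.
function fetch_with_first_working_agent(string $url, array $userAgents, callable $fetch): ?array
{
    foreach ($userAgents as $userAgent) {
        $body = $fetch($url, $userAgent);
        if (is_string($body) && $body !== '') {
            return ['user_agent' => $userAgent, 'body' => $body];
        }
    }
    return null;
}

// Usage (fill $agents with strings captured from your own browsers):
// $result = fetch_with_first_working_agent($url, $agents, 'fetch_via_stream');
```

A site that answers every agent with an error page would still pass this check, so inspecting the returned body remains necessary.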