Forum Moderators: coopster

php reading the src of another page

         

electricocean

5:57 am on Nov 7, 2005 (gmt 0)

10+ Year Member



Hi, I'm wondering whether it's possible to get the source of a site. It's forbidden on my server because they don't want me to do that and get their server-side code, I think. Is there a way to get the browser-side code, like when you click View Source?

Thanks

Anyango

6:11 am on Nov 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hey

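ob_start();                      // start capturing output
$page = "http://anywebsite.com/anypage";
include $page;                   // works only if the server allows remote includes
$pageHtml = ob_get_contents();   // copy the captured output into a string
ob_clean();                      // discard what was captured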

After the above code runs, the variable $pageHtml will hold the generated HTML that the server sends to the browser (definitely not the server-side source code), and you can easily manipulate it anywhere in your code.

Anyango

6:13 am on Nov 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The code returned by this will be exactly the same as what you would get by loading that URL in your web browser and then clicking "View Source".
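As a hedged alternative to the include-based snippet above: include also executes any PHP code found in the response, so reading the page as plain data may be safer. The sketch below assumes allow_url_fopen is enabled on the server; fetchPageSource is a hypothetical helper name, not something from this thread.

```php
<?php
// Hypothetical helper: returns the response body as a string, or null on failure.
function fetchPageSource($url)
{
    // @ suppresses the warning so the caller can handle failure explicitly.
    $html = @file_get_contents($url);
    return $html === false ? null : $html;
}

// Usage (commented out so that nothing is fetched when this file is loaded):
// $html = fetchPageSource('http://anywebsite.com/anypage');
// echo $html === null ? 'Could not read source' : htmlspecialchars($html, ENT_QUOTES);
```

Returning null on failure lets the calling code tell "the fetch failed" apart from "the page was empty".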

electricocean

1:47 am on Nov 8, 2005 (gmt 0)

10+ Year Member



Thanks.

I tried that. When I echo $pageHTML or use var_dump, the output is blank, but when I use print_r it outputs 1. What's wrong? Also, when I use var_dump it echoes string(0) at the top...

I put it in a function:

function getsrc($file){
ob_start();
include($file);
$src = ob_get_contents();
ob_clean();
return print_r($src);//I keep switching between var_dump($src), print_r($src), and just $src)
}

NomikOS

1:54 am on Nov 8, 2005 (gmt 0)

10+ Year Member



And what about this?
return htmlspecialchars($src, ENT_QUOTES);

Anyango

4:41 am on Nov 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello There!

Why would you need var_dump or print_r? The returned value is a plain string, not an array, so neither is needed. print_r also prints its argument and returns true rather than the string, which is why you see 1. Your function is otherwise perfectly all right; you just need to remove the call to print_r or var_dump.

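<?php
function getsrc($file){
    ob_start();                 // start capturing output
    include($file);             // output of the included page goes into the buffer
    $src = ob_get_contents();   // copy the captured output into a string
    ob_clean();                 // discard what was captured
    return $src;                // return the string itself, not print_r/var_dump
}

echo getsrc("http://www.webmasterworld.com/forum88/10696.htm");

?>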

I just tested it.

Regards
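The stray 1 reported earlier in the thread can be reproduced in isolation. This is a minimal sketch, assuming the fetch produced an empty string: print_r prints its argument and returns true unless its second argument is true, and echoing true displays 1.

```php
<?php
$src = '';                          // an empty fetch result, as reported above

ob_start();
$returned = print_r($src);          // prints $src (nothing here) and returns true
$printed  = ob_get_clean();         // whatever print_r printed

$asString = print_r($src, true);    // second argument true: return the text instead

var_dump($returned);                // bool(true); echoing true displays "1"
var_dump($printed);                 // string(0) ""
var_dump($asString);                // string(0) ""
```

So the 1 comes from print_r's return value, and the empty string itself means the include captured no output.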

NomikOS

7:23 am on Nov 8, 2005 (gmt 0)

10+ Year Member



I stand by my suggestion in msg 6, but after testing, htmlspecialchars causes problems with the "&" character.

# I definitely suggest this:
function getsrc($file)
{
    ob_start();
    include ($file);
    $src = ob_get_contents();
    ob_clean();

    # kill < and >
    # ------------
    $src = str_replace('<', '&lt;', $src);
    $src = str_replace('>', '&gt;', $src);
    # ------------

    # I want lines as lines
    # ---------------------
    return nl2br($src);
}

By the way, colored output would be nice, don't you think?
Who dares to do it in a few lines?
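One rough way to get colored output in a few lines is sketched below. It is an assumption-laden illustration, not a full highlighter: colorizeSource is a hypothetical name, every escaped tag is simply wrapped in a blue span, and a tag whose attribute value contains ">" would be cut short. It uses htmlspecialchars rather than the two str_replace calls, so "&" in the source is escaped as well.

```php
<?php
// Hypothetical helper: escape the HTML, then color everything that looks like a tag.
function colorizeSource($src)
{
    // Turn <, >, &, and quotes into entities so the browser shows them as text.
    $escaped = htmlspecialchars($src, ENT_QUOTES);

    // Wrap each escaped tag (&lt; ... &gt;) in a colored span.
    $colored = preg_replace(
        '/&lt;.*?&gt;/s',
        '<span style="color:#00f">$0</span>',
        $escaped
    );

    // Keep the line breaks of the original source visible.
    return nl2br($colored);
}

echo colorizeSource("<b>hi</b>");
```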

---

NomikOS

8:04 am on Nov 8, 2005 (gmt 0)

10+ Year Member



One moment...

Is this for viewing our own source, or the source of any page, as Anyango suggests?
My solution is only for actions like: "Click here to see the HTML code of this page"

---

Anyango

8:07 am on Nov 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To me, that looks like extra processing and nothing else. If we read the original message, it says:

"Is there a way to get the browser side code like when you click view source"

What I suggested was exactly what he asked for, at least as I understood it. If he had asked for nice colored output, that is possible too, but that's his part of the work; he didn't ask for a discussion of that, I think.

And why kill < and >, and why split the lines separately? I don't think he asked for that, did he? And if it's a question of the number of lines, I could do all this in ONE line.

regards

ogletree

8:25 am on Nov 8, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I don't understand why you don't just click View Source.

electricocean

1:48 am on Nov 9, 2005 (gmt 0)

10+ Year Member



This still isn't working.

I am using it within a form so my code looks like this:

if(isset($_POST['showsrc'])){
$filename = $_POST['filename'];
$src = getsrc($filename);
if ($src){
$page = htmlspecialchars($src, ENT_QUOTES);
}
else{
$page = "Could not read source";
}
$num = $_POST['num'];
$date = $_POST['date'];
echo "Filename: {$filename}<br>Num: {$num}<br>Date: {$date}<br>Source: {$src}";
}

Is something wrong with that?

Anyango

6:39 am on Nov 9, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you want to send the fetched HTML to the browser as HTML, so that the browser parses and renders it, your current code is perfect. However, if you want to show the fetched source as plain text, simply replace $src with $page in your last line, where you do the echo:

echo "Filename: {$filename}<br>Num: {$num}<br>Date: {$date}<br>Source: {$src}"; // this will be parsed by browser.

echo "Filename: {$filename}<br>Num: {$num}<br>Date: {$date}<br>Source: {$page}"; // this will be shown as plain text

electricocean

6:19 am on Nov 11, 2005 (gmt 0)

10+ Year Member



It still outputs "Could not read src..."

Anyango

1:08 pm on Nov 11, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am sorry, but I am unable to find any reason why it isn't working on your server. I have tested this code again and again during our discussion, so I think there might be some issue with the server settings. Can anybody else help us sort this out, please?
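One setting worth checking, offered as a guess rather than a diagnosis: fetching a URL with include or file_get_contents depends on the allow_url_fopen ini setting, and later PHP versions (5.2 and up) also require allow_url_include for include. A minimal sketch for reporting both, where remoteFetchSettings is a hypothetical helper name:

```php
<?php
// Hypothetical helper: report whether the server permits reading remote URLs.
function remoteFetchSettings()
{
    return array(
        // ini_get returns a string (or false if the setting is unknown),
        // so normalize "1", "On", "", "0", "Off" to real booleans.
        'allow_url_fopen'   => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        'allow_url_include' => filter_var(ini_get('allow_url_include'), FILTER_VALIDATE_BOOLEAN),
    );
}

var_dump(remoteFetchSettings());
```

If either value is false on the host, the remote include would fail with a warning and capture no output, which would match the empty string seen earlier in the thread.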

ogletree

2:54 pm on Nov 11, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I am still confused about why this is needed. Why can't you just hit View Source on the page?

Anyango

2:56 pm on Nov 11, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Maybe electricocean needs that HTML for some processing or something like that; he can explain that better.