
Forum Moderators: coopster & jatar k


Parsing php site

Parse the page for links



2:04 pm on Dec 9, 2009 (gmt 0)

5+ Year Member


What I would like to do is this: after all the PHP code has finished (so the final HTML page has been rendered and is ready to be sent to the browser), I would like to parse the page for links. For example, I would like to change all the email links to images, or add a PDF icon in front of all PDF links.

The thing is that I don't know how this is actually done, or what the right approach is. At the moment I have an index page in which I include other PHP (sub)pages, which may also contain some PHP code. The index.php page itself also contains a lot of PHP code. I don't want to parse the PHP files before the code has been executed.

So what I would like to do is this: after all the PHP code that renders the page has been executed, I would like to parse the result, search for links, and then do something with them.

How to do this?


5:14 pm on Dec 9, 2009 (gmt 0)

10+ Year Member

ob_start, ob_get_contents and then preg_replace should do the trick!
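
A rough sketch of how those pieces could fit together, in case it helps. The echoed paragraph stands in for whatever index.php and its includes would print, and the /img/email.png path is only a placeholder I made up. I used ob_get_clean() here, which fetches the buffer and ends buffering in one call.

```php
<?php
// Start buffering before the page produces any output.
ob_start();

// Stand-in for the real page: index.php and its includes would echo here.
echo '<p>Contact: <a href="mailto:info@example.com">info@example.com</a></p>';

// Fetch the rendered HTML and stop buffering.
$html = ob_get_clean();

// Replace the link text of every double-quoted mailto: link with an image.
// The image path is a hypothetical placeholder.
$pattern = '~(<a\b[^>]*href="mailto:[^"]*"[^>]*>)(.*?)(</a>)~is';
$html = preg_replace($pattern, '$1<img src="/img/email.png" alt="email">$3', $html);

// Send the modified page to the browser.
echo $html;
```

The pattern only handles double-quoted href attributes, so it is a starting point rather than a complete solution.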


1:43 pm on Dec 16, 2009 (gmt 0)

5+ Year Member

So if I run the following code, what I'm going to get in $string is going to be just HTML (php scripts will be executed)? I will not find any "<?php" in there, will I?

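$string = get_include_contents('statusi_dijakov.html');

//echo $string;

function get_include_contents($filename) {
    if (is_file($filename)) {
        // Start buffering so the include's output is captured instead of sent
        ob_start();
        include $filename;
        // Fetch the buffered output and discard the buffer
        $contents = ob_get_clean();
        return $contents;
    }
    // Not a regular file
    return false;
}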


3:44 pm on Dec 16, 2009 (gmt 0)

5+ Year Member

Your function appears to do what you want.

Given the filename, your function will include the file, and any PHP in it will be executed. Whatever the file outputs is "caught" by the output buffering and then returned from the function. So what you get out of your function is the output of your PHP file, which in your case should be pure HTML.
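
If you want to convince yourself, a small self-contained test along these lines should show it. The helper name render_file() and the throwaway temp file are just made up for the demonstration; they are not part of your code.

```php
<?php
// Hypothetical helper: returns the output of an included file, or false.
function render_file($filename)
{
    if (!is_file($filename)) {
        return false;
    }
    ob_start();            // capture everything the include prints
    include $filename;     // any PHP inside the file is executed here
    return ob_get_clean(); // return the captured output, end buffering
}

// Write a throwaway page containing PHP so the demo needs no other files.
$page = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($page, '<p>Total: <?php echo 2 + 3; ?></p>');

$string = render_file($page);
unlink($page);

echo $string, "\n";                                   // <p>Total: 5</p>
var_dump(strpos($string, '<?php') === false);         // bool(true)
```

The PHP tag in the temp file has been replaced by its output, so there is no "<?php" left in the string.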

Once you have this HTML in the variable "string", you can further parse/manipulate it or output it to the user.

Since you want to add PDF images to the front of PDF links, you can then add code that loops through the HTML content looking for PDF links and inserts the image code into the original string. This takes a little thought to set up, but is relatively simple.
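
Something like the following might be a starting point for the PDF part. It uses a single preg_replace instead of an explicit loop, the sample $string stands in for your captured page, and the /img/pdf.png path is an assumption on my part.

```php
<?php
// Stand-in for the HTML captured from the included page.
$string = '<a href="/docs/report.pdf">Report</a> <a href="/about.html">About</a>';

// Hypothetical icon markup; point src at wherever the image really lives.
$icon = '<img src="/img/pdf.png" alt="PDF"> ';

// Match opening <a> tags whose double-quoted href ends in .pdf
// and put the icon directly in front of each one ($0 is the whole match).
$pattern = '~<a\b[^>]*\bhref="[^"]*\.pdf"[^>]*>~i';
$string = preg_replace($pattern, $icon . '$0', $string);

echo $string;
```

This only catches double-quoted href values that end in ".pdf"; links with single quotes or a query string after the filename would need a looser pattern or a real HTML parser.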


9:43 am on Dec 18, 2009 (gmt 0)

5+ Year Member

Thanks so much!

I think there shouldn't be many problems with further parsing of the string. I just wasn't sure about the first part (capturing the output of the included file or files).


12:33 am on Dec 19, 2009 (gmt 0)

g1smd, WebmasterWorld Senior Member, 10+ Year Member

Do measure the performance when using this. How long does it take from browser request, to page starting to be rendered on screen, both with and without the additional function running?
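
As a rough first check on the server side, you could time the extra step with microtime(). The sketch below is only an illustration: the 1000 generated links stand in for a real page and the icon path is made up. It also only measures PHP's share of the work; the delay seen in the browser additionally depends on the fact that nothing is sent until the buffer is flushed.

```php
<?php
// Build a stand-in page through the output buffer.
ob_start();
echo str_repeat('<a href="/docs/file.pdf">File</a>' . "\n", 1000);
$html = ob_get_clean();

// Time only the post-processing step.
$start = microtime(true);
$html = preg_replace(
    '~<a\b[^>]*\bhref="[^"]*\.pdf"[^>]*>~i',
    '<img src="/img/pdf.png" alt="PDF"> $0', // hypothetical icon path
    $html
);
$elapsedMs = (microtime(true) - $start) * 1000;

// Log the figure rather than printing it into the page.
error_log(sprintf('Link rewriting took %.3f ms', $elapsedMs));
```

Comparing that figure, and the time to first byte in the browser, with and without the rewriting step gives a before/after picture.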
