Using the title of an included page

How do you dynamically pull the title from a different page?

         

geckofuel

1:18 pm on Mar 14, 2003 (gmt 0)

10+ Year Member



What is the easiest way to dynamically pull the page title off of a different webpage in PHP?

jatar_k

4:02 pm on Mar 14, 2003 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



You should probably open it and then match what is between the <title></title> tags using one of these.

Regular Expression Functions (POSIX Extended) [php.net]
Regular Expression Functions (Perl-Compatible) [php.net]

nosanity

4:40 pm on Mar 14, 2003 (gmt 0)

10+ Year Member



This should work

preg_match("/<title>([\w\s]*)<\/title>/i", $webpage_contents, $matches);
$webpage_title = $matches[1];

noSanity

geckofuel

9:16 pm on Mar 14, 2003 (gmt 0)

10+ Year Member



nosanity,
How would you define $webpage_contents? How do you pull the contents into a variable? The include statement merely returns a boolean value indicating whether the include was successful.

preg_match("/<title>([\w\s]*)<\/title>/i", $webpage_contents, $matches);
$webpage_title = $matches[1];

nosanity

9:20 pm on Mar 14, 2003 (gmt 0)

10+ Year Member



Well, you would have to read the other web page into a string; I just called the string $webpage_contents as an example.

For example, you could read the LOCAL page by using this:


$filename = "thispage.html";
// fopen() requires a mode argument; "r" opens the file read-only
$file_handler = fopen($filename, "r");
$webpage_contents = fread($file_handler, filesize($filename));
fclose($file_handler);

I also noticed a bug in the regex: the character class is missing a few characters. Beside \w, add \.\-\+ so that those characters are matched as well.
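As a rough sketch of what the widened character class does and does not accept (the sample titles are made up for illustration, and the pattern escapes the slash in the closing tag because / is the delimiter):

```php
<?php
// Character class from the post, widened with ., - and +.
$pattern = '/<title>([\w\s.\-+]*)<\/title>/i';

// Hypothetical sample titles, not taken from any real page.
$samples = [
    '<title>Widgets 2.0 - Home+Garden</title>', // only characters inside the class
    '<title>Tips & Tricks</title>',             // & is not in the class
];

$results = [];
foreach ($samples as $html) {
    // Store the captured title, or null when the pattern does not match.
    $results[] = preg_match($pattern, $html, $matches) === 1 ? $matches[1] : null;
}

var_dump($results);
```

The first title is captured in full, while the second yields null, so any character left out of the class still makes the whole match fail.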

noSanity

andreasfriedrich

9:35 pm on Mar 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



nosanity's RE would not match a title like Aaron C., which I find rather sad since it is a perfectly valid and keyword-rich title ;).


preg_match("'<title>([^<]*?)</title>'",
implode('', file('http://www.aaron.com/')),
$m);
$title = $m[1];

In PHP 4.3+ you could skip the implode and do something like this:


preg_match("'<title>([^<]*?)</title>'",
file_get_contents('http://www.aaron.com/'),
$m);
$title = $m[1];

This will work even for Aaron.
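A slightly more defensive version might look like the sketch below. The helper name page_title is hypothetical, and the demo reads a temporary local file instead of a URL so that it runs without network access; reading a URL with file_get_contents also requires allow_url_fopen to be enabled.

```php
<?php
// Hypothetical helper (not from the thread): returns the <title> text, or null.
function page_title(string $html): ?string
{
    // '#' is the delimiter, so the slash in </title> needs no escaping.
    // [^<]* accepts any character up to the next tag; the i modifier ignores tag case.
    if (preg_match('#<title[^>]*>([^<]*)</title>#i', $html, $m) === 1) {
        return trim($m[1]);
    }
    return null; // no title element found
}

// Self-contained demo: write a sample page to a temp file, then read it back.
$path = tempnam(sys_get_temp_dir(), 'page');
file_put_contents($path, '<html><head><TITLE>Aaron &amp; Nick</TITLE></head></html>');
$html = file_get_contents($path); // returns false on failure
unlink($path);

$title = ($html === false) ? null : page_title($html);
echo $title, "\n";
```

Checking the return values of file_get_contents and preg_match avoids reading $m[1] when the page could not be loaded or has no title.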

<added>I just read that nosanity changed his code. However, the perfectly valid title of Aaron &amp; Nick still would not match ;-)</added>

Andreas

nosanity

9:48 am on Mar 15, 2003 (gmt 0)

10+ Year Member



Yeah, I can't say I remember all the syntax... heh... but you are right. Testing for the next \< would match all characters within the title tag. I hate when I have to admit when I am wrong, but (hangs head in shame) I am so very wrong. *pout*

noSanity

geckofuel

4:20 am on Mar 17, 2003 (gmt 0)

10+ Year Member



Excuse my ignorance, but I'm trying to come up with the regexp that is completely inclusive. Consider the following:

preg_match("'/<\/head>([.*]*)Powered by <a href/'", $webpage_contents, $matches);

What should go between the parentheses, in place of ([.*]*), to make this match all-inclusive?

nosanity

9:13 am on Mar 17, 2003 (gmt 0)

10+ Year Member



This would probably work a bit better...

preg_match('/<\/head>(.*)Powered by <a href/s', $webpage_contents, $matches);

In this case, you do not need to specify which characters you want, because you want everything, so you can drop the []'s. The s modifier lets the dot match newlines too, which matters when the page spans several lines.
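Here is a minimal sketch of that idea on a made-up page (the sample HTML and the $between variable are assumptions for illustration). It uses # as the delimiter, the s modifier so the dot also matches newlines, and a non-greedy .*? so the capture stops at the first marker:

```php
<?php
// Hypothetical sample page; only the marker text follows the thread's pattern.
$webpage_contents = "<html><head><title>T</title></head>\n"
    . "<body>\n<p>Hello</p>\n"
    . "Powered by <a href=\"http://example.com/\">Example</a></body></html>";

// '#' delimiter: no need to escape the slash in </head>.
// s modifier: '.' also matches newlines; .*? stops at the first marker.
$between = null;
if (preg_match('#</head>(.*?)Powered by <a href#s', $webpage_contents, $matches) === 1) {
    $between = $matches[1];
}

var_dump($between);
```

Without the s modifier the dot stops at the first newline, so a multi-line page would not match at all.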

noSanity

andreasfriedrich

11:00 am on Mar 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>completely inclusive

So you are no longer trying to match the title but the body element?

>>preg_match("'/<\/head>([.*]*)Powered by <a href/'",

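Since the single quote is the delimiter here, this pattern looks for a literal /</head>, then any run of literal dots and asterisks (inside [] the . and * lose their special meaning), followed by Powered by <a href/. The slash before the closing head tag will never occur in valid HTML, and neither will href followed by a slash.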

The first character of the pattern string is considered the pattern delimiter. In this case it is the single quote. Not using the slash as the delimiter when matching HTML tags has the advantage that you do not have to escape the slash in your pattern. So it is always wise to choose a delimiter that does not occur in your pattern.
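To illustrate the delimiter point, here is a small sketch with a made-up $html string. Both calls describe the same expression; only the delimiter, and therefore the escaping, differs:

```php
<?php
// Hypothetical input used only for this comparison.
$html = '<head><title>Aaron C.</title></head>';

// With '/' as the delimiter, every slash inside the pattern must be escaped.
$with_slash = preg_match('/<title>([^<]*)<\/title>/', $html, $a);

// With '#' as the delimiter, the closing tag can be written as-is.
$with_hash = preg_match('#<title>([^<]*)</title>#', $html, $b);

// Both patterns match and capture the same text.
var_dump($with_slash === 1 && $with_hash === 1 && $a[1] === $b[1]); // bool(true)
```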

HTH Andreas

geckofuel

1:23 pm on Mar 17, 2003 (gmt 0)

10+ Year Member



Never mind...I started a new thread:

[webmasterworld.com...]