get external page title

Forum Moderators: coopster

surrealillusions

3:30 pm on Feb 14, 2011 (gmt 0)

Hi all,

How can something so simple be so complicated?

I've tried what seems like every script and function out there on the internet to find the page title of an external website.

Is there any method that actually works?

rocknbil

5:09 pm on Feb 14, 2011 (gmt 0)

How are you getting the page/file?

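If it comes back as a string, it should be as simple as

if (preg_match('/<title[^>]*>([^<]+)<\/title>/i', $the_page_string, $m)) $the_title = trim($m[1]);

<title followed by any run of characters that are not a > (in case there are attributes)
followed by one or more characters that are not a <, which we capture in $m[1]
followed by </title>
i = case insensitive

It needs [^>]* rather than [^>]? there: ? allows at most one character, so it would not get past a real attribute. It also needs preg_match rather than preg_replace, because preg_replace would hand back the whole page with only the title tags stripped out.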
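
As a rough sketch of the pattern described above, wrapped in a reusable function. The name extract_title and the sample markup are made up for illustration, and it assumes the page is already in a string. It uses preg_match to capture only the title text, and [^>]* so that attributes of any length are skipped.

```php
<?php
// Hypothetical helper: returns the text of the first <title> element,
// or null when the string has no matching title tag.
function extract_title(string $html): ?string
{
    // <title, then any attributes ([^>]*), then the captured text ([^<]+), then </title>.
    if (preg_match('/<title[^>]*>([^<]+)<\/title>/i', $html, $m)) {
        // Decode entities such as &amp; and trim surrounding whitespace.
        return trim(html_entity_decode($m[1], ENT_QUOTES, 'UTF-8'));
    }
    return null;
}

// Made-up sample markup: uppercase tag, an attribute, and an entity.
$sample = '<html><head><TITLE lang="en"> Example &amp; Co </TITLE></head><body></body></html>';
echo extract_title($sample), "\n";
```

Because [^<] also matches line breaks, a title that is wrapped over several lines in the source should still be picked up.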

surrealillusions

6:26 pm on Feb 14, 2011 (gmt 0)

Found one that works. Seems kinda similar to the one you've posted.

if(preg_match("/<title>(.+)<\/title>/i",$file,$m))
echo "$m[1]";
else
echo "The page doesn't have a title tag";

rocknbil

6:07 pm on Feb 15, 2011 (gmt 0)

If it works, it works, and yes, it is. :-) The only thing I would question is whether it will slurp up more than you expect in some conditions. "." means "any character" (any except a newline, unless the s modifier is set) and + means "one or more," so "one or more of any character" can include < and /. This, on the other hand, means "one or more of any character that is not a <":

[^<]+

So it is a bit more specific. There is a possibility (however slim) that there are attributes inside <title>, which is the only reason for the other bits there.
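
To make the difference concrete, here is a small comparison of the two patterns. The input string is invented for the example (a page with a second <title> on the same line, as an inline SVG might have); real pages may never trigger it.

```php
<?php
// Made-up input: two title elements on a single line.
$html = '<title>Home</title><svg><title>Logo</title></svg>';

// Greedy ".+" runs to the LAST </title> it can find on the line.
preg_match('/<title>(.+)<\/title>/i', $html, $greedy);

// "[^<]+" stops at the first "<", so it cannot cross into another tag.
preg_match('/<title[^>]*>([^<]+)<\/title>/i', $html, $strict);

echo $greedy[1], "\n"; // Home</title><svg><title>Logo
echo $strict[1], "\n"; // Home
```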

But if it works it works. :-)