Forum Moderators: coopster


How to get content of a web page with preg match

preg_match,php,geolocation

         

rodriguez1804

10:17 pm on Apr 19, 2010 (gmt 0)

10+ Year Member



Hey guys,

I've been racking my brain trying to get the user's geolocation by doing the following:

if you visit [myiptest.com...]
it gives you a summary of your geolocation

I am using the function below to fetch that same page and then trying to pull certain info out of it, like the city. However, I am having problems with preg_match and can't get the function to return anything at all!?


function getLocationCaidaNetGeo($ip)
{
    $url = "http://www.myiptest.com/staticpages/index.php/IP-Lookup/".$ip;

    if($handle = fopen($url,r))
    {
        ob_start();

        fpassthru($handle);
        $htmlContent = ob_get_contents();
        ob_end_clean();

        fclose($handle);
    }
    preg_match ("/City:(.*)/i", $htmlContent, $temp) or die("Could not find user city!");
    $city[0] = $temp[0];
    return $city[0];
}


Any help is appreciated!

CyBerAliEn

11:53 pm on Apr 19, 2010 (gmt 0)

10+ Year Member



Hmm... Interesting approach. I figure there'd be an easier way to do this (how is the other site getting the info?). I recall that several years ago I used to do something similar with Yahoo Finance to pull stock prices.

I would recommend doing the following:

function getLocationCaidaNetGeo($ip)
{
    $url = "http://www.myiptest.com/staticpages/index.php/IP-Lookup/".$ip;
    // file_get_contents() returns the page's HTML as a string, or false on failure.
    $content = file_get_contents($url);

    if ($content !== false)
    {
        // The code here should run through $content and pull out the city and return it.
        // The pattern is unchanged from the original post and still needs adjusting (see below).
        preg_match("/City:(.*)/i", $content, $temp) or die("Could not find user city!");
        // $temp[0] is the whole match; $temp[1] is the text captured after "City:".
        $output = $temp[1];
    }
    else
    {
        // Error: URL/page could not be pulled/retrieved.
        die('Could not retrieve data.');
    }

    return $output;
}


Some quick thoughts...
(1) I would have your function return boolean false if the city cannot be determined, instead of calling "die" with an error message. I suppose this is my preference.
(2) I am not great with regular expressions, so I am not 100% sure what is wrong with your code, but I think it has to do with your pattern.

Your pattern appears to be looking for something in the form "City:%%%%%%%". But you must realize that 'file_get_contents' (or your implementation) will return the HTML of the page. Regarding the city, the code looks like:
 <tr>
<td>City:</td>

<th>Glendale</th>
</tr>


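As you can see... the city is not on the same line as "City:". By default "." does not match a newline, so "(.*)" only captures the rest of that line (here "</td>") and never reaches the city, which is likely why you get "nothing" useful back. You need to adjust the regular expression so that it looks for "City:" and then pulls out the city from the following "th" tags. I am sorry that I am not able to help you much with this aspect.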
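
For what it's worth, here is a rough sketch of the kind of pattern that would be needed. It assumes the markup really looks like the snippet above (a "td" holding "City:" followed by a "th" holding the value); the live page may differ. The function name extractCity is made up for illustration and is not from the original code. It returns false rather than calling "die" when nothing matches.

```php
<?php
// Hypothetical helper: pull the city out of HTML shaped like the snippet above.
function extractCity($html)
{
    // Match "City:" inside a <td>, skip any whitespace/newlines between the tags,
    // then capture whatever sits inside the following <th>...</th>.
    // i = case-insensitive, s = let "." match newlines as well.
    $pattern = '~<td>\s*City:\s*</td>\s*<th>\s*(.*?)\s*</th>~is';

    if (preg_match($pattern, $html, $matches) === 1) {
        // $matches[1] is the captured group; drop stray tags and whitespace.
        return trim(strip_tags($matches[1]));
    }

    return false; // city could not be determined
}

// Sample input copied from the table rows quoted above.
$sample = "<tr>\n<td>City:</td>\n\n<th>Glendale</th>\n</tr>";
echo extractCity($sample), "\n"; // prints "Glendale"
```

The caller can then check the result with "=== false" and decide for itself how to report the error.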

rodriguez1804

12:40 am on Apr 20, 2010 (gmt 0)

10+ Year Member



Thanks a lot for the help CyBerAliEn.

You're right, there is an easier way to do this. I went ahead and implemented the geolocation using MaxMind GeoLite City. I guess I wanted to implement the above version so that I could get a better grasp of regular expressions, but I found a great tutorial for that too. So all is well. Thanks again!