Forum Moderators: coopster

Yet another ereg() expression

Need to match URL (both online and localhost)

         

tomda

9:06 am on May 25, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Any help from a regular expression guru would be highly appreciated.

I just want a script that checks that a URL is valid, since the data is passed in a hidden form field ($_POST), and that it matches our base URL (both local and online).

The code below matches the first example but not the other two.

Thanks

// CHECK URL
// *********
function check_url($url) {
    if (eregi("^http\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)?$", $url)) {
        return TRUE;
    } else {
        return FALSE;
    }
}

$back_url = "http://www.example.com/";
// $back_url = "http://localhost/www.example.com/";
// $back_url = "http://www.example.com/eng/data.php";

if (!check_url($back_url)) {
    echo "DO NOT MATCH";
} else {
    echo "MATCH";
}

coopster

4:17 pm on May 25, 2006 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Your regular expression will miss on localhost because there is no period before the first slash after localhost; the pattern requires a dot followed by a two- or three-letter suffix in the host name.

The second test actually will match once the pattern is corrected (you should be using a character class rather than parentheses around the optional trailing part of your expression: [/\S]*). That says to find zero or more slashes or non-space characters at the end of the expression.
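To make the localhost point concrete, here is a minimal sketch under preg_match(). The $relaxed pattern is only an illustrative assumption, not a full URL validator: it treats the dotted part of the host as optional so that a bare host name such as localhost also passes.

```php
<?php
// $strict mirrors the original pattern: the host must end in ".xx" or ".xxx".
$strict  = '#^http://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,3}(/\S*)?$#';
// $relaxed (hypothetical variant): one label, then zero or more ".label" parts.
$relaxed = '#^http://[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(/\S*)?$#';

$urls = array(
    'http://www.example.com/',
    'http://localhost/www.example.com/',
    'http://www.example.com/eng/data.php',
);

foreach ($urls as $url) {
    // preg_match() returns 1 on a match and 0 on no match.
    printf("%-40s strict=%d relaxed=%d\n",
        $url, preg_match($strict, $url), preg_match($relaxed, $url));
}
```

Only the localhost URL fails the strict pattern here; all three pass the relaxed one. The relaxed pattern accepts any host name, so it checks the shape of the URL, not whether it belongs to your site.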

The best way to troubleshoot regular expressions is to use the PCRE [php.net] functions and capture subpatterns. You will quickly be able to spot what is going on. Example:

function check_url($url)
{
    if (preg_match("#^http\://([a-zA-Z0-9\-\.]+)(\.)([a-zA-Z]{2,3})([/\S]*)$#", $url, $matches)) {
        print '<pre>'; print_r($matches); print '</pre>';
        return true;
    } else {
        print '<pre>'; print_r($matches); print '</pre>';
        return false;
    }
}
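As a rough illustration of what the captured subpatterns look like, the sketch below returns them instead of printing them. The function name inspect_url() is made up for this example. One thing worth keeping in mind: \S is a Perl-style shorthand that POSIX extended regular expressions do not define, so a pattern containing it may not behave the same under eregi() as under preg_match().

```php
<?php
// Same pattern as in the example above, but returning the captures
// instead of printing them. inspect_url() is a hypothetical helper name.
function inspect_url($url)
{
    $pattern = '#^http\://([a-zA-Z0-9\-\.]+)(\.)([a-zA-Z]{2,3})([/\S]*)$#';
    // On a match, $matches[0] is the whole URL and [1]..[4] are the groups.
    return preg_match($pattern, $url, $matches) ? $matches : array();
}

$m = inspect_url('http://www.example.com/eng/data.php');
print_r($m);
// $m[1] is the host up to the last dot, $m[3] the suffix, $m[4] the path.

// No dot before the first slash, so nothing is captured for this one.
print_r(inspect_url('http://localhost/www.example.com/'));
```

Looking at which group is empty or shorter than expected usually shows which part of the pattern needs adjusting.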

I think your bigger issue is how to have the pattern match whether you are running on your test box or your production box, without having to modify your expression for each. There are a couple of ways to do this. First, I find it much easier to set up VirtualHost containers in the Apache configuration and then modify your local machine's hosts file to point to your internal server rather than the external web site. This gets rid of the whole "localhost" issue. If that is not an option, you could always use a $_SERVER variable in the first part of your regex: $_SERVER['SERVER_NAME'] or $_SERVER['HTTP_HOST'], perhaps.
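A minimal sketch of the $_SERVER approach might look like the following. The helper name check_back_url() and the 'localhost' fallback are assumptions for illustration only. Note that HTTP_HOST is taken from the request's Host header, so SERVER_NAME or a configured value may be the safer source.

```php
<?php
// Hypothetical helper: accept only http URLs on the given host.
function check_back_url($url, $host)
{
    // preg_quote() escapes regex metacharacters in the host (such as dots);
    // passing '#' as the second argument also escapes our delimiter.
    $pattern = '#^http://' . preg_quote($host, '#') . '(/\S*)?$#i';
    return preg_match($pattern, $url) === 1;
}

// In a real request the host would come from $_SERVER; the fallback
// below is only there so the sketch also runs from the command line.
$host = isset($_SERVER['HTTP_HOST']) ? $_SERVER['HTTP_HOST'] : 'localhost';

var_dump(check_back_url('http://' . $host . '/eng/data.php', $host));
var_dump(check_back_url('http://other.example.org/', $host));
```

Because the host is quoted into the pattern at run time, the same code should work unchanged on the test box and on the production box.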

tomda

5:57 am on May 26, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks, Coopster, for your great explanation and for the tip about using preg_match to find out what's going on. It works great.
I'll go through my books over the weekend to come up with something that matches both URLs.

Regarding modifying the VirtualHost setup, sadly I can't do that on my host, unless it can be done with ini()... Anyway, as you said, I will use a $_SERVER variable to do the job.

Thanks again.
Tomda