Forum Moderators: coopster


Domain matching

         

ocon

5:27 pm on Nov 10, 2010 (gmt 0)




I'm trying to check whether the domain in a $url, or any parent domain of it, matches an entry in a list.

$domain = "." . strtolower( parse_url( $url , PHP_URL_HOST ));
if( substr( $domain , 1 , 4 ) == "www." ) $domain = substr( $domain , 4 );

$array = array( ".example.com" , ".sub.domain.com" , ".edu" , ".gov" );

if( in_array( $domain , $array )){ /* matches, do something... */ }
else{ /* doesn't match, do something else... */ }


This would match URLs like:

http://www.example.com/
http://example.com/


But I would like to modify the script to also match:

http://sub.example.com/
http://www.sub.example.com/
http://anothersub.sub.example.com/
http://www.brown.edu/
http://www.yard.edu/
http://www.cdc.gov/
http://www.fbi.gov/


But not:

http://www.domain.com/
http://domain.com/
http://notexample.com/
http://www.somethingelse.com/
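
One way to get the behaviour listed above while still using in_array() is to test the host and then each of its parent domains in turn. This is only a minimal sketch, not tested against every kind of URL; the function name host_in_list() is made up for illustration, and it assumes the list entries are written without a leading period.

```php
<?php
// Hypothetical helper: returns true if the host of $url, or any parent
// domain of that host, appears in $list (entries have no leading period).
function host_in_list($url, array $list)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // no usable host in the URL
    }

    $labels = explode('.', strtolower($host));

    // For "www.sub.example.com" this tests, in order:
    // "www.sub.example.com", "sub.example.com", "example.com", "com".
    while ($labels) {
        if (in_array(implode('.', $labels), $list, true)) {
            return true;
        }
        array_shift($labels); // drop the leftmost label, try the parent
    }

    return false;
}

$list = array('example.com', 'sub.domain.com', 'edu', 'gov');

var_dump(host_in_list('http://anothersub.sub.example.com/', $list)); // bool(true)
var_dump(host_in_list('http://www.brown.edu/', $list));              // bool(true)
var_dump(host_in_list('http://notexample.com/', $list));             // bool(false)
var_dump(host_in_list('http://www.domain.com/', $list));             // bool(false)
```

Because whole labels are compared, "notexample.com" cannot be mistaken for "example.com", and no special handling of "www." is needed.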

ocon

12:24 am on Nov 11, 2010 (gmt 0)




OK, I have a better idea on how to match domains, but I still need some help.

$array = array( "example.com" , "sub.domain.com" , "edu" , "gov" );
$url = "." . strtolower( parse_url( $url , PHP_URL_HOST ));

1. What I would like to do is to strip all leading periods (if any) before each item in the array. (In the array above there are none.)

2. Add one period in front of each item in the array. (Sure, it undoes what step 1 might have done, but it ensures that all the items are consistent.)

Then

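$matched = false; // assumes steps 1 and 2 have already been applied to $array
foreach( $array as $item ) if( substr( $url , -1 * strlen( $item )) == $item ) $matched = true;
if( $matched ){ /* matches, do something... */ }
else{ /* doesn't match... */ }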
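
Steps 1 and 2 above can probably be done with ltrim(), which strips every leading period in one call. The following is a rough sketch only; normalize_suffix() and url_matches_suffixes() are names invented here, and the sample list deliberately mixes entries with and without leading periods to show the normalization.

```php
<?php
// Hypothetical helper for steps 1 and 2: strip all leading periods,
// then add exactly one back.
function normalize_suffix($item)
{
    return '.' . ltrim(strtolower(trim($item)), '.');
}

// Hypothetical helper: true if "." . host ends with any normalized suffix.
function url_matches_suffixes($url, array $suffixes)
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false; // no usable host in the URL
    }
    $host = '.' . strtolower($host);

    foreach (array_map('normalize_suffix', $suffixes) as $suffix) {
        // A negative start makes substr() return the tail of the string.
        if (substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

// Inconsistent leading periods on purpose.
$array = array('example.com', '..sub.domain.com', '.edu', 'gov');

var_dump(url_matches_suffixes('http://www.sub.example.com/', $array)); // bool(true)
var_dump(url_matches_suffixes('http://notexample.com/', $array));      // bool(false)
```

The leading period on both sides is what keeps "notexample.com" from matching: ".notexample.com" does not end in ".example.com".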

Readie

9:13 am on Nov 11, 2010 (gmt 0)




Why not just use a regular expression?


if(preg_match('/(?:(?:http:\/\/|www\.)|http:\/\/www\.)(?:[a-z\d]+\.)*([a-z][a-z0-9\.\-]{2,63}(\.[A-Za-z]{2,6}){1,2})(\/[^\/]+)*\/?/im', $input, $out)) {
    if(in_array($out[1], $list_of_domains)) {
        // We found your domain
        $url = parse_url($out[0]);
    }
}


This might not be working code, I typed it on the fly without testing it - it's just to give you the idea.
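
A simpler variation on the regular expression idea is to let parse_url() extract the host and build a single anchored pattern from the list itself. Treat this as a sketch under assumptions rather than a finished solution: build_domain_pattern() is a hypothetical name, the list is assumed to be non-empty, and its entries are assumed to have no leading period.

```php
<?php
// Hypothetical helper: builds one anchored pattern from a non-empty list
// of domains written without a leading period.
function build_domain_pattern(array $domains)
{
    $quoted = array();
    foreach ($domains as $domain) {
        // preg_quote() escapes the periods so they match literally.
        $quoted[] = preg_quote(strtolower($domain), '/');
    }
    // (?:^|\.) forces the match to begin at a label boundary;
    // $ with the D modifier anchors it at the very end of the host.
    return '/(?:^|\.)(?:' . implode('|', $quoted) . ')$/iD';
}

$pattern = build_domain_pattern(array('example.com', 'sub.domain.com', 'edu', 'gov'));
$host = (string) parse_url('http://www.sub.example.com/', PHP_URL_HOST);

if (preg_match($pattern, $host)) {
    echo "match\n";
} else {
    echo "no match\n";
}
```

Matching against the host alone avoids having to describe the scheme, path, and "www." prefix inside the pattern.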