Forum Moderators: coopster


Need explanation

         

Mister_L

11:46 pm on Aug 25, 2009 (gmt 0)

10+ Year Member



This is an example from the php manual.

The following code should output www.php.net:

<?php
preg_match('@^(?:http://)?([^/]+)@i',
    "http://www.php.net/index.html", $matches);
$host = $matches[1];
echo $host;
?>

If someone could explain all the details of how it works, I would be really grateful.

Thanks.

eelixduppy

11:55 pm on Aug 25, 2009 (gmt 0)



Essentially, you are matching a string against a pattern that you specify. In this case, the pattern is:
@^(?:http://)?([^/]+)@i
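To make the pieces easier to see, here is a rough sketch of the same pattern written with the x modifier, which lets you spread it over several lines and comment each part. The variable names $pattern and $found are my own and are not in the manual example.

```php
<?php
// Same pattern as above, plus the x modifier so whitespace is ignored
// and "#" starts a comment inside the pattern.
$pattern = '@
    ^               # anchor at the start of the subject string
    (?:http://)?    # optional, non-capturing "http://" prefix
    ([^/]+)         # capturing group 1: one or more characters that are not "/"
@ix';

$found = preg_match($pattern, "http://www.php.net/index.html", $matches);

echo $found, "\n";      // 1 when the pattern matched, 0 when it did not
echo $matches[1], "\n"; // www.php.net
```

The @ characters are just delimiters that mark where the pattern starts and ends, and the trailing i makes the match case-insensitive.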

There are numerous rules for building these patterns, so it's probably best to read the documentation first and then ask specific questions, which I'll be happy to answer.

You can find the pattern syntax here: [php.net...]

Hope that gets you started. :)

Mister_L

10:33 am on Aug 26, 2009 (gmt 0)

10+ Year Member



I'll look into the details of regular expressions.

However, what I really don't understand here is the meaning of $matches[1]. Could you clarify it?

Mister_L

1:29 pm on Aug 27, 2009 (gmt 0)

10+ Year Member



The manual says that $matches[1] should capture the first parenthesized subpattern. OK, so "http://" doesn't count because it starts with ?: (right?), then you have a question mark and then you have /, followed by another parenthesized subpattern, i.e. "index.html".

Why does $matches[1] get the value of the question mark in between?

Thanks.

coopster

3:35 pm on Aug 28, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



OK, so "http://" doesn't count because it starts with ?: (right?)

Correct.
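A quick way to see this for yourself is to compare the manual's pattern with a variant where the prefix group does capture. This is only a sketch; the second pattern and the names $url, $nonCapturing and $capturing are made up for the comparison.

```php
<?php
$url = "http://www.php.net/index.html";

// Non-capturing prefix, as in the manual example: the host is group 1.
preg_match('@^(?:http://)?([^/]+)@i', $url, $nonCapturing);

// Hypothetical variant with a capturing prefix: the host moves to group 2.
preg_match('@^(http://)?([^/]+)@i', $url, $capturing);

print_r($nonCapturing); // [0] => http://www.php.net, [1] => www.php.net
print_r($capturing);    // [0] => http://www.php.net, [1] => http://, [2] => www.php.net
```

Index 0 always holds the text matched by the whole pattern; the numbered entries after it follow the capturing parentheses from left to right.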

Why does $matches[1] get the value of the question mark in between?

It's not the question mark in between; that just tells the regular expression parser that the http:// is optional. The next capturing subpattern is ([^/]+), which tells the engine to capture one or more characters that are not a slash. The match therefore stops at the first / after the host, so "index.html" is never part of it.

Basically, you are grabbing the host/domain portion of the URI.
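If it helps, here is a minimal sketch that wraps the manual's pattern in a small function and tries it on a few inputs. The function name extractHost is hypothetical, and the type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper, not part of the manual example.
function extractHost(string $url): ?string
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    if (preg_match('@^(?:http://)?([^/]+)@i', $url, $matches) === 1) {
        return $matches[1]; // first capturing group: everything up to the first "/"
    }
    return null; // no match, e.g. an empty string or one starting with "/"
}

echo extractHost("http://www.php.net/index.html"), "\n"; // www.php.net
echo extractHost("www.php.net/index.html"), "\n";        // www.php.net (prefix is optional)
echo extractHost("HTTP://www.php.net"), "\n";            // www.php.net (i modifier)
var_dump(extractHost("/index.html"));                    // NULL
```

Note that the pattern only knows about http://. Given "https://www.php.net/", the optional prefix is skipped and group 1 ends up as "https:", so for anything beyond this example parse_url() is probably the safer choice.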