Forum Moderators: coopster

Regular Expressions PHP

Parsing out different elements

         

ukgimp

4:14 pm on Jan 15, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have adapted an example of a search engine, which works OK at the moment. I would like to take it a stage further by extracting certain information from the page and storing that as well (e.g. <h1>This Stuff</h1> etc.), but I am falling down with the regex.

I am getting errors like
Unknown modifier '(' in c:\phpdev.....

What I am using is:
preg_match_all("<title>([.*?])<\/title>" ,$buf,$words);

Then it loops through the words and inserts them into the db.

The system works fine when I just do it on all the words, but that isn't quite what I am after. Any suggestions?

Regards

Dreamquick

4:46 pm on Jan 15, 2003 (gmt 0)



If memory serves I believe all brackets which aren't part of the reg exp stuff need to be escaped...

Does this work any better (ie no error message)?

preg_match_all("<title>\([.*?]\)<\/title>" ,$buf,$words);

-Tony

dingman

4:56 pm on Jan 15, 2003 (gmt 0)



PHP's Perl-compatible regular expression functions need the bracketing characters (delimiters) you would use in Perl to set off the expression. The first character is therefore interpreted as the opening delimiter, and from that the closing delimiter is deduced. Anything after the closing delimiter is a pattern modifier. In this case, your first character is '<', so the pattern ends with the first '>'. The '(' that follows it would therefore be a modifier, but it's not a recognized one.
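A minimal sketch of that delimiter rule, assuming nothing beyond stock PHP with the PCRE functions; the sample strings are made up for illustration. Because '<' and '>' act as the delimiters here, the regex that actually gets matched is just the word title:

```php
<?php
// '<' opens the pattern and '>' closes it, so the regex itself is just: title
$found = preg_match('<title>', 'subtitle');

// Anything after the closing '>' is read as modifiers; 'i' is a valid one,
// so this is the regex "title" matched case-insensitively.
$foundNoCase = preg_match('<title>i', 'SUBTITLE');

var_dump($found, $foundNoCase);
```

Swap the trailing 'i' for '(' and PHP complains about an unknown modifier, which is the error in the original post.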

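To get the effect you want, try
preg_match_all("{<title>(.*?)</title>}", $buf, $words);
or
preg_match_all("/<title>(.*?)<\/title>/", $buf, $words);

Note that the square brackets are gone as well: [.*?] is a character class that matches a single '.', '*' or '?', whereas (.*?) captures everything between the tags.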
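For illustration, here is a rough sketch of pulling out both the title and the <h1> text this way. The sample HTML, the '#' delimiter, the 'i' and 's' modifiers, and the $titles/$headings variable names are all assumptions made for this example, not something from the original script:

```php
<?php
// Hypothetical sample page; in the thread, $buf holds the fetched HTML.
$buf = '<html><head><title>My Page</title></head>'
     . '<body><h1>This Stuff</h1><h1>More Stuff</h1></body></html>';

// '#' as the delimiter avoids having to escape the '/' in closing tags.
// (.*?) is a lazy capture group; 'i' ignores case, 's' lets '.' match newlines.
preg_match_all('#<title>(.*?)</title>#is', $buf, $titles);
preg_match_all('#<h1>(.*?)</h1>#is', $buf, $headings);

// Index 0 holds the full matches, index 1 the text captured by the group.
print_r($titles[1]);
print_r($headings[1]);
```

The captured strings in index 1 are what you would then loop over and insert into the db. A regex like this is only a rough tool for HTML: tags with attributes (e.g. <h1 class="x">) will not match this pattern.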

Hope that helps

-Andrew