I have a string that contains one or more quotations delimited by " and ". I'd like to wrap each quotation in italic tags, but I have no idea how to do that.
For example:
$str = "and he said: \"I am your father\"";
magic_function($str) outputs "and he said: <i>\"I am your father\"</i>"
Would someone be so kind as to help me?
Thanks a lot!
function italic_quotes($string) {
    $pattern = '%(".+?")%s';
    $replacement = '<i>$1</i>';
    $string = preg_replace($pattern, $replacement, $string);
    return $string;
}
[edited by: PHP_Chimp at 11:19 am (utc) on May 1, 2008]
$pattern = '%(".+?")%s'; ... should actually be $pattern = '%("[^"]+?")%';
The ()'s capture a " followed by any character that is not a ", one or more times (the +), followed by a closing ". The ? after the + means that you will get the shortest matching pattern.
So if you have -
he said "hello" then "bye"
You want two matches, "hello" and "bye", not one long match of "hello" then "bye". So you need the shortest matching pattern.
$replacement = '<i>$1</i>';
The captured patterns are stored so you can refer to them. You can use either \1 or $1 for the first pattern, \2 or $2 for the second, and so on for up to 99 patterns. The $ version is the suggested one, as there is then no confusion with the other escape sequences that all start with a \. You can get away with using ' around the replacement because it is the regex engine that substitutes the $1, not PHP's string engine.
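Putting the corrected pattern and the replacement together, here is a minimal sketch of the whole function. It keeps the name italic_quotes from the first post; the test sentence is only an illustration.

```php
<?php
// Sketch: wrap every "..." in <i> tags, using the corrected pattern.
function italic_quotes($string)
{
    // A double quote, one or more non-quote characters, a closing double quote.
    // % is used as the delimiter so that nothing inside needs escaping.
    $pattern = '%("[^"]+?")%';
    // $1 refers to the first captured group (the quote marks are included).
    $replacement = '<i>$1</i>';
    return preg_replace($pattern, $replacement, $string);
}

echo italic_quotes('he said "hello" then "bye"'), "\n";
// prints: he said <i>"hello"</i> then <i>"bye"</i>
```

Each quotation is matched and replaced separately, because preg_replace replaces every match it finds unless you pass a limit.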
<edit>
The reason for the change in $pattern is that the . matches any character, including ", so the original relies entirely on the lazy ? to stop at the closing quote. The negated class [^"] states that explicitly and should work with multiple quotes. It will still not work with nested quotes, although nested quotes should be single quotation marks, not double, if we are getting into grammar.
An improvement to this function would be to use curly quotes. The code below turns the start and end quotes into nice curly ones (I am assuming that you are writing in English, but you could change the code to suit any language).
$pattern = '%"([^"]+?)"%';
$replacement = '“<i>$1</i>”';
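As a rough sketch, the curly-quote variant can be wrapped in a function like this. The name curly_italic_quotes is made up for the example, and it assumes that both the script file and the output page are UTF-8, otherwise the curly quotes will come out garbled.

```php
<?php
// Sketch: swap straight double quotes for curly ones and italicise the text.
// Assumes this file is saved as UTF-8.
function curly_italic_quotes($string)
{
    // The quote marks are now outside the capture group, so $1 holds only
    // the text between them.
    $pattern = '%"([^"]+?)"%';
    $replacement = '“<i>$1</i>”';
    return preg_replace($pattern, $replacement, $string);
}

echo curly_italic_quotes('he said "hello" then "bye"'), "\n";
// prints: he said “<i>hello</i>” then “<i>bye</i>”
```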
[edited by: PHP_Chimp at 1:07 pm (utc) on May 1, 2008]
Your explanation is very interesting. The trick with ? is great.
I have another question:
Can I also tell it to match only text that is not enclosed in certain other characters?
Sticking with my example, let's assume there are formatting tags:
$string = "<div style=\"border:0px;\">He said \"Hello!\"</div>";
If I apply the function now, I get:
"<div style=<i>\"border:0px;\"</i>>He said <i>\"Hello!\"</i></div>";
Can I skip the occurrences that are inside < > ?
Thanks!
The full solution would involve lookaheads; however, there is a poor solution below that is a lot easier.
The poor solution is to check for an optional <.+?> on either side of the quotation, as this should match a tag.
$pattern = '%(?:<.+?>)?"([^"]+?)"(?:<.+?>)?%';
$replacement = '“<i>$1</i>”';
[edited by: eelixduppy at 4:25 pm (utc) on May 1, 2008]
[edit reason] disabled smileys [/edit]
print preg_replace($pattern, $replacement, "<div style=\"value\">text \"quote\"</div>");
I still get the wrong output.
But I get your idea: you transform only text between " marks that sits between <*> and <*>.
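For what it's worth, the lookahead approach mentioned earlier in the thread could be sketched roughly as follows. This is only one way to do it, not a full HTML parser: the function name italic_quotes_outside_tags is made up for the example, and it assumes reasonably well-formed markup where < and > appear only as tag delimiters and never inside the quoted text itself.

```php
<?php
// Sketch: italicise "..." only when it is not inside an HTML tag.
// Assumption: < and > are used only to open and close tags.
function italic_quotes_outside_tags($html)
{
    // "([^"<>]+)"    a quoted run containing no quote marks or angle brackets
    // (?![^<>]*>)    negative lookahead: reject the match if the next angle
    //                bracket is a >, which means we are still inside a tag
    $pattern = '%"([^"<>]+)"(?![^<>]*>)%';
    $replacement = '<i>"$1"</i>';
    return preg_replace($pattern, $replacement, $html);
}

echo italic_quotes_outside_tags('<div style="border:0px;">He said "Hello!"</div>'), "\n";
// prints: <div style="border:0px;">He said <i>"Hello!"</i></div>
```

The attribute value is skipped because the text that follows it reaches a > before any <, while the quotation in the body is followed by the < of the closing tag, so it is replaced. A quotation that itself contains < or > would be left alone by this sketch.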