I'm trying to tidy up some text, specifically involving users pasting multiple paragraphs of text from Word. I need to check for and remove any \t's used for indenting in Word at the start of paragraphs.
Since I want to focus on the stuff between paragraphs, my regex includes a word boundary at the end. But, what should my replacement be? Here's my code:
$abstract_db = preg_replace('/\n\t+\w+/','\n\n',$abstract_db);
Basically, it looks for a newline followed by a tab that's also followed by a word. It's supposed to replace the newline-tab with newline-newline and hopefully preserve the word (somehow).
The script is now cutting off the first word of the second paragraph, and I know why: the replacement is the problem. But, what should I specify in the replacement part to "retain" the first word? Would \w+ in the replacement know to use the same text as what it found by \w+ in the search?
Is this where $1 and $2 come in?
Thanks!
Glenn
It seems you may be able to use str_replace() or ereg_replace() to search the string(s) for tabs and replace them with 3 or 4 non-breaking spaces.
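On the $1 question: yes, that is what it is for. Wrapping \w+ in parentheses captures the word, and $1 in the replacement puts it back. Here is a minimal sketch of two ways to do it. The sample string is made up for illustration, and the variable names other than $abstract_db are hypothetical. Also note that the replacement needs double quotes: as far as I can tell, a single-quoted '\n\n' is inserted as literal backslash-n characters rather than newlines.

```php
<?php
// Hypothetical sample input: two paragraphs, the second indented with a tab.
$abstract_db = "First paragraph.\n\tSecond paragraph starts here.";

// Option 1: capture the first word with (...) and restore it with $1.
// The replacement is double-quoted so \n becomes a real newline; the $ is
// escaped (\$1) so PHP passes "$1" through to preg_replace untouched.
$with_capture = preg_replace('/\n\t+(\w+)/', "\n\n\$1", $abstract_db);

// Option 2: use a lookahead (?=\w), which checks that a word character
// follows but does not consume it, so there is nothing to put back.
$with_lookahead = preg_replace('/\n\t+(?=\w)/', "\n\n", $abstract_db);

// Both approaches should give the same result for this input.
var_dump($with_capture === $with_lookahead);
echo $with_capture;
```

The lookahead version is arguably the simpler of the two, since the replacement never has to mention the word at all. If the tabs can simply go regardless of what follows them, dropping the \w part entirely would also work.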