RegEx Problem

         

rfontaine

8:25 pm on Apr 20, 2005 (gmt 0)



Hi all,

I have a regular expressions problem I cannot seem to figure out with my limited knowledge of the subject.

I have a web page with a number of URLs and would like to replace the spaces in them with underscores using regular expressions in PHP. For example:

[example.com...] page/
becomes [example.com...]

Any ideas?

Thank you in advance,
Ron

StupidScript

11:48 pm on May 4, 2005 (gmt 0)



I'm wondering what requires regexp in your example?

Your example could easily be handled in PHP by using:

$thisurl=str_replace(" ","_","http://www.example.com/this page/");

or even:

$thisurl=str_replace(" ","%20","http://www.example.com/this page/");

to substitute %20, the URL-encoded form of a space, instead.
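To make the difference between the two replacements concrete, here is a minimal sketch. The variable names are made up for illustration, and the rawurlencode() call is shown only as a related built-in for encoding a single path segment, on the assumption that you have the segment on its own.

```php
<?php
// Sample URL taken from the examples above.
$url = 'http://www.example.com/this page/';

// Replace each space with an underscore.
$withUnderscores = str_replace(' ', '_', $url);

// Replace each space with its percent-encoded form.
$withPercent20 = str_replace(' ', '%20', $url);

// rawurlencode() also encodes spaces as %20, but it encodes ':' and '/'
// too, so it is only suitable for one path segment, not a whole URL.
$segment = rawurlencode('this page');

echo $withUnderscores, "\n", $withPercent20, "\n", $segment, "\n";
```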

ergophobe

3:48 pm on May 7, 2005 (gmt 0)



I suspect the problem is that he doesn't know in advance what the URLs are or where they appear, and he wants to process a whole batch of them, correcting only the href part. Is that right, Ron?

So you have something like:


blah blah blah <a href="http://example.com/page 7">old page</a> blah blah blah or <a href="http://example.com/page 2354">new page</a>

Is that the situation? In that case, something like this should do it:



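// $matches[1] is the literal 'href="'; $matches[2] is the URL up to the closing quote
function stripSpaces ($matches)
{
    return $matches[1] . str_replace(' ', '_', $matches[2]);
}
$pattern = '/(href=")([^"]+)/i';
// $string holds the HTML you want to correct
$new = preg_replace_callback($pattern, "stripSpaces", $string);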
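For anyone who wants to try the callback above, here is a self-contained sketch. The sample text in $string is made up, and the callback is renamed spacesToUnderscores (a hypothetical name) so the snippet runs on its own.

```php
<?php
// Hypothetical sample input: two links whose href values contain spaces.
$string = 'blah <a href="http://example.com/page 7">old page</a> or '
        . '<a href="http://example.com/page 2354">new page</a>';

// $matches[1] is the literal 'href="'; $matches[2] is the URL up to,
// but not including, the closing quote.
function spacesToUnderscores(array $matches): string
{
    return $matches[1] . str_replace(' ', '_', $matches[2]);
}

$pattern = '/(href=")([^"]+)/i';
$new = preg_replace_callback($pattern, 'spacesToUnderscores', $string);

echo $new, "\n";
```

Only the text inside the quotes is touched, so the spaces in the link text ("old page", "new page") are left alone.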


This will only work if the URLs are in tags and they have matched quote marks. If they are in tags and have no quote marks and no other attributes (i.e. no class or onclick or anything like that), you could use a pattern such as

'/(<a href=)([^>]+)/i'
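A rough sketch of that unquoted variant follows. The sample inputs and variable names are invented for illustration. The second input shows why the "no other attributes" condition matters: everything up to the closing > is treated as the URL.

```php
<?php
// Pattern from above: everything between '<a href=' and '>' is the URL.
$pattern = '/(<a href=)([^>]+)/i';

// Replace spaces in the captured URL part only.
$fix = function (array $matches): string {
    return $matches[1] . str_replace(' ', '_', $matches[2]);
};

// Hypothetical input with an unquoted href and no other attributes.
$clean = '<a href=http://example.com/page 7>old page</a>';
$cleanFixed = preg_replace_callback($pattern, $fix, $clean);

// Hypothetical input with an extra attribute: the space before "class"
// is replaced too, which mangles the tag.
$messy = '<a href=http://example.com/page 7 class=nav>old page</a>';
$messyFixed = preg_replace_callback($pattern, $fix, $messy);

echo $cleanFixed, "\n", $messyFixed, "\n";
```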

If you just have a list of URLs, each on its own line, then that's fairly simple too. Then your pattern could look like this:
$pattern = '`(http://)(.*)`i'; // backticks as delimiters because the pattern contains /
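Here is a small sketch of the one-URL-per-line case, using made-up URLs in a hypothetical $list variable. It relies on the fact that . does not match a newline by default, so (.*) stops at the end of each line.

```php
<?php
// Hypothetical list: one URL per line, separated by "\n".
$list = "http://example.com/this page/\nhttp://example.com/other page 2/\n";

// Backticks as delimiters because the pattern contains /.
$pattern = '`(http://)(.*)`i';

// $matches[2] is the rest of the line after 'http://'.
$fixedList = preg_replace_callback(
    $pattern,
    function (array $matches): string {
        return $matches[1] . str_replace(' ', '_', $matches[2]);
    },
    $list
);

echo $fixedList;
```

Note that any trailing spaces on a line would also become underscores, so it may be worth trimming each line first.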

If you have something like this though


blah blah blah [example.com...] 7/ is a good page

You're in trouble since you would need artificial intelligence to know where the URL is supposed to end unless they all terminate with something known that won't appear elsewhere (like ".htm" or ".php").
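If the URLs do all end in a known extension, a sketch along these lines might work. The extension list (htm, html, php), the sample sentence and the variable names are assumptions for illustration. The lazy .*? stops at the first such extension, so a line containing a space-free URL followed by another URL later on would still be matched too far.

```php
<?php
// Hypothetical running text with a URL that contains spaces.
$text = 'See http://example.com/my page 7.htm for details';

// Match from 'http://' up to the first .htm, .html or .php that is
// followed by a word boundary. The extension list is an assumption.
$pattern = '`(http://)(.*?\.(?:html?|php))\b`i';

$fixedText = preg_replace_callback(
    $pattern,
    function (array $matches): string {
        return $matches[1] . str_replace(' ', '_', $matches[2]);
    },
    $text
);

echo $fixedText, "\n";
```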