href=\"[^h][^t][^t][^p][^:][^/][^/][^\"]+\", which works for some links, but fails on things like href="attribute.html", because each [^x] rejects that character at its own position and this URL has t's in the second and third positions. So this attempt can be ditched.
href=\"(^http:\/\/)[\"]+\", which doesn't work, and ends up matching only things like href="httpandthensomemore.html".
Beyond that, I couldn't figure out anything more. I need a regular expression that excludes the whole WORD http, not each individual character. This is not a check for links starting with http; it is for replacing relative links with absolute ones in the following piece of PHP code:
preg_replace("/href=\"<regex that works... hopefully>\"/", "href=\"http://address.com/\$1\"", $pieceOfHTML);
I posted this here rather than in the PHP section because the question is about the regular expression, not the PHP.
(Another, less important issue: the expressions I came up with require the relative URL to be at least 8 characters long, which is a real pain. If you have a fix for that as well, I'd appreciate it.)
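One way to exclude the whole word rather than individual characters is a negative lookahead, which PCRE (and so preg_replace) supports. The following is only a minimal sketch: the function name absolutizeHrefs is hypothetical, the base URL is the placeholder from the post, and also skipping https:// links is an assumption that goes beyond what was asked. Because the lookahead consumes no characters, the 8-character minimum goes away as well.

```php
<?php
// Hypothetical helper, not from the original post.
function absolutizeHrefs(string $html, string $base = 'http://address.com/'): string
{
    // "~" is used as the delimiter so the slashes need no escaping.
    // (?!https?://) is a negative lookahead: the match is abandoned when the
    // attribute value starts with the literal text "http://" or "https://".
    // ([^"]+) then captures the whole value, however short it is.
    $pattern = '~href="(?!https?://)([^"]+)"~i';

    // $1 in the replacement is the captured relative URL.
    // Assumes $base contains no "$" or "\" characters.
    return preg_replace($pattern, 'href="' . $base . '$1"', $html);
}
```

With this sketch, href="attribute.html" and href="a.htm" are both rewritten, while href="http://example.com/" is left untouched.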
preg_replace("/<a\shref=\"([^(http)+].*)\">/", "<a href=\"http://site.com/\$1\">", $pieceOfHTML);
The only catch is that it only works with relative URLs without a leading slash,
i.e. page.htm works but /page.htm doesn't. The resulting URL looks like this: [site.com...]
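The leading-slash problem can probably be handled by trimming the slash before joining the two parts. Note also that [^(http)+] is a character class, so it rejects any link whose first character is h, t, p, a parenthesis or a plus sign, not just links that start with the word http. A rough sketch, assuming a hypothetical function name absolutizeLinks and the site.com placeholder from the post, swaps the character class for a negative lookahead and uses preg_replace_callback so the path can be cleaned up with ltrim:

```php
<?php
// Hypothetical helper, not from the original post.
function absolutizeLinks(string $html, string $base = 'http://site.com'): string
{
    // Normalise the base so exactly one slash joins base and path.
    $base = rtrim($base, '/');

    return preg_replace_callback(
        // Match <a href="..."> openings whose value does not start with
        // http:// or https://; the value itself is captured in group 1.
        '~<a\s+href="(?!https?://)([^"]+)"~i',
        function (array $m) use ($base): string {
            // Drop any leading slash so "/page.htm" and "page.htm"
            // both become "http://site.com/page.htm".
            return '<a href="' . $base . '/' . ltrim($m[1], '/') . '"';
        },
        $html
    );
}
```

This treats every relative link as relative to the site root, which is an assumption; protocol-relative links such as //host/path and links like mailto: would need their own exclusions.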