Forum Moderators: coopster


Change Links

urlencode a url

         

thewebboy

5:08 am on Jul 13, 2003 (gmt 0)

10+ Year Member



I have a document that contains a lot of URLs, and I need to change each link.

For example: http*//www.example.com/?sdfsdf=sdfsf
needs to become: go.php?url=http%3A%2F%2Fwww.example.com%2F%3Fsdfsdf%3Dsdfsf

So I'm using preg_replace and urlencode, and I tried:

$datax = preg_replace ("<a href=\"(.*?)\">", "<a href=\"go.php?url=" . urlencode($1) . "\">", $data);

Unfortunately I can't get it working. Any ideas?

vincevincevince

7:28 am on Jul 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



preg_replace won't quite work like that, I'm afraid. You've got to do it in two steps: preg_match, then preg_replace.

while (preg_match("/\<a href=\"(.*)\"\>/",$data,$matches))
$datax=preg_replace($matches[1],"go.php?url".urlencode($matches[1]),$data)

(While we can still match an <a href="something"> tag, we replace the something with "go.php?url".urlencode(something).)

notes:
- Be careful with relative paths: if you already have <a href="about.htm">, you _may_ not want that to be changed.
- The above solution will also (buggily) change URLs written in plain text if they exactly match a URL that appears inside a link. I'm sure you can figure out the changes needed.
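
A short sketch of why the single preg_replace call from the original question cannot encode the match (the sample link is made up for illustration). PHP builds the replacement string before preg_replace runs, so urlencode() only ever sees the literal text $1, never the captured URL. The unquoted $1 in the question would not even parse, so it is quoted here.

```php
<?php
// Sample input, assumed for illustration only.
$data = '<a href="http://www.example.com/?a=b">x</a>';

// urlencode() runs first, on the literal two characters "$1".
$replacement = '<a href="go.php?url=' . urlencode('$1') . '">';

// By the time preg_replace sees the replacement, the "$" is already
// encoded as %24, so there is no backreference left to fill in.
$datax = preg_replace('/<a href="(.*?)">/', $replacement, $data);

echo $replacement, "\n";
echo $datax, "\n";
```

The encoding has to happen per match, after the regex has captured the URL, which is what the loop-based attempts in this thread are trying to do.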

thewebboy

8:12 am on Jul 13, 2003 (gmt 0)

10+ Year Member



Thanks for your help.

I did a little testing and it took at least 5 minutes for the script to execute. I received this message: "Warning: Delimiter must not be alphanumeric or backslash" for line 11, which is:

$datax = preg_replace($matches[1],"go.php?url".urlencode($matches[1]),$data);

Any idea what's going on?

vincevincevince

8:30 am on Jul 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



sorry - it was early morning in the UK


$datax=$data;
while (preg_match("/\<a href=\"(.*)\"\>/",$datax,$matches))
$datax=str_replace($matches[1],"go.php?url".urlencode($matches[1]),$datax);

That should work. I know why it ran for so long as well; that should also be fixed now.

NOTE: code is wrong, see later post by me

[edited by: vincevincevince at 10:57 am (utc) on July 13, 2003]

ruserious

8:59 am on Jul 13, 2003 (gmt 0)

10+ Year Member



I think this will generate an infinite loop, won't it? The replacements will again be matched by the preg_match. Or am I missing something?

I think you can do it with preg_replace alone. I'll have a look...
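
A minimal sketch of why the loop above never ends, with the number of passes capped at three so that it terminates. The sample link is an assumption for illustration; the pattern and the replacement are the ones from the post above.

```php
<?php
// Sample input, assumed for illustration only.
$datax = '<a href="http://www.example.com/">x</a>';
$pattern = "/\<a href=\"(.*)\"\>/";
$lengths = array();

// Cap the loop at 3 passes; the uncapped while() would keep going.
for ($i = 0; $i < 3 && preg_match($pattern, $datax, $matches); $i++) {
    $datax = str_replace($matches[1], "go.php?url" . urlencode($matches[1]), $datax);
    $lengths[] = strlen($datax);
}

// The rewritten tag is still an <a href="..."> tag, so it still matches.
$stillMatches = preg_match($pattern, $datax);

print_r($lengths);
```

Each pass re-encodes the already rewritten href, so the string only gets longer and the pattern keeps matching.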

thewebboy

9:02 am on Jul 13, 2003 (gmt 0)

10+ Year Member



still having problems, it takes a really long time to execute and times out.

ruserious

9:10 am on Jul 13, 2003 (gmt 0)

10+ Year Member



Never mind, I totally overlooked the urlencode, sorry.
You could do it with preg_match_all, too. However, since you would still need a loop to replace the occurrences, you wouldn't really gain anything (if vincevincevince's code does terminate).

vincevincevince

10:54 am on Jul 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sorry again! I forgot; I always do this a different way that doesn't cause this problem...


$temp=$data;
while (preg_match("/\<a href=\"(.*)\"\>/",$temp,$matches))
{
$datax=str_replace($matches[1],"go.php?url".urlencode($matches[1]),$datax);
$temp=substr($temp,strlen($matches[0]));
}

(slaps self for being so stupid!)

thewebboy

11:50 pm on Jul 13, 2003 (gmt 0)

10+ Year Member



Unfortunately it now grabs all the tags at once.

Here is the fix:

while (preg_match("/\<a href=\"([^\"]*)\"\>/",$temp,$matches)){

$datax=str_replace($matches[1],"go.php?url=".urlencode($matches[1]),$datax);
$temp=substr($temp,strlen($matches[0]));
}
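
A small sketch of the difference between the two patterns, using two made-up links on one line. The greedy (.*) is what "grabs all the tags at once"; the negated character class [^"]* from the fix cannot run past a double quote.

```php
<?php
// Two links on one line, assumed for illustration only.
$temp = '<a href="a.html">A</a> <a href="b.html">B</a>';

// Greedy (.*) runs to the LAST "> on the line, swallowing both tags.
preg_match('/<a href="(.*)">/', $temp, $greedy);

// [^"]* cannot cross a double quote, so it stops at the end of the first URL.
preg_match('/<a href="([^"]*)">/', $temp, $bounded);

echo $greedy[1], "\n";   // a.html">A</a> <a href="b.html
echo $bounded[1], "\n";  // a.html
```

For this sample, the lazy form (.*?) from the original question would also stop at the first URL.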

weeno

4:57 am on Aug 7, 2003 (gmt 0)

10+ Year Member



hey all

I found this forum/thread on a google search. I used your code... but thought I'd add my corrections:

$temp=$data;
$datax=$temp;
while (preg_match("/\<a href=\"([^\"]*)\"\>/",$temp,$matches)){

$datax=str_replace($matches[0],"<a href=\"http://www.yoursite.com/c.php?u=".urlencode($matches[1])."\">",$datax);
$pos=strpos($temp,$matches[0]);
$temp=substr($temp,$pos + strlen($matches[0]));
}

I found that it was matching too many times. The substr offset wasn't advancing far enough (it skipped only the length of the match, not the text before the match), so it would match the same string multiple times. I also replaced on $matches[0], which is the entire match:

entire match: <a href="http://www.whatever.com">

If you used $matches[1]... which is just

http://www.whatever.com

then you'll get some weirdness if your original $data has plain view links. For example:

For more information see <a href="http://www.test.com">http://www.test.com</a>

With the original code, it would replace BOTH the actual link and the text for the link.

anyhow... thanks for your help. Hope this helps too.
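
A minimal sketch of the $matches[0] versus $matches[1] difference described above, using the example link from this post. The c.php?u= target is shortened here for readability, and the count variables are names chosen for this illustration.

```php
<?php
$data = 'For more information see <a href="http://www.test.com">http://www.test.com</a>';
preg_match('/<a href="([^"]*)">/', $data, $matches);

// Replacing the bare URL hits the href AND the visible link text.
$byUrl = str_replace($matches[1], 'c.php?u=' . urlencode($matches[1]), $data, $urlCount);

// Replacing the whole opening tag only touches the tag itself.
$byTag = str_replace(
    $matches[0],
    '<a href="c.php?u=' . urlencode($matches[1]) . '">',
    $data,
    $tagCount
);

echo $urlCount, "\n"; // 2
echo $tagCount, "\n"; // 1
echo $byTag, "\n";
```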

[edited by: jatar_k at 5:01 am (utc) on Aug. 7, 2003]
[edit reason] delinked [/edit]

vincevincevince

8:59 am on Aug 7, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




$text=preg_replace("/(\<[^\>]*a[^\>][^\>]*href[^\>]*=\s*[\"']?)([^\>]*?)([\"']?[^\>]*\>)([^\<]*)(\<[^\>]*\/a[^\>]*>)/ie","'$1http://mysite.com/goto.php?url='.urlencode('$2').'$3$4'",$text);

how's that for size?
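
The /e modifier used in that one-liner was deprecated in PHP 5.5 and removed in PHP 7.0, so it no longer runs on current PHP. Below is a minimal sketch of the same idea with preg_replace_callback, assuming PHP 5.3 or later for the closure. The simplified pattern (quoted href values only), the sample input, and the choice to skip anything that is not an absolute http/https URL are assumptions of this sketch, not part of the post above.

```php
<?php
// Sample input, assumed for illustration only.
$text = '<a class="x" href=\'http://www.example.com/?a=b\'>link</a> <a href="about.htm">About</a>';

$text = preg_replace_callback(
    // 1: everything up to the quote, 2: the quote character, 3: the URL
    '/(<a\b[^>]*\bhref\s*=\s*)(["\'])(.*?)\2/i',
    function ($m) {
        // Leave relative links such as about.htm untouched.
        if (!preg_match('#^https?://#i', $m[3])) {
            return $m[0];
        }
        // Re-emit the tag prefix and quotes around the redirect URL.
        return $m[1] . $m[2] . 'http://mysite.com/goto.php?url=' . urlencode($m[3]) . $m[2];
    },
    $text
);

echo $text, "\n";
```

Because the callback runs once per match, each captured URL is encoded individually, and the replaced text is never scanned again, so no loop is needed.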