
Perl Server Side CGI Scripting Forum

    
perl script replace links in text
marryshelly




msg:4002122
 2:24 pm on Oct 6, 2009 (gmt 0)

Hi, I want to replace some links in some text. They are either <img src="" /> or <a href="" />.

I have the following path that I want to replace:

href="/faqs/default.aspx"

with this virtual directory path

href="/en/faqs/default.aspx"

Can anyone help me?

 

rocknbil




msg:4002432
 8:38 pm on Oct 6, 2009 (gmt 0)

Welcome aboard, marryshelly. I typed this out on the fly and tested it; it should get you started.


#!/usr/bin/perl
use strict;
use warnings;

# Value that replaces every href/src attribute value found below.
my $my_replacement = '/faqs/default.aspx';
my @tests = (
    'This is an empty link <a href="">testme</a>.',
    'This is a real link <a href="http://example.com">testme</a>.',
    'This is a link using single quotes <a href=\'http://example.com\'>testme</a>.',
    'This is a 4.01 image <img src="http://example.com/test.jpg">.',
    'This is an XHTML image <img src="http://example.com/test.jpg" />.'
);
foreach my $line (@tests) {
    # Group 1 is the tag up to the "=", group 2 the opening quote; \2 matches the same closing quote.
    if ($line =~ s/(<(?:a|img)\b[^>]*?\b(?:href|src)\s*=\s*)(["'])[^"']*\2/$1$2$my_replacement$2/ig) {
        print "New line is now \"$line\"\n";
    }
}

Note that:
- This message board borks the pipe character, the one you get if you hold the shift key and type \. If the alternations in the regex above show up as a broken bar (¦), be sure to replace them with the actual pipe (|).

- Use XHTML syntax in your code only if you require XHTML features and your server sends an application/xhtml+xml Content-Type header; most servers send text/html by default (i.e., don't misrepresent your document type).
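
For the original question (turning href="/faqs/default.aspx" into href="/en/faqs/default.aspx"), a narrower substitution may be enough. What follows is only a sketch: the sub name add_dir_prefix and its parameters are made up for illustration, and it assumes the path is the whole value of a quoted href or src attribute.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): insert $prefix in front of one
# specific root-level $path wherever it is the whole value of a quoted
# href or src attribute.
sub add_dir_prefix {
    my ($text, $path, $prefix) = @_;
    # \Q...\E quotes regex metacharacters in $path (the "." in "default.aspx").
    # Group 1 is the attribute name, "=" and opening quote; group 2 is the
    # quote itself, and \2 requires the same quote to close the value.
    $text =~ s/(\b(?:href|src)\s*=\s*(["']))\Q$path\E\2/$1$prefix$path$2/gi;
    return $text;
}

my $html = 'See the <a href="/faqs/default.aspx">FAQ</a>.';
print add_dir_prefix($html, '/faqs/default.aspx', '/en'), "\n";
```

Because the path has to start right after the opening quote, a value that already reads /en/faqs/default.aspx is left alone if the text is processed twice.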

[edited by: phranque at 9:21 pm (utc) on Oct. 6, 2009]
[edit reason] disabled graphic smileys ;) [/edit]

marryshelly




msg:4002756
 7:40 am on Oct 7, 2009 (gmt 0)

Hi rocknbil, thanks for the quick reply; it really helps me fix my problem.

marryshelly




msg:4002773
 7:55 am on Oct 7, 2009 (gmt 0)

I have another issue regarding my site which has virtual directories.

it has
en = english
es = spanish
de = german

etc.

What can I do if the page lives in one of the directories mentioned above, each with its own root level (English, Spanish, etc.)?

How do I add that directory to the existing links, so that
the links I have (img or a)
href="content/something/default.aspx"
src="content/something/me.jpg"

to look like this
href="en/content/something/default.aspx"
src="en/content/something/me.jpg"

I would be pleased if someone could help me out with this!

I should also mention that the links are inside some text that I want to modify!
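
One possible approach for the relative paths above, as a rough sketch only: the sub name prefix_relative_links is hypothetical, and it assumes the attribute values are quoted. Absolute URLs, root-level paths, fragments and values that already start with the language directory are left untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix relative href/src values with a language
# directory such as "en". Assumes quoted attribute values.
sub prefix_relative_links {
    my ($text, $lang) = @_;
    $text =~ s{
        ( \b (?:href|src) \s* = \s* (["']) )  # group 1: name, "=", opening quote (group 2)
        (?!                                   # leave these values untouched:
            [a-z][a-z0-9+.-]* :               #   a scheme such as http: or mailto:
          | [/\#]                             #   root-level path, //host, or fragment
          | \Q$lang\E /                       #   already prefixed with the language
        )
        ( [^"']+ )                            # group 3: the relative path
        \2                                    # closing quote matching group 2
    }{$1$lang/$3$2}gix;
    return $text;
}

my $text = '<a href="content/something/default.aspx">page</a> '
         . '<img src="content/something/me.jpg">';
print prefix_relative_links($text, 'en'), "\n";
```

A regex like this only looks at the attribute text, so it would also rewrite something like data-href="..."; for messy HTML a real parser is usually the safer choice.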

phranque




msg:4002892
 12:25 pm on Oct 7, 2009 (gmt 0)

welcome to WebmasterWorld [webmasterworld.com], marryshelly!

it would be easier to write a simple yet suitable regular expression for your problem if you described in more detail what the text patterns look like.
for example:
are you always using double quotes for quoting your href/src attribute values?
are the links in the "text you want to modify" fully qualified urls, as in http://(subdomain.)?example.tld/path and if the protocol specifier isn't specified how would you be able to recognize it as a url?

have you looked at the regular expression references and tutorials linked in the Perl Server Side CGI Scripting forum Charter [webmasterworld.com]?
while you are there note the item about "Do My Homework posts".
=8)
it would be preferable if you showed your best attempt at coding this and described the incorrect results.

marryshelly




msg:4002920
 1:21 pm on Oct 7, 2009 (gmt 0)

As rocknbil showed in his example, I have a text string that can hold hyperlinks and images whose paths I want to change.

my $text = '<p>Some text in the editor <a href="/content/images/background.gif" target="background">/content/images/background.gif</a></p>
<p><img height="289" alt="" src="/content/images/background.gif" width="786" border="0"></p>
';

And yes, the link attributes are double-quoted, and they do not hold fully qualified URLs but root-level paths, as in $text.

Can someone shed some light on this?

marryshelly




msg:4002940
 1:54 pm on Oct 7, 2009 (gmt 0)

The problem is that the site is generated by a CMS that does not know the site structure.

Say my site is http://www.example.com/ with virtual directories (en, es, de), and every directory has its own language content.

For example, if you go to my site you will be able to choose different language content for different countries.

so it will look like this
http://www.example.com/en/ - english
http://www.example.com/es/ - spanish
http://www.example.com/de/ - german

When I generate content in my CMS, how can I add the language to my root-level paths, as in "en/content/etc"?

Any idea how to do this?
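
A sketch of how the language directory could be added to root-level paths, using the sample text posted earlier. The sub name add_language_dir and the %LANGUAGES table are assumptions for illustration; the sketch also assumes double-quoted attribute values, as described above.

```perl
use strict;
use warnings;

# Language directories named in the thread.
my %LANGUAGES = (en => 'english', es => 'spanish', de => 'german');

# Hypothetical helper: put "/$lang" in front of root-level href/src paths.
sub add_language_dir {
    my ($html, $lang) = @_;
    die "Unknown language '$lang'\n" unless exists $LANGUAGES{$lang};
    # Alternation of the known directories, used to skip already-prefixed paths.
    my $known = join '|', map { quotemeta($_) } sort keys %LANGUAGES;
    # Match one leading "/" that is not followed by another "/" (//host)
    # or by a known language directory.
    $html =~ s{(\b(?:href|src)\s*=\s*")/(?!/|(?:$known)(?:/|"))}{$1/$lang/}gi;
    return $html;
}

my $text = '<p>Some text in the editor <a href="/content/images/background.gif" target="background">/content/images/background.gif</a></p>
<p><img height="289" alt="" src="/content/images/background.gif" width="786" border="0"></p>';
print add_language_dir($text, $_), "\n" for qw(en es de);
```

Only the attribute values change; the visible link text between <a> and </a> stays as it was, and a path that already starts with /en/, /es/ or /de/ is not prefixed again.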

[edited by: phranque at 9:44 pm (utc) on Oct. 7, 2009]
[edit reason] examplified domains [/edit]

marryshelly




msg:4005550
 1:54 pm on Oct 12, 2009 (gmt 0)


System: The following message was spliced on to this thread from: http://www.webmasterworld.com/perl/4005548.htm [webmasterworld.com] by phranque - 3:09 pm on Oct. 27, 2009 (utc -7)


Hello, can anyone help me?

I have this:
====================================

<iw_perl>
my $tests = "some text with links <a href="/design/images/default.aspx" > Link </a> and some images in this string <img src="\design\images\dog.jpg" />";

my $pattern = array(" href="\/"," src="\/")
my $replacement = array(" runat="server" href="~\/"," runat="server" src="~\/");

my $result = preg_replace($pattern,$replacement, $tests);

print $result;

</iw_perl>

====================================

How can I add a runat="server" attribute to each link and change its path to start with "~/", as in href="~/"? Are there any solutions? Can anyone guide me?
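
Note that preg_replace() and array() are PHP functions, so the attempt above will not run as Perl; the usual Perl tool for this is the s/// operator. A minimal sketch under stated assumptions: the sub name add_runat_server is made up, attribute values are double-quoted, and the image path in the sample is written with forward slashes (the post shows backslashes).

```perl
use strict;
use warnings;

# Hypothetical helper: add runat="server" and rewrite a root-level
# href/src value "/path" to "~/path". Assumes double-quoted values.
sub add_runat_server {
    my ($text) = @_;
    # (?<!\S) requires the attribute name to start after whitespace;
    # (?!/) leaves protocol-relative "//host" values alone.
    $text =~ s{(?<!\S)(href|src)\s*=\s*"/(?!/)}{runat="server" $1="~/}gi;
    return $text;
}

# q{} avoids having to escape the double quotes inside the sample text.
# Assumption: forward slashes in the image path, unlike the original post.
my $tests = q{some text with links <a href="/design/images/default.aspx" > Link </a> and some images in this string <img src="/design/images/dog.jpg" />};
print add_runat_server($tests), "\n";
```

Values that already start with "~/" no longer match, so running the substitution a second time does not add another runat attribute.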

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved