Getting a random word from a string.

adni18

12:59 am on Nov 27, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi. I would like to know how to get a random word from a string, preferably with a regexp, making sure that the word is not enclosed in a <script> or <noscript> tag or an HTML comment (<!-- -->). Could someone please help me?

DrDoc

9:51 pm on Nov 28, 2004 (gmt 0)

Assuming the string is contained in $_:

s{<script[ >].*?</script>|<noscript[ >].*?</noscript>|<!--.*?-->}{}isg;
s{<.*?>(.*?)</.*?>}{}sg;
split(/\b/);
$randomword = $_[rand($#_ - 1)]

adni18

10:52 pm on Dec 2, 2004 (gmt 0)

I'm sorry, but that doesn't work on my system. Am I incorporating this into my script the right way? BTW, I did replace the broken pipes.

#!/usr/bin/perl

use CGI::Carp qw(fatalsToBrowser);

$context="hi this is a test";
print "Content-type:text/html\n\n";
&getRandom($context);

sub getRandom {
s{<script[ >].*?</script>|<noscript[ >].*?</noscript>|<!--.*?-->}{}isg;
s{<.*?>(.*?)</.*?>}{}sg;
split(/\b/);
$randomword = $_[rand($#_ - 1)];
print $randomword;
}

DrDoc

10:19 pm on Dec 3, 2004 (gmt 0)

It should work, except for one thing -- add the following line to your subroutine:

sub getRandom {
$_ = $context;
...
}
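
As a side note, here is a rough sketch of an alternative that does not depend on the global $context: the sub copies its own argument into a localized $_, so the substitutions still have something to work on. The name strip_comments and the sample string are made up for illustration and are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: works on its argument rather than on a global variable.
sub strip_comments {
    my ($text) = @_;      # first argument passed by the caller
    local $_ = $text;     # s{}{} without =~ operates on $_, so set it here
    s{<!--.*?-->}{}sg;    # remove HTML comments, including multi-line ones
    return $_;
}

my $clean = strip_comments("hi <!-- hidden --> there");
print "$clean\n";
```

Using local keeps the caller's own $_ untouched once the sub returns.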

DrDoc

10:24 pm on Dec 3, 2004 (gmt 0)

Actually, one more thing...

In the split function the string is split by \b, which is a "word boundary". This, however, results in spaces and such being captured as their own 'words'. You may want to replace \b with an actual match of the word-delimiting characters you want to target. Perhaps something like this might work better:

split(/[^[:alnum:]]+/);
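
To make the difference visible, here is a small sketch that splits the test string from the script both ways and prints what comes back. The variable names are only illustrative, and the results are stored in explicit arrays rather than in @_.

```perl
use strict;
use warnings;

my $text = "hi this is a test";

# Splitting on \b keeps the whitespace between words as separate items.
my @by_boundary = split /\b/, $text;

# Splitting on runs of non-alphanumeric characters yields only the words.
my @by_delimiter = split /[^[:alnum:]]+/, $text;

print scalar(@by_boundary),  " items: ", join("|", @by_boundary),  "\n";
print scalar(@by_delimiter), " items: ", join("|", @by_delimiter), "\n";
```

One caveat: if the string starts with a delimiter, the second form returns an empty first item, so it may be worth filtering out empty strings.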

And, while you're at it... $#_ is the last index of @_, so rand($#_ - 1) can never pick the last two elements. The rand() call should actually be:

$randomword = $_[rand($#_ + 1)];
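
Putting the pieces of this thread together, a minimal sketch might look like the following. It makes a few assumptions that are not in the posts above: the helper name random_word is made up, the result of split goes into an explicit array (as far as I know, split stopped filling @_ implicitly in Perl 5.12), and the remaining tags are stripped while their text is kept, which differs from the second substitution posted earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns one random word from the visible text.
sub random_word {
    my ($text) = @_;

    # Remove script and noscript elements and HTML comments, content included.
    $text =~ s{<script[ >].*?</script>|<noscript[ >].*?</noscript>|<!--.*?-->}{}isg;

    # Assumption: drop any remaining tags but keep the text between them.
    $text =~ s{<[^>]*>}{ }g;

    # Keep only non-empty alphanumeric runs.
    my @words = grep { length } split /[^[:alnum:]]+/, $text;
    return unless @words;

    # rand(N) returns 0 <= x < N; the array subscript truncates it to an integer.
    return $words[ rand @words ];
}

print random_word("hi <script>var x = 1;</script> this <b>is</b> a <!-- hidden --> test"), "\n";
```

Here rand @words is equivalent to rand($#words + 1), so every word has a chance of being picked.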