rss from perl script

creating a rss file from perl

   
11:27 am on Oct 30, 2008 (gmt 0)

5+ Year Member



Hello again, this is my second topic.

I am working on an RSS feed, generated from Perl, which takes all of its data from a log kept elsewhere.

this is the script:


foreach my $species ($tree->find_by_tag_name('log')){

    my $revision = $species->find_by_tag_name('logentry')->attr_get_i('revision');
    my $action = $species->find_by_tag_name('path')->attr_get_i('action');
    my $actionContent = $species->find_by_tag_name('path')->as_text;
    my $editBy = $species->find_by_tag_name('author')->as_text;
    my $dateMod = $species->find_by_tag_name('date')->as_text;
    my $msg = $species->find_by_tag_name('msg')->as_text;

    foreach my $usee ($use->find_by_tag_name('log')){

        my $revisionCompare = $usee->find_by_tag_name('logentry')->attr_get_i('revision');
        if($revisionCompare > 0 ){
            open my $logfile, ">", $file or die "Failed to open rss file, error: $!";
            print $logfile $info."\n";
            print $revision."\n".$action."\n".$actionContent."\n".$editBy."\n".$dateMod."\n".$msg;
        }

    }
}

All this does is take the info and store it in variables, so how can I now create an RSS file?

I hope somebody can help me.

greetings.

9:06 am on Nov 5, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



you should look at the XML::RSS perl module on cpan [search.cpan.org].
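
As a rough sketch of what that suggestion could look like, the following builds a one-item RSS 2.0 feed with XML::RSS and saves it. It assumes the module is installed from CPAN; the channel details and the values in %entry are made up for illustration and stand in for the variables ($revision, $editBy, $msg) collected in the script above.

```perl
use strict;
use warnings;
use XML::RSS;   # CPAN module, assumed to be installed

my $rss = XML::RSS->new(version => '2.0');

# Channel details are placeholders.
$rss->channel(
    title       => 'Repository changes',
    link        => 'http://example.com/',
    description => 'Recent log entries',
);

# Hypothetical values standing in for $revision, $editBy and $msg.
my %entry = (revision => 42, author => 'alice', msg => 'Fixed a typo');

$rss->add_item(
    title       => "Revision $entry{revision} by $entry{author}",
    link        => "http://example.com/log/$entry{revision}",
    description => $entry{msg},
);

# Write the feed to disk; as_string returns the same XML as a string.
$rss->save('rss.xml');
```

In a real script the add_item call would sit inside the foreach loop, with save called once after the loop.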

5:21 am on Nov 15, 2008 (gmt 0)

5+ Year Member



Hi,

If you use foreach, it builds the whole list first and then processes it. You can use while instead of foreach here.
It'll process the input line by line.
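
Assuming this refers to reading from a file handle, the difference can be sketched as below. The in-memory "file" in $log is invented so the example is self-contained. A while loop reads one line per iteration, whereas foreach (<$fh>) would read every line into a list before the first iteration runs.

```perl
use strict;
use warnings;

# Made-up log contents, opened as an in-memory file for the example.
my $log = "line one\nline two\nline three\n";
open my $fh, '<', \$log or die "Cannot open in-memory file: $!";

my @seen;
# <$fh> in scalar context returns one line per call.
while (my $line = <$fh>) {
    chomp $line;
    push @seen, $line;
}
close $fh;

print scalar(@seen), " lines read\n";
```

For a loop over a list that a method has already returned, as in the original script, foreach is the natural choice.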

11:08 am on Dec 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Funny, I just wrote a script to do this last week. Once you've got your variables into an array, just build an output string, and write it to a file.

$output = q[<?xml version="1.0" ?>  
<rss version="2.0">
<channel>
<title>My Site Updates</title>
<link>http://example.com/</link>
<description>Blah blah blah blah blah.</description>
];

for $counter (1..15) {
    ($date,$title,$link,$descrip) = split("\t",$data[$counter]);
    $descrip =~ s¦<a href[^>]*>¦¦g; # strip out opening <a href=...> tags
    $descrip =~ s¦</a>¦¦g; # strip out closing </a> tags
    ($month,$day,$year) = split('-',$date);
    if (length($day)==1) {$day="0$day";}
    $year+=2000;
    $output.= qq[
<item>
<title>$title</title>
<link>http://example.com$link</link>
<description>$descrip</description>
<pubDate>$day $months[$month] $year 08:00 PST</pubDate>
</item>\n];
}
$output .= '</channel></rss>';

open (FILE,'>rss.xml') ¦¦ die $!;
print FILE $output;
close (FILE);
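
The code above uses a @months array that is not shown. A minimal sketch of the date conversion follows, assuming the input is "month-day-year" with a two-digit year, as the split on '-' and the $year+=2000 suggest. The sub name pub_date and the contents of @months are assumptions, and the fixed "08:00 PST" is copied from the post.

```perl
use strict;
use warnings;

# Index 0 is a placeholder so that month number 1 maps to Jan.
my @months = ('', qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec));

# Convert a "month-day-year" string with a two-digit year (for example
# "12-6-08") into the date string used in the <pubDate> element above.
sub pub_date {
    my ($date) = @_;
    my ($month, $day, $year) = split /-/, $date;
    # %02d pads a single-digit day with a leading zero.
    return sprintf '%02d %s %d 08:00 PST', $day, $months[$month], $year + 2000;
}

print pub_date('12-6-08'), "\n";   # prints: 06 Dec 2008 08:00 PST
```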

2:19 am on Dec 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A couple of updates to the above code: This forum changes the solid bar character (|) to a broken bar character (¦), so you'll need to change that in the code above.

Next, as I discovered in another thread, it's not necessary to strip out the HTML tags. Instead you can just encode them. And while you're at it, you can encode other unsafe characters:

$descrip =~ s/&!(amp;)/&amp;/g;
$descrip =~ s/'/&apos;/g;
$descrip =~ s/"/&quot;/g;
$descrip =~ s/</&lt;/g;
$descrip =~ s/>/&gt;/g;

[edited by: MichaelBluejay at 2:24 am (utc) on Dec. 17, 2008]

[edited by: phranque at 6:22 am (utc) on Dec. 17, 2008]
[edit reason] disabled graphic smileys ;) [/edit]

7:39 am on Dec 17, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



!(amp;)

while it would be handy, that isn't a valid regexp as far as i know.
you can non-match single characters, even from a class list, but not a string.

- i would suggest unencoding the string before re-encoding.
- perhaps you can use "if ($descrip !~ /&amp;/)" to exclude that case but it would be trickier to do global substitutions with multiple ampersands in the string that way.
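
For what it's worth, Perl does accept the pattern &!(amp;), but not with the intended meaning: the ! is an ordinary character and (amp;) is a capture group, so it only matches the literal text "&!amp;". A small sketch with a made-up input string:

```perl
use strict;
use warnings;

# Made-up input: one bare ampersand and one literal "&!amp;".
my $text = 'fish & chips, a literal &!amp; here';

# Copy $text into $changed, then run the substitution on the copy.
(my $changed = $text) =~ s/&!(amp;)/&amp;/g;

print "$changed\n";   # prints: fish & chips, a literal &amp; here
```

The bare ampersand is left alone, so the substitution does not do the job it was meant for.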

4:39 pm on Dec 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Okay, I looked it up. The correct code is:

$descrip =~ s/&(?!amp;)/&amp;/g;

I tested it, too. Works fine.
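
Put together with the other substitutions, the escaping could be wrapped in a small sub. This is only a sketch: the name escape_xml and the sample string are made up. The ampersand step has to run first so that the entities produced by the later steps are not escaped again.

```perl
use strict;
use warnings;

# Escape characters that are unsafe in XML text. The negative lookahead
# (?!amp;) skips ampersands that already start an "&amp;" entity.
sub escape_xml {
    my ($text) = @_;
    $text =~ s/&(?!amp;)/&amp;/g;   # must run before the other steps
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/'/&apos;/g;
    return $text;
}

print escape_xml(q{Tom & Jerry &amp; <b>"friends"</b>}), "\n";
# prints: Tom &amp; Jerry &amp; &lt;b&gt;&quot;friends&quot;&lt;/b&gt;
```

Note that the lookahead only protects "&amp;"; input that already contains other entities, such as "&lt;", would still have its ampersand escaped.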

[edited by: phranque at 10:17 pm (utc) on Dec. 17, 2008]
[edit reason] disabled graphic smileys ;) [/edit]

5:39 am on Dec 18, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I was asked to post a link to a reference for the syntax I just used, so here it is:

Perl.com man page on regular expressions [perl.com]

You can also see this by typing "man perlre" from the command line.

7:54 am on Dec 18, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



thanks for pointing out that syntax, MB!
it reminds me to learn more about and use the perl extensions to regular expressions.
 
