Forum Moderators: coopster & jatar k & phranque

RSS from Perl script

Creating an RSS file from Perl

   
11:27 am on Oct 30, 2008 (gmt 0)

5+ Year Member



Hello again, this is my second topic.

I am working on an RSS feed, generated from Perl, which takes all of its data from a log stored elsewhere.

This is the script:


foreach my $species ($tree->find_by_tag_name('log')){

    my $revision = $species->find_by_tag_name('logentry')->attr_get_i('revision');
    my $action = $species->find_by_tag_name('path')->attr_get_i('action');
    my $actionContent = $species->find_by_tag_name('path')->as_text;
    my $editBy = $species->find_by_tag_name('author')->as_text;
    my $dateMod = $species->find_by_tag_name('date')->as_text;
    my $msg = $species->find_by_tag_name('msg')->as_text;

    foreach my $usee ($use->find_by_tag_name('log')){

        my $revisionCompare = $usee->find_by_tag_name('logentry')->attr_get_i('revision');
        if($revisionCompare > 0 ){
            open my $logfile, ">", $file or die "Failed to open rss file, error: $!";
            print $logfile $info."\n";
            print $revision."\n".$action."\n".$actionContent."\n".$editBy."\n".$dateMod."\n".$msg;
        }

    }
}

All this does is read the info into variables, so how can I now create an RSS file from them?

I hope somebody can help me.

greetings.

9:06 am on Nov 5, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



you should look at the XML::RSS perl module on cpan [search.cpan.org].
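
A minimal sketch of what that could look like with XML::RSS, assuming the module has been installed from CPAN. The channel details, the @entries list and the link format are made up for illustration; in the script above the values would come from $revision, $editBy, $msg and so on.

```perl
use strict;
use warnings;
use XML::RSS;   # CPAN module, not part of core Perl

# Hypothetical entries standing in for the values parsed from the log.
my @entries = (
    { revision => 42, author => 'alice', msg => 'Fixed typo in README' },
    { revision => 43, author => 'bob',   msg => 'Added feed & docs' },
);

my $rss = XML::RSS->new(version => '2.0');
$rss->channel(
    title       => 'Repository log',
    link        => 'http://example.com/',
    description => 'Latest revisions',
);

for my $e (@entries) {
    $rss->add_item(
        title       => "Revision $e->{revision} by $e->{author}",
        link        => "http://example.com/rev/$e->{revision}",
        description => $e->{msg},
    );
}

# By default XML::RSS encodes special characters such as & in the values.
my $xml = $rss->as_string;
print $xml;
# $rss->save('rss.xml');   # or write the feed straight to a file
```

The module takes care of the XML boilerplate and the escaping, so the script only has to supply the data.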

5:21 am on Nov 15, 2008 (gmt 0)

5+ Year Member



Hi,

If you use foreach, Perl evaluates the whole list first and then loops over it. You can use while instead of foreach here.
When reading from a filehandle, while processes the input line by line.
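
To illustrate the difference, here is a small sketch that reads the same input both ways. It reads from an in-memory string (a stand-in for a real log file) so that it runs on its own; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $log = "first\nsecond\nthird\n";   # stand-in for a real log file

# foreach: <$fh> is evaluated in list context, so all lines are read up front.
open(my $fh, '<', \$log) or die "Cannot open in-memory file: $!";
my @seen_foreach;
foreach my $line (<$fh>) {
    chomp $line;
    push @seen_foreach, $line;
}
close($fh);

# while: <$fh> is evaluated in scalar context, one line per iteration.
open($fh, '<', \$log) or die "Cannot open in-memory file: $!";
my @seen_while;
while (my $line = <$fh>) {
    chomp $line;
    push @seen_while, $line;
}
close($fh);

print "foreach: @seen_foreach\n";
print "while:   @seen_while\n";
```

Both loops see the same lines; the difference mainly matters for memory use on large inputs.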

11:08 am on Dec 16, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Funny, I just wrote a script to do this last week. Once you've got your variables into an array, just build an output string, and write it to a file.

# @data (tab-separated lines: date as M-D-YY, title, link, description)
# and @months (month names indexed by month number) are assumed to be
# defined earlier in the script.
my $output = q[<?xml version="1.0" ?>
<rss version="2.0">
<channel>
<title>My Site Updates</title>
<link>http://example.com/</link>
<description>Blah blah blah blah blah.</description>
];

for my $counter (1..15) {
    my ($date,$title,$link,$descrip) = split("\t",$data[$counter]);
    $descrip =~ s|<a href[^>]*>||g; # strip out opening <a> tags
    $descrip =~ s|</a>||g; # strip out closing </a> tags
    my ($month,$day,$year) = split('-',$date);
    if (length($day)==1) {$day="0$day";}
    $year+=2000; # two-digit year to four digits
    $output.= qq[
<item>
<title>$title</title>
<link>http://example.com$link</link>
<description>$descrip</description>
<pubDate>$day $months[$month] $year 08:00 PST</pubDate>
</item>\n];
}
$output .= '</channel></rss>';

# Write the finished feed to disk.
open(my $fh, '>', 'rss.xml') or die "Cannot open rss.xml: $!";
print $fh $output;
close($fh);
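
RSS 2.0 expects pubDate in RFC 822 format. As an alternative to assembling the date by hand, here is a small sketch that formats an epoch timestamp using only core Perl. The helper name rfc822_date is made up for this example, and it always reports GMT rather than PST.

```perl
use strict;
use warnings;

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical helper: format an epoch timestamp as an RFC 822 date in GMT.
sub rfc822_date {
    my ($epoch) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) = gmtime($epoch);
    return sprintf('%s, %02d %s %04d %02d:%02d:%02d GMT',
        $DAYS[$wday], $mday, $MONTHS[$mon], $year + 1900, $hour, $min, $sec);
}

print rfc822_date(0), "\n";      # the Unix epoch
print rfc822_date(time), "\n";   # the current time
```

The English day and month names are hard-coded on purpose, so the result does not depend on the system locale.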

2:19 am on Dec 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A couple of updates to the above code: This forum changes the solid bar character (|) to a broken bar character (¦), so if the code above shows broken bars, you'll need to change them back to solid bars.

Next, as I discovered in another thread, it's not necessary to strip out the HTML tags. Instead you can just encode them. And while you're at it, you can encode other unsafe characters:

$descrip =~ s/&!(amp;)/&amp;/g;
$descrip =~ s/'/&apos;/g;
$descrip =~ s/"/&quot;/g;
$descrip =~ s/</&lt;/g;
$descrip =~ s/>/&gt;/g;

[edited by: MichaelBluejay at 2:24 am (utc) on Dec. 17, 2008]

[edited by: phranque at 6:22 am (utc) on Dec. 17, 2008]
[edit reason] disabled graphic smileys ;) [/edit]

7:39 am on Dec 17, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



!(amp;)

while it would be handy, that isn't a valid regexp as far as i know.
you can non-match single characters, even from a class list, but not a string.

- i would suggest unencoding the string before re-encoding.
- perhaps you can use "if ($descrip !~ /&amp;/)" to exclude that case but it would be trickier to do global substitutions with multiple ampersands in the string that way.
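
The "unencode first, then re-encode" idea could be sketched like this, using only the five predefined XML entities. The helper names xml_decode and xml_encode are made up for the example, and the sketch assumes the input is either raw text or encoded at most once.

```perl
use strict;
use warnings;

# Hypothetical helpers covering only the five predefined XML entities.
sub xml_decode {
    my ($s) = @_;
    $s =~ s/&lt;/</g;
    $s =~ s/&gt;/>/g;
    $s =~ s/&quot;/"/g;
    $s =~ s/&apos;/'/g;
    $s =~ s/&amp;/&/g;    # decode & last so "&amp;lt;" becomes "&lt;", not "<"
    return $s;
}

sub xml_encode {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;    # encode & first so the entities added below stay intact
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    $s =~ s/'/&apos;/g;
    return $s;
}

my $descrip = 'Fish &amp; chips & "peas" <b>now</b>';
my $clean   = xml_encode(xml_decode($descrip));
print "$clean\n";
```

Because everything is decoded before it is encoded again, an ampersand that was already written as an entity and a bare one end up the same way, without needing any lookahead in the regexp.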

4:39 pm on Dec 17, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Okay, I looked it up. The correct code is:

$descrip =~ s/&(?!amp;)/&amp;/g;

I tested it, too. Works fine.
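
For reference, (?!amp;) is a negative lookahead: the & matches only when it is not immediately followed by amp;. Here is a quick self-contained check, plus a variant that also leaves the other predefined entities alone. The variant is an assumed extension for illustration, not part of the code above; with the one-liner as posted, an existing &lt; would turn into &amp;lt;.

```perl
use strict;
use warnings;

# The substitution as posted: encode & unless it already starts "&amp;".
my $descrip = 'Tom & Jerry &amp; friends';
(my $fixed = $descrip) =~ s/&(?!amp;)/&amp;/g;
print "$fixed\n";

# Assumed extension: also skip the other predefined XML entities.
my $mixed = 'a < b & c &lt; d &amp; e';
(my $safe = $mixed) =~ s/&(?!(?:amp|lt|gt|quot|apos);)/&amp;/g;
print "$safe\n";
```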

[edited by: phranque at 10:17 pm (utc) on Dec. 17, 2008]
[edit reason] disabled graphic smileys ;) [/edit]

5:39 am on Dec 18, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I was asked to post a link to a reference for the syntax I just used, so here it is:

Perl.com man page on regular expressions [perl.com]

You can also see this by typing "man perlre" from the command line.

7:54 am on Dec 18, 2008 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



thanks for pointing out that syntax, MB!
it reminds me to learn more about and use the perl extensions to regular expressions.