But I can't get it to load all the variables on the line.
The input lines look like this:
<weblog name="A young Mennonite" url="http://aym.example.net/" rss="http://aym.example.net/index.xml" when="11491637" xfn="me" />
The xfn="..." portion is not always present.
I attempted it with this (but obviously failed ;-)
my($f_nick,$f_url) = ($1||'', $3||'') if ($pml =~ /<weblog name="(.*?)" url="(.*?)" rss="(.*?)" when="(.*?)"(?: xfn="(.*?)") \/>/s);
David Engel
[edited by: coopster at 4:40 pm (utc) on Sep. 14, 2004]
[edit reason] generalized urls [/edit]
Make the xfn group optional by putting a ? after it:
(?: xfn="(.*?)")?
You could also snarf all of the tag attributes into a hash. That way you can count on getting the values you need regardless of the order the attributes happen to appear in the XML.
my %attribs = ($pml =~ m~([a-z]+)="([^"]+)"~g);
my($f_nick,$f_url) = @attribs{qw(name rss)};
use Data::Dumper; # just for fun
print Dumper(\%attribs);
$VAR1 = {
'when' => '11491637',
'url' => 'http://aym.example.net/',
'name' => 'A young Mennonite',
'xfn' => 'me',
'rss' => 'http://aym.example.net/index.xml'
};
If it's important for not-present attributes to be the empty string (you had $foo = ($1||'')), you could use
my($f_nick, $f_url) = map { defined $_ ? $_ : '' } @attribs{qw(name rss)};
instead.
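For completeness, here is a minimal sketch of the single-regex approach with the xfn group made optional, run against one line that has xfn and one that doesn't. The second sample line, the @results array and the $f_xfn variable are made up for illustration; the rest follows the code from the question.

```perl
use strict;
use warnings;

# Sample input modelled on the line in the question; the second line omits xfn.
my @lines = (
    '<weblog name="A young Mennonite" url="http://aym.example.net/" rss="http://aym.example.net/index.xml" when="11491637" xfn="me" />',
    '<weblog name="A young Mennonite" url="http://aym.example.net/" rss="http://aym.example.net/index.xml" when="11491637" />',
);

my @results;    # illustrative: collects [nick, url, xfn] per matched line
for my $pml (@lines) {
    # The ? after the non-capturing group makes the whole xfn="..." part optional.
    if ($pml =~ /<weblog name="(.*?)" url="(.*?)" rss="(.*?)" when="(.*?)"(?: xfn="(.*?)")? \/>/s) {
        # $5 is undef when xfn is absent, so fall back to the empty string.
        my ($f_nick, $f_url, $f_xfn) = ($1, $3, defined $5 ? $5 : '');
        push @results, [$f_nick, $f_url, $f_xfn];
        print "$f_nick | $f_url | xfn='$f_xfn'\n";
    }
}
```

Note that this still depends on the attributes arriving in exactly this order, which is the weakness the hash version avoids.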
If you're writing a script to parse multiple feeds, then parsers like XML::Anything in Perl or the built-in parser in PHP will break if a feed sends you something non-standard. This is not uncommon, and it's not only ignorance that causes it: you can make XML take up a lot less processing time and bandwidth by stripping out some of the standard requirements.
If you're writing a script to parse a single feed, custom parse code will be far more efficient than a parser module, assuming you write it well of course :-)
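As a rough idea of what custom parse code for a single feed might look like, here is a small sketch built on the attribute-hash trick from earlier in the thread. The parse_weblog_line helper, the @feed sample data and the @weblogs array are hypothetical names for illustration, and it assumes one <weblog ... /> tag per line with double-quoted attribute values.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref of attributes for a line holding
# a <weblog ... /> tag, or undef if the line is not a weblog tag.
sub parse_weblog_line {
    my ($line) = @_;
    return undef unless $line =~ /<weblog\s+(.*?)\s*\/>/s;
    # In list context, /g with two captures returns a flat key/value list.
    my %attribs = ($1 =~ /([a-z]+)="([^"]*)"/g);
    return \%attribs;
}

# Made-up feed lines: one with xfn, one without, one that is not a weblog tag.
my @feed = (
    '<weblog name="A young Mennonite" url="http://aym.example.net/" rss="http://aym.example.net/index.xml" when="11491637" xfn="me" />',
    '<weblog name="A young Mennonite" url="http://aym.example.net/" rss="http://aym.example.net/index.xml" when="11491637" />',
    '<!-- not a weblog line -->',
);

# Keep only the lines that parsed as weblog tags.
my @weblogs = grep { defined } map { parse_weblog_line($_) } @feed;

for my $w (@weblogs) {
    # Missing attributes become the empty string.
    my ($f_nick, $f_url, $f_xfn) = map { defined $_ ? $_ : '' } @{$w}{qw(name rss xfn)};
    print "$f_nick | $f_url | xfn='$f_xfn'\n";
}
```

This only copes with the one tag shape shown in the question. Anything fancier (single-quoted values, attributes spread over several lines, entities) would need more work or a real parser.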
David Engel