Free software that extracts urls from txt file?

         

mig1234

1:41 pm on Sep 22, 2004 (gmt 0)

10+ Year Member



I need to extract URLs from a text file. The file contains assorted text with URLs scattered throughout it.

Can you recommend free software that can do this?

bcolflesh

1:44 pm on Sep 22, 2004 (gmt 0)

moltar

2:03 pm on Sep 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From the Perl Cookbook (Recipe 20.3):

[perl]
#!/usr/bin/perl
use strict;
use warnings;

use HTML::LinkExtor;    # parses HTML, so the input must be an HTML file

my $FILENAME = 'file.html';
my $base_url;    # undefined here; set it to resolve relative links to absolute URLs

my $parser = HTML::LinkExtor->new(undef, $base_url);
$parser->parse_file($FILENAME);
my @links = $parser->links;
foreach my $linkarray (@links) {
    my @element  = @$linkarray;
    my $elt_type = shift @element;    # element type, e.g. 'a' or 'img'

    # possibly test whether this is an element we're interested in
    while (@element) {
        # extract the next attribute and its value
        my ($attr_name, $attr_value) = splice(@element, 0, 2);
        if ($elt_type eq 'a' && $attr_name eq 'href') {
            print "ANCHOR: $attr_value\n";
        }
    }
}
[/perl]
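
Note that HTML::LinkExtor expects HTML input. For a plain text file with URLs scattered in it, as in the original question, a regular expression may be closer to what is needed. The following is only a minimal sketch and is not from the Cookbook: the extract_urls name and the pattern are assumptions for illustration, and the pattern is deliberately simple, so it can miss unusual URLs or cut off ones that end in punctuation.

[perl]
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every http://, https:// or ftp:// URL
# found in a string of plain text.
sub extract_urls {
    my ($text) = @_;
    my @urls;

    # A scheme, then everything up to whitespace, an angle bracket or a quote.
    while ($text =~ m{\b((?:https?|ftp)://[^\s<>"']+)}gi) {
        my $url = $1;
        $url =~ s/[.,;:!?)\]]+$//;    # drop trailing sentence punctuation
        push @urls, $url;
    }
    return @urls;
}

# Print each distinct URL once, in order of first appearance,
# for every file named on the command line.
my %seen;
for my $filename (@ARGV) {
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
    while (my $line = <$fh>) {
        print "$_\n" for grep { !$seen{$_}++ } extract_urls($line);
    }
    close($fh);
}
[/perl]

Saved under a name such as extract_urls.pl (the file name is arbitrary), it would be run as: perl extract_urls.pl file.txt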

mig1234

2:29 pm on Sep 22, 2004 (gmt 0)

10+ Year Member



I don't know how to use this code. Can you give me a link to ready-made software or a script?

bcolflesh

2:31 pm on Sep 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How about:

Offline Extractor
[spadixbd.com...]

mig1234

3:01 pm on Sep 22, 2004 (gmt 0)

10+ Year Member



Thanks bcolflesh!

This software works just the way I need. Only one small drawback: it's not free.

bcolflesh

3:03 pm on Sep 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Try this free one as well:

[focalmedia.net...]