Links Checker Help

Linking, Links, Reciprocal linking, perl, cgi

         

mrfori

6:20 am on Aug 8, 2003 (gmt 0)

10+ Year Member



Hi, I'm using this Perl script to check whether my exchanged links are still in place.

The script works, but even with only 25 links it takes quite a long time.

Any suggestions on how I can speed it up or optimize it?

Thanks a lot!

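#!/usr/bin/perl -w
# use CGI::Carp qw(fatalsToBrowser);
use strict;
use CGI qw/:all/;
use LWP::UserAgent;
use HTML::LinkExtor;

print "Content-type: text/plain\n\n";

# linkurl = partner page to fetch; url = my own URL that should appear on it
my $url       = param("linkurl");
my $returnurl = param("url");
unless ($url && $returnurl) {
    print "Missing parameters";
    exit(0);
}

# Strip the scheme once, instead of on every callback
$returnurl =~ s{^http://}{};

my $ua = LWP::UserAgent->new;
$ua->timeout(10);   # arbitrary choice; the LWP default is 180 seconds

sub callback {   # Callback routine: executed every time a link is found
    my ($tag, %attr) = @_;
    return unless defined $attr{href};   # e.g. <img> tags carry src, not href
    # \Q...\E keeps dots and slashes in the URL from acting as regex metacharacters
    if ($attr{href} =~ /\Q$returnurl\E/) {
        # ********************* Link Found htm ****************************
        print "Link Found";
        print "</body></html>";

        exit(0);
    }
}

# Make the parser
my $p = HTML::LinkExtor->new(\&callback);

# Request document and parse it as it arrives
my $request = HTTP::Request->new('GET', $url);
$ua->request($request, sub { $p->parse($_[0]) });

print "Link Not Found";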

Storyteller

2:59 am on Aug 10, 2003 (gmt 0)

10+ Year Member



The time a parsing operation like this takes is negligible compared to the time needed to pull a web page. You should batch-check your links without re-querying the page with this script for each link, and possibly get a faster connection.
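One possible reading of the batch-check suggestion, as a rough sketch rather than a definitive implementation: check every partner page in a single run that reuses one user agent, instead of starting the script once per link. The names has_link and check_all, the command-line usage, and the 10-second timeout are assumptions made for illustration; none of them come from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;

# Return 1 if $html contains a link whose href mentions $target, else 0.
# (has_link is a hypothetical helper name.)
sub has_link {
    my ($html, $target) = @_;
    (my $needle = $target) =~ s{^https?://}{};   # compare without the scheme
    my $found = 0;
    my $p = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        $found = 1 if defined $attr{href} && index($attr{href}, $needle) >= 0;
    });
    $p->parse($html);
    $p->eof;
    return $found;
}

# Check every partner page in one run, reusing a single user agent.
# (check_all is a hypothetical helper name; the timeout is an arbitrary choice.)
sub check_all {
    my ($my_url, @partners) = @_;
    my $ua = LWP::UserAgent->new(timeout => 10);
    my %result;
    for my $page (@partners) {
        my $res = $ua->get($page);
        $result{$page} = !$res->is_success ? 'Fetch Failed'
                       : has_link($res->decoded_content // '', $my_url) ? 'Link Found'
                       : 'Link Not Found';
    }
    return %result;
}

# Assumed usage: check_links.pl MY_URL PARTNER_PAGE [PARTNER_PAGE ...]
if (@ARGV > 1) {
    my %result = check_all(@ARGV);
    print "$_: $result{$_}\n" for sort keys %result;
}
```

The pages are still fetched one after another here, so the total time remains roughly the sum of the individual download times; what this saves is the repeated script start-up.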

mrfori

2:43 am on Aug 11, 2003 (gmt 0)

10+ Year Member



I call the script 10 or 25 times (depending on how many links there are) using include_once(my_script_name).

Is there a better way of doing it?

Josk

11:32 am on Aug 12, 2003 (gmt 0)

10+ Year Member



Have you tried using more than one process?

mrfori

3:08 pm on Aug 12, 2003 (gmt 0)

10+ Year Member



How do you do that?

Josk

3:20 pm on Aug 13, 2003 (gmt 0)

10+ Year Member



See perlfaq8, or the Camel book.
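For the "more than one process" idea, a minimal sketch using fork and wait, assuming a Unix-like system: each item gets its own child process, and the child reports success or failure through its exit status. The name run_parallel and the demo job at the bottom are hypothetical and only meant to show the shape of the approach.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ();

# Run $job->($item) for every item, each in its own child process.
# Returns a hash of item => child exit status (0 means the job returned true).
# (run_parallel is a hypothetical helper name.)
sub run_parallel {
    my ($job, @items) = @_;
    my %item_for_pid;
    for my $item (@items) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: do the work, then leave immediately without running
            # END blocks or flushing output buffers copied from the parent.
            POSIX::_exit($job->($item) ? 0 : 1);
        }
        $item_for_pid{$pid} = $item;   # Parent: remember which child has which item
    }
    my %status;
    while ((my $pid = wait()) != -1) {   # wait() returns -1 once no children remain
        next unless exists $item_for_pid{$pid};
        $status{ $item_for_pid{$pid} } = $? >> 8;
    }
    return %status;
}

# Assumed usage: pass page URLs on the command line.
# The demo job only tests whether each page can be fetched at all.
if (@ARGV) {
    require LWP::UserAgent;
    my %status = run_parallel(
        sub { LWP::UserAgent->new(timeout => 10)->get($_[0])->is_success },
        @ARGV,
    );
    print "$_: ", ($status{$_} == 0 ? "ok" : "failed"), "\n" for sort keys %status;
}
```

With 25 links this starts 25 children at once, which is usually fine at that scale; for much longer lists it would be worth capping the number of children that run at the same time.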