Do 4 things at 1

What should I use? Threads?

Phoog

9:17 pm on Dec 29, 2005 (gmt 0)

Hello, I have a problem.
My Perl program is working just fine, but I want to speed it up. The slow part is fetching some external web pages; I need to fetch about 3-5 pages every time.

So basically I want to do this:
[perl code]
[fetch the pages at the same time]
[more perl code]

Today I fetch them one at a time...

What should I use to speed it up? I have understood that threads are not a good option.

bennymack

12:38 am on Dec 30, 2005 (gmt 0)

I believe the latest buzzword for asynchronous programming in the Perl world is POE.

The learning curve is fairly steep, but in the long run it's the best option. If you decide to use fork, your program will eventually become very complex and probably harder to manage than if you had used POE in the first place.

Here are some useful examples:
[poe.perl.org...]

VectorJ

3:08 am on Dec 31, 2005 (gmt 0)

In my experience, the most stable way to do this is by forking a process for each webpage and having them report to a single daemon via domain sockets. It sounds much more complicated than it is; all the info you need to achieve this is in the Camel Book.

I don't know how well it would work if the processes are run based on user input on a webpage. My guess is that it wouldn't work well.
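
For readers who want to see the shape of that approach, here is a minimal sketch, not VectorJ's actual code. It assumes the "daemon" is simply the parent process listening on a Unix domain socket, that each child sends back one tab-separated line, and that results contain no tabs or newlines. The names collect_reports, $worker and collector.sock are made up for illustration.

```perl
use strict;
use warnings;
use IO::Socket::UNIX;
use Socket qw(SOCK_STREAM);
use File::Temp qw(tempdir);
use POSIX ();

# collect_reports(\@jobs, $worker): fork one child per job; each child runs
# $worker->($job) and reports "job<TAB>result" over a Unix domain socket.
# Returns a hash ref of job => result. (Hypothetical helper, for illustration.)
sub collect_reports {
    my ($jobs, $worker) = @_;
    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/collector.sock";

    my $server = IO::Socket::UNIX->new(
        Type   => SOCK_STREAM,
        Local  => $path,
        Listen => scalar(@$jobs) || 1,
    ) or die "cannot listen on $path: $!";

    my @pids;
    for my $job (@$jobs) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # child: connect, send one line, then leave without running
            # the parent's END blocks or re-flushing its buffers
            my $ok = eval {
                close $server;
                my $sock = IO::Socket::UNIX->new(Type => SOCK_STREAM, Peer => $path)
                    or die "cannot connect: $!";
                print {$sock} "$job\t", scalar $worker->($job), "\n";
                close $sock;
            };
            POSIX::_exit($ok ? 0 : 1);
        }
        push @pids, $pid;
    }

    # parent: accept one connection per child and read its single line
    my %report;
    for (1 .. @$jobs) {
        my $conn = $server->accept or die "accept failed: $!";
        my $line = <$conn>;
        close $conn;
        next unless defined $line;
        chomp $line;
        my ($job, $result) = split /\t/, $line, 2;
        $report{$job} = $result;
    }

    waitpid $_, 0 for @pids;    # reap every child
    close $server;
    unlink $path;
    return \%report;
}

my $report = collect_reports([qw(alpha beta)], sub { length $_[0] });
print "$_ => $report->{$_}\n" for sort keys %$report;
```

One caveat with this sketch: if a child dies before it connects, the parent blocks in accept, so real code would want a timeout around that loop.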

VectorJ

3:10 am on Dec 31, 2005 (gmt 0)

Additionally, I second bennymack about POE. I've heard that POE is the way to go. The only reason I didn't mention it is that I don't have any experience with POE, so I can't say one way or the other.

Phoog

1:40 pm on Jan 15, 2006 (gmt 0)

"In my experience, the most stable way to do this is by forking a process for each webpage and having them report to a single daemon via domain sockets. It sounds much more complicated than it is; all the info you need to achieve this is in the Camel Book."

Thanks for the help!

But the only info about forking I can find doesn't match what I want to do.

Do you know of any guide or example that covers this? I can't be the only one trying to do this, and the fork manual doesn't give much info about it.

Thanks again...

bennymack

7:10 pm on Jan 15, 2006 (gmt 0)

I guess I'd need some more info before I could suggest a more suitable solution.

First of all, how is the output of the page fetches handled? I assume you're going to need to have access to it in the parent process.

Here's an example that forks several processes and then the parent reads the child output:


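use strict;
use warnings;
use IO::Select;

my $io = IO::Select->new();

for (1 .. 5) {
    pipe my $read, my $write or die "pipe failed: $!";

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if ($pid) {
        # parent: keep only the read end and watch it for output
        close $write;
        $io->add($read);
        next;
    }
    else {
        # child: keep only the write end, simulate some work, report back
        close $read;
        my $time = int rand(5);
        sleep $time;
        print $write "$$ forked and slept $time\n";
        print $write "done\n";
        close $write;
        exit;
    }
}

# parent: print the first line from each child as soon as it is readable
# (the trailing "done" line is not read here)
while (my @ready = $io->can_read) {
    for my $fh (@ready) {
        print scalar(<$fh>);
        $io->remove($fh);
    }
}

# reap the children so none are left behind as zombies
1 while wait != -1;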

There are some other important concepts to learn, like how to correctly use wait and waitpid to reap child processes.
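
As a small illustration of reaping, here is a sketch that forks a few children and collects each one's exit status with waitpid. The helper name exit_statuses is made up for this example, and the children do no real work; they just exit with a given code.

```perl
use strict;
use warnings;
use POSIX ();

# exit_statuses(@codes): fork one child per code, let each child exit with
# that code, and return the statuses the parent collects, in the same order.
# (Hypothetical helper, for illustration only.)
sub exit_statuses {
    my @codes = @_;
    my @pids;
    for my $code (@codes) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        POSIX::_exit($code) if $pid == 0;    # child: stands in for real work
        push @pids, $pid;
    }

    my @statuses;
    for my $pid (@pids) {
        # blocks until this particular child has exited, then reaps it
        waitpid($pid, 0) == $pid or die "waitpid failed for $pid: $!";
        push @statuses, $? >> 8;             # high byte of $? is the exit code
    }
    return @statuses;
}

print join(',', exit_statuses(0, 3, 7)), "\n";
```

Calling waitpid with a specific pid reaps only that child, whereas wait reaps whichever child exits first. If the parent must not block, waitpid(-1, WNOHANG) with WNOHANG imported from POSIX returns immediately when no child has finished yet.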

Phoog

8:46 am on Jan 16, 2006 (gmt 0)

Thanks bennymack!
Will look into that code later.

And as you assume, I need to continue handling the data in the parent. I just want the pages to be fetched at the same time; once they are fetched, I need to continue working with the data. The output from the pages is saved in a hash.

Thanks
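
Putting the thread's pieces together, the following is a rough sketch of how the pipe-and-fork example could be adapted so the parent ends up with each page's content in a hash. It is an illustration under assumptions, not tested against Phoog's program: fetch_all and its arguments are made-up names, and the fetcher is passed in as a code reference so the demo below can use a stand-in instead of real network access.

```perl
use strict;
use warnings;
use IO::Select;
use POSIX ();

# fetch_all(\%urls, $fetch): run $fetch->($url) in one child per entry and
# return a hash ref mapping each key of %urls to what its child produced.
# (Hypothetical helper; $fetch is whatever code actually downloads a page.)
sub fetch_all {
    my ($urls, $fetch) = @_;
    my $select = IO::Select->new;
    my (%name_for, %content, @pids);

    for my $name (sort keys %$urls) {
        pipe(my $read, my $write) or die "pipe failed: $!";
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # child: fetch one page, write it to the pipe, leave at once
            close $read;
            my $ok = eval {
                my $body = $fetch->($urls->{$name});
                print {$write} $body;
                close $write;
            };
            POSIX::_exit($ok ? 0 : 1);
        }

        # parent: remember which page this pipe belongs to
        close $write;
        push @pids, $pid;
        $name_for{ fileno $read } = $name;
        $content{$name} = '';
        $select->add($read);
    }

    # Drain every pipe as data arrives; end-of-file means that child is done.
    while ($select->count) {
        for my $fh ($select->can_read) {
            my $n = sysread $fh, my $chunk, 65536;
            if ($n) {
                $content{ $name_for{ fileno $fh } } .= $chunk;
                next;
            }
            $select->remove($fh);
            close $fh;
        }
    }

    waitpid $_, 0 for @pids;    # reap the children
    return \%content;
}

# Demo with a stand-in fetcher that needs no network access.
my $pages = fetch_all(
    { first => 'http://example.com/1', second => 'http://example.com/2' },
    sub { "content of $_[0]" },
);
print "$_: $pages->{$_}\n" for sort keys %$pages;
```

For real downloads the stand-in could be replaced with something like sub { HTTP::Tiny->new->get($_[0])->{content} }, using the core HTTP::Tiny module. In this sketch a child that fails simply leaves an empty string in the hash, so real code would probably want to check each child's exit status as well.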