Creating a web spider - What language? Perl, Python, PHP?

     
5:08 pm on Jul 14, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:July 29, 2004
posts:80
votes: 0


I am creating a web spider to gather statistical data from the web pages of vendors who don't have a data feed.

The gathered data will go into a MySQL database.

Am I better off going with Perl, Python, or PHP?

Custodian

5:18 pm on July 14, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 30, 2003
posts:428
votes: 0


My preference is Perl with WWW::Mechanize, HTML::TreeBuilder, and HTML::TokeParser. Between O'Reilly's Spidering Hacks and the Perl & LWP book, you have all you need to know.

Sean
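For readers weighing the languages, here is a minimal sketch of the streaming, token-by-token style of parsing that HTML::TokeParser offers, written in Python with the standard library's `html.parser` rather than the Perl modules named above. The `CellExtractor` class, the `extract_rows` helper, and the assumption that the vendor data sits in plain `<tr>`/`<td>` table rows are all hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text of every <td> cell, one list per <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <td>.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in this row


def extract_rows(html):
    """Return the table cells found in an HTML string as a list of rows."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Each row comes back as a list of strings, which would map naturally onto the columns of a database insert.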

5:21 pm on July 14, 2005 (gmt 0)

Administrator

WebmasterWorld Administrator jatar_k is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:July 24, 2001
posts:15755
votes: 0


If you know all three languages, I definitely agree with Sean. If you have more experience in one of them, it may just be easier to write it in that one.

5:43 pm on July 14, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:July 29, 2004
posts:80
votes: 0


Yesterday I bought both O'Reilly's Spidering Hacks and the Perl & LWP book, hoping they would help get me going in the right direction.

Hopefully I'm off to a good start.

I have had some experience in both PHP and Perl, but have never looked at Python. I've just heard that it has some powerful web crawling capabilities.

Thanks for the input.
I'd appreciate others' experience as well.

Custodian

6:05 pm on July 14, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 18, 2003
posts:1925
votes: 0


Perl can do a lot for ya! If the job is not extremely complex, I suggest looking at a wget and Perl combination. wget already has a good crawler built in, and it can feed the data through a pipe to a Perl script. You can use the Perl script just for parsing (what it's best at). Everyone is happy :). Building a good crawler is a lot of work in any language.
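The fetch-then-parse split described above can be sketched as follows, with the parsing side written in Python instead of Perl purely for illustration. The script reads whatever HTML arrives on standard input and prints the link targets it finds. The `LinkCollector` class, the `collect_links` helper, and the choice of extracting `href` values are assumptions, not part of the original suggestion.

```python
import sys
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Record the href of every <a> tag seen in the input."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def collect_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Only read stdin when something is actually piped in (for example from wget).
if __name__ == "__main__" and not sys.stdin.isatty():
    for link in collect_links(sys.stdin.read()):
        print(link)
```

Saved under a hypothetical name such as parse_links.py, it could be fed a single page with `wget -q -O - http://www.example.com/ | python parse_links.py`, where `-q` silences wget's progress output and `-O -` writes the fetched document to standard output.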
 
