Any advice on spiders?

SlowMove

10:45 pm on Jul 26, 2004 (gmt 0)

I'm finding out just how easy it is to use Perl and LWP. There seem to be all sorts of useful applications for it. If I'm not using a lot of bandwidth, and I'm just fetching a few pages from a site each day, do I need to use code like $browser->agent(whatever) to pretend to be a browser, and wait a couple of seconds between fetching pages? Does anyone using spiders know what the pitfalls are, and have any general advice?

macrost

11:16 pm on Jul 26, 2004 (gmt 0)

First and foremost, respect the robots.txt file if the site has one. That would be my main concern if I were building a bot.
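For what it's worth, here is a rough sketch of one way to do that with LWP::RobotUA, which ships with libwww-perl. It fetches and obeys robots.txt for you and waits between requests to the same host. The bot name, the contact address, and the two-second delay are placeholders I made up for illustration, not recommendations, and make_robot is just a hypothetical helper name.

```perl
use strict;
use warnings;
use LWP::RobotUA;

# Build a polite user agent.
# 'ExampleBot/0.1' and the e-mail address are hypothetical; use your own.
sub make_robot {
    my $ua = LWP::RobotUA->new('ExampleBot/0.1', 'webmaster@example.com');
    $ua->delay(2 / 60);    # delay() is in minutes, so this is about 2 seconds
    return $ua;
}

# Fetch each URL given on the command line, for example:
#   perl bot.pl http://www.example.com/index.html
my $ua = make_robot();
for my $url (@ARGV) {
    # robots.txt for the host is fetched and checked before the page itself;
    # a disallowed URL comes back as an error response instead of being fetched.
    my $response = $ua->get($url);
    print "$url: ", $response->status_line, "\n";
}
```

This identifies the bot honestly instead of pretending to be a browser, which gives site owners something to match in robots.txt and a contact address if it misbehaves.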