Spider Script

Froggyman

5:19 am on May 14, 2001 (gmt 0)



I always wanted my own WWW search engine and finally got one set up :). I don't like the spider that came with the software, which only indexes the META tags of the submitted URL. I would like to replace it with a better spider that can follow hyperlinks within the pages it crawls, and even deep crawl if possible. I'm looking for any script suggestions (preferably free). Some of the old university scripts seem like they would work great, but I'm having a difficult time finding them due to outdated links. Any suggestions?
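
For reference, here is a rough Python sketch of the behavior described above: read the META tags of a page, then follow its hyperlinks breadth-first up to a depth limit. It is only an illustration, not how any of the scripts mentioned in this thread work. The names `PageParser`, `crawl`, `fetch`, and `max_depth` are made up for the example, and `fetch` is assumed to be any function that returns a page's HTML for a URL (or None on failure).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class PageParser(HTMLParser):
    """Collects META name/content pairs and link targets from one page."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])


def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl that follows links up to max_depth hops.

    fetch is a hypothetical callable: URL -> HTML string, or None on failure.
    Returns {url: meta_dict} for every page that was fetched.
    """
    index = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        index[url] = parser.meta
        if depth == max_depth:
            continue  # index this page but do not follow its links
        for href in parser.links:
            # Resolve relative links and drop #fragments before queueing.
            target, _ = urldefrag(urljoin(url, href))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return index
```

With `max_depth=0` this behaves like a spider that only reads the submitted URL; raising the limit gives the deeper crawl. A real spider would also need to honor robots.txt and pause between requests, which this sketch leaves out.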

littleman

5:48 am on May 14, 2001 (gmt 0)



What script is it?
You may want to look at FDSE; it has a very good bot. It is shareware, and it costs $40 to remove the FDSE branding. Maybe Toolman could jump in and tell us about the bot in Perlfect Search, which is open source.

Froggyman

6:20 am on May 14, 2001 (gmt 0)



Thanks Littleman, Perlfect [perlfect.com] sounds perfect. Hmmm, it can even index PDF files. Thanks, better than I expected. :)

Froggyman

6:53 am on May 14, 2001 (gmt 0)



Actually, now that I look at it more closely, it seems like Perlfect would be a great indexing tool for a single website, but I have doubts about its ability to crawl the web. Any thoughts?
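
One way to picture the single-site versus web-wide distinction is as a scope check on each link a spider discovers. The Python sketch below is an assumption about how such a check could look, not a description of Perlfect or any other script; `in_scope` and `same_host_only` are hypothetical names.

```python
from urllib.parse import urlsplit


def in_scope(url, start_url, same_host_only=True):
    """Decide whether a discovered link should be queued for crawling."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False  # skip mailto:, ftp:, javascript: and similar links
    if not same_host_only:
        return True  # web-wide crawl: any HTTP(S) link is fair game
    # Single-site indexing: stay on the host of the starting URL.
    return parts.hostname == urlsplit(start_url).hostname
```

A site indexer would keep `same_host_only=True`; a web spider would pass `False`, and then has to deal with a much larger queue of pages to visit.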