I'm looking for a script that:
- spiders a few pages for new links and follows those links 2 levels deep
- saves some of the content of those pages (I need just a few chunks of text that can be easily parsed) into some format
- compares the saved file with an existing MySQL database.
I think PHP is the best language for this script, but are there existing scripts that already do this? If not, can anyone help me develop one?
Turbohost
I haven't seen anything ready-made that will do this, but the Snoopy PHP class [snoopy.sourceforge.net...] would be a good place to start this project.
It simulates a web browser and has a method for fetching links.
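To give an idea of the "follow links 2 levels deep" part, here is a rough sketch in plain PHP. It does not use Snoopy; it pulls links out with the built-in DOMDocument class instead. The function names `extractLinks` and `crawl` and the `$fetch` callback are made up for this example, and relative links are simply skipped rather than resolved, so treat it as a starting point rather than a finished spider.

```php
<?php
// Hypothetical helper: return the unique absolute http(s) links in an HTML string.
function extractLinks(string $html): array
{
    if ($html === '') {
        return []; // loadHTML() rejects empty input on PHP 8
    }
    $doc = new DOMDocument();
    // Real-world markup is rarely well-formed, so parser warnings are suppressed.
    @$doc->loadHTML($html);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        // Assumption for this sketch: relative URLs are skipped, not resolved.
        if (preg_match('#^https?://#i', $href)) {
            $links[$href] = true; // array keys de-duplicate
        }
    }
    return array_keys($links);
}

// Hypothetical breadth-first crawl. $fetch is any callable that takes a URL
// and returns the page HTML as a string, or false on failure.
// Start pages are depth 0; links are followed until $maxDepth is reached.
function crawl(array $startUrls, int $maxDepth, callable $fetch): array
{
    $pages = [];                                // url => html
    $seen  = array_fill_keys($startUrls, true); // never queue a URL twice
    $queue = [];
    foreach ($startUrls as $url) {
        $queue[] = [$url, 0];
    }

    while ($queue) {
        [$url, $depth] = array_shift($queue);
        $html = $fetch($url);
        if ($html === false) {
            continue; // fetch failed, move on
        }
        $pages[$url] = $html;

        if ($depth >= $maxDepth) {
            continue; // deep enough, do not follow this page's links
        }
        foreach (extractLinks($html) as $link) {
            if (!isset($seen[$link])) {
                $seen[$link] = true;
                $queue[] = [$link, $depth + 1];
            }
        }
    }
    return $pages;
}
```

You would call it with something like `crawl($startUrls, 2, function ($url) { return @file_get_contents($url); })`, which needs `allow_url_fopen` enabled. The returned array maps each URL to its HTML, so you can parse out the chunks you need and compare them against your database afterwards. If you go with Snoopy, its `fetchlinks()` method could take the place of `extractLinks()` plus the fetch callback.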
Hope this helps