I have a bit of an interesting project. I need to create a PHP web crawler that will automatically log in to a site and grab content from its web pages.
I have created crawlers before, but I have never used them to access password-protected pages. The pages are protected by an HTML login form and served over SSL (HTTPS).
Has anyone created something like this? The main stumbling block for me is getting past the login form. I don't know how to write a script that logs in automatically so it can access the pages. Once logged in, it would need to follow links.
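From what I have read so far, the usual approach seems to be cURL with a cookie file: fetch the login page, post the form fields back with the credentials, and reuse the same cookie file for every later request so the session carries over. Here is a rough sketch of what I imagine, which I have not run against the real site. The function names (extractFormFields, extractLinks, fetchPage), the example URL and the username/password field names are all placeholders I made up; the real form may use different field names or need extra steps.

```php
<?php
// Hypothetical helper: collect every named <input> on the page (including
// hidden ones such as CSRF tokens) so they can be posted back with the login.
// Simplification: it does not single out one <form>, it takes all inputs.
function extractFormFields(string $html): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from imperfect real-world HTML
    $fields = [];
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Hypothetical helper: return the unique href values of all <a> tags.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html);
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if ($href !== '') {
            $links[] = $href;
        }
    }
    return array_values(array_unique($links));
}

// Hypothetical helper: fetch a URL with cURL, reading and writing one cookie
// file so the session survives across requests. Pass $post to submit a form.
function fetchPage(string $url, string $cookieFile, ?array $post = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,        // follow the post-login redirect
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies are loaded from here
        CURLOPT_SSL_VERIFYPEER => true,        // keep certificate checks on
    ]);
    if ($post !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post));
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    // The handle is released when this function returns, which is when
    // cURL writes the received cookies to the cookie file.
    return $body;
}

// Usage sketch (placeholder URL and field names, left commented out so the
// file loads without touching the network):
// $cookies   = tempnam(sys_get_temp_dir(), 'crawl');
// $loginUrl  = 'https://distributor.example/login';
// $fields    = extractFormFields(fetchPage($loginUrl, $cookies));
// $fields['username'] = 'my-user';
// $fields['password'] = 'my-pass';
// $home = fetchPage($loginUrl, $cookies, $fields);
// foreach (extractLinks($home) as $link) {
//     // resolve relative links against the site URL, then fetchPage() each
// }
```

Is that roughly the right idea, or am I missing something about how the login session has to be handled?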
The use of this script is completely legit. It is for a reseller looking to grab product information from their distributor. Considering there are thousands of products, they want to automate the process. Unfortunately, the distributor doesn't offer any type of feed.
Can anyone give me some advice on how to get it to work?