Forum Library, Charter, Moderators: coopster & jatar k

PHP Server Side Scripting Forum

    
PHP Crawler Questions
Access content that is user & password protected
wfernley




msg:3776282
 10:22 pm on Oct 29, 2008 (gmt 0)

Hey everyone,

I have a bit of an interesting project. I need to create a PHP web crawler that will automatically log in to a site and grab content from its pages.

I have created crawlers before, but I have never used them to access password-protected pages. The pages are protected by an HTML login form and served over SSL.

Has anyone created something like this? The main stumbling block for me is getting past the login form. I don't know how to make the script log in automatically so it can access the pages. Once logged in, it would need to follow links.

The use of this script is completely legit. It is for a reseller looking to grab product information from their distributor. Considering there are thousands of products, they want to automate the process. Unfortunately, the distributor doesn't offer any type of feed.

Can anyone give me some advice on how to get it to work?

Thanks in advance! :)

Wes

 

MattAU




msg:3776340
 12:17 am on Oct 30, 2008 (gmt 0)

Check out [php.net...]

Then search Google and you'll find heaps of examples.
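
To make the login step concrete, here is a minimal sketch using PHP's cURL extension. It is only an illustration under stated assumptions: the URLs and the form field names (`username`, `password`) are hypothetical placeholders, so check the distributor's actual login form for the real action URL and input names (and any hidden fields). The idea is to POST the form once, let cURL keep the session cookie, then request the protected page on the same handle.

```php
<?php
// Hypothetical URLs; replace with the real login form action and target page.
const LOGIN_URL = 'https://distributor.example.com/login';
const PAGE_URL  = 'https://distributor.example.com/products';

/**
 * Build the cURL options for submitting the login form.
 * $fields maps form input names to values (the names used here are assumptions).
 */
function loginOptions(string $loginUrl, array $fields, string $cookieFile): array
{
    return [
        CURLOPT_URL            => $loginUrl,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow the redirect most login forms send
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies here when the handle is freed
        CURLOPT_COOKIEFILE     => $cookieFile, // enables cURL's cookie engine
        CURLOPT_SSL_VERIFYPEER => true,        // keep certificate checks on for HTTPS
        CURLOPT_TIMEOUT        => 30,
    ];
}

/**
 * Log in, then fetch one protected page with the same handle so the
 * session cookie set by the login response is sent along.
 */
function loginAndFetch(string $loginUrl, array $fields, string $pageUrl): string
{
    $cookieFile = tempnam(sys_get_temp_dir(), 'crawl');
    $ch = curl_init();

    curl_setopt_array($ch, loginOptions($loginUrl, $fields, $cookieFile));
    if (curl_exec($ch) === false) {
        throw new RuntimeException('Login request failed: ' . curl_error($ch));
    }

    // Switch the handle back to GET and point it at the protected page.
    curl_setopt_array($ch, [CURLOPT_URL => $pageUrl, CURLOPT_HTTPGET => true]);
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('Page request failed: ' . curl_error($ch));
    }
    return $html;
}

// Example call (not executed here; the credentials are placeholders):
// $html = loginAndFetch(LOGIN_URL, ['username' => 'reseller', 'password' => 'secret'], PAGE_URL);
```

A successful HTTP response does not prove the login worked, so it is worth checking the returned HTML for something only logged-in users see before crawling further.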

wfernley




msg:3776345
 12:35 am on Oct 30, 2008 (gmt 0)

Sounds good! Thanks for the link. I'm assuming I will incorporate cURL into AJAX?
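
A note on the AJAX question: cURL runs inside the PHP script on the server, while AJAX runs in a visitor's browser, so a crawler like this should not need AJAX at all. For the "follow links" part, one possible approach is to parse each fetched page with PHP's DOM extension and collect the links that stay on the same host. The sketch below is an assumption-laden illustration, not a complete crawler: the function name `extractLinks` is made up for this example, and it only handles absolute and root-relative links.

```php
<?php
/**
 * Collect unique same-host links from an HTML page.
 * Handles absolute (http/https) and root-relative ("/path") hrefs only;
 * other forms (relative paths, mailto:, javascript:) are skipped in this sketch.
 */
function extractLinks(string $html, string $baseUrl): array
{
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world markup is often invalid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $base   = parse_url($baseUrl);
    $origin = $base['scheme'] . '://' . $base['host'];
    $links  = [];

    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));
        if ($href === '' || $href[0] === '#') {
            continue; // no target, or an in-page fragment
        }
        if (preg_match('#^https?://#i', $href)) {
            $url = $href;
        } elseif ($href[0] === '/' && substr($href, 0, 2) !== '//') {
            $url = $origin . $href; // root-relative link
        } else {
            continue;
        }
        // Stay on the site being crawled; array keys de-duplicate the URLs.
        if (parse_url($url, PHP_URL_HOST) === $base['host']) {
            $links[$url] = true;
        }
    }
    return array_keys($links);
}
```

Each returned URL can then be fetched with the same logged-in cURL handle, while a list of already-visited URLs keeps the crawler from looping.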


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved