Curl Scraping Problem

     
10:24 am on Aug 24, 2010 (gmt 0)



Hello,

My business relies heavily on cURL to pull information from websites (not scraping in the sense of MFA websites).

The problem is that cURL returns the raw page source, but I only want what the user actually sees, i.e. the page after JavaScript has run and everything has rendered.

Is this possible? If anyone could point me in the right direction I'd really appreciate it.

Thanks
10:57 am on Aug 24, 2010 (gmt 0)



cURL is not the right tool for rendering JavaScript: it only downloads the response and never executes any scripts.
You'll need to automate a browser so that it opens the page and renders the entire DOM.

seleniumhq.org
watir.com
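
To make the browser-automation idea concrete, here is a minimal sketch in Python using Selenium's WebDriver interface. It is only one way to do it, under some assumptions: the helper names (fetch_rendered_html, fetch_with_firefox) are made up for illustration, the third-party selenium package and a local Firefox are assumed to be installed, and waiting for document.readyState is a rough stand-in for "the page has finished rendering".

```python
import time


def fetch_rendered_html(driver, url, timeout=10.0, poll=0.25):
    """Open url in an automated browser and return the page as serialized HTML.

    `driver` is any object with WebDriver-style get(), execute_script() and
    page_source members, for example selenium.webdriver.Firefox().
    """
    driver.get(url)
    deadline = time.monotonic() + timeout
    # Poll until the browser reports the document as fully loaded,
    # or give up waiting once the timeout has passed.
    while time.monotonic() < deadline:
        if driver.execute_script("return document.readyState") == "complete":
            break
        time.sleep(poll)
    # page_source is the driver's serialization of the page as it stands now,
    # so changes made by scripts so far are normally included.
    return driver.page_source


def fetch_with_firefox(url):
    """Hypothetical convenience wrapper; needs selenium and Firefox installed."""
    # Imported lazily so the helper above has no hard dependency on selenium.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        return fetch_rendered_html(driver, url)
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling fetch_with_firefox("http://example.com/") should hand back the HTML after scripts have run. Note that readyState only covers the initial load; content fetched later via AJAX may need an extra wait for a specific element.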

You could also use the Crowbar proxy.
It renders the page in a Gecko-based browser and sends the rendered DOM back to you.

simile.mit.edu/wiki/Crowbar
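
A rough sketch of how a script might ask a locally running Crowbar instance for a rendered page. Treat the details as assumptions to check against the Crowbar docs: the address http://127.0.0.1:10000/ and the url and delay query parameters are from memory, and the function names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# ASSUMPTION: address of a Crowbar server started on this machine.
DEFAULT_CROWBAR = "http://127.0.0.1:10000/"


def crowbar_request_url(target_url, delay_ms=3000, crowbar=DEFAULT_CROWBAR):
    """Build the request URL asking the proxy to render target_url.

    ASSUMPTION: the proxy reads 'url' (page to render) and 'delay'
    (milliseconds to wait for scripts) from the query string.
    """
    query = urlencode({"url": target_url, "delay": delay_ms})
    return crowbar + "?" + query


def fetch_via_crowbar(target_url, delay_ms=3000, crowbar=DEFAULT_CROWBAR, timeout=60):
    """Return the rendered DOM the proxy sends back, as text."""
    request_url = crowbar_request_url(target_url, delay_ms, crowbar)
    with urlopen(request_url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

The nice part of this approach is that the scraping side stays a plain HTTP client, so existing cURL-based code would only need to point at the proxy instead of the target site.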

Hope this helps
8:32 am on Aug 25, 2010 (gmt 0)



Perfect, Lostdreamer, thank you!

Tom
 
