Scraping AJAX based sites?

   
2:58 am on May 2, 2006 (gmt 0)

10+ Year Member



Hi all,

I'm trying to scrape results from an AJAX-based site with cURL (or another tool). Is this even possible? I don't have a clue how to interact with it!

Thanks!

10:07 am on May 2, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hi erikcw,

this is possible. ajax is based on http, so you can send the same requests from php that your browser sends on that ajax site. the 'only' thing you need to do is analyze the requests your browser sends to the server and pick out the right one.

this whole process is essentially network analysis, and you will need a tool like a sniffer or a proxy for it.

if you do not want to go that way, you can read the source code of the site and look into its javascript to find out which request is fired there. that's the other way round. for this method you need to be experienced with javascript and able to read other people's source.
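As a rough illustration of the source-reading approach, the sketch below scans JavaScript text for literal XMLHttpRequest open() calls and lists the method and URL of each. It is written in Python rather than PHP purely for brevity. The sample script, its function names, and its URLs are hypothetical, and the regular expression only catches URLs written as plain string literals, so requests assembled dynamically or hidden behind a library wrapper would still need to be traced by hand.

```python
import re

# Hypothetical JavaScript, standing in for a script downloaded from the target site.
SAMPLE_JS = """
function loadResults(q) {
    var xhr = new XMLHttpRequest();
    xhr.open("POST", "/search/results.php", true);
    xhr.send("q=" + encodeURIComponent(q));
}
function loadMore() {
    xhr2.open('GET', '/search/more.php?page=2');
}
"""

# Matches .open("METHOD", "URL" ...) with either quote style.
XHR_OPEN = re.compile(r"""\.open\(\s*["'](\w+)["']\s*,\s*["']([^"']+)["']""")


def find_xhr_calls(js_source):
    """Return (method, url) pairs for every literal .open() call found."""
    return [(method.upper(), url) for method, url in XHR_OPEN.findall(js_source)]


if __name__ == "__main__":
    for method, url in find_xhr_calls(SAMPLE_JS):
        print(method, url)
```

Each pair it reports is only a candidate; the parameters sent along with the request still have to be read out of the surrounding code.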

these are the two methods i would try first; maybe some other, easier approaches are available in the javascript forum.

for the php part: use curl or even fopen() to request the data.
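Once the right request is identified, replaying it is an ordinary HTTP call. The sketch below shows the idea in Python rather than PHP, using only the standard library; the same steps map onto curl options in PHP. The endpoint URL, the parameter name `q`, and the User-Agent string are placeholders, not values from any real site, and the `X-Requested-With` header is an assumption: many JavaScript libraries send it, but whether a given server checks it has to be confirmed from the captured request.

```python
import urllib.parse
import urllib.request


def build_ajax_request(url, params, referer=None):
    """Build a POST request that resembles a browser's XMLHttpRequest call."""
    body = urllib.parse.urlencode(params).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        # Assumption: sent by many JavaScript libraries; some servers check for it.
        "X-Requested-With": "XMLHttpRequest",
        "User-Agent": "Mozilla/5.0 (compatible; example-scraper)",
    }
    if referer:
        headers["Referer"] = referer
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def fetch_body(request, timeout=10):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical endpoint and parameter: replace with what you observed.
    req = build_ajax_request("https://example.com/search/results.php", {"q": "widgets"})
    print(req.get_method(), req.full_url, req.data)
```

Calling `fetch_body(req)` performs the actual request. What comes back may be XML, JSON, or an HTML fragment depending on the site, so parsing is left to the caller.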

--hakre

9:18 pm on May 2, 2006 (gmt 0)

10+ Year Member



The site I'm trying to scrape uses https. Is this going to prevent me from using a packet sniffer to extract the needed information?

7:59 am on May 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



you can do a man-in-the-middle attack on the ssl connection and then sniff it, but i don't know how important that is for you. i think i would analyze the source code first before starting to sniff the ssl connection, unless you have a suitable setup at hand to do so.

--hakre