Forum Moderators: coopster & jatar k

Scraping AJAX based sites?

     
2:58 am on May 2, 2006 (gmt 0)

Full Member

10+ Year Member

joined:Nov 19, 2003
posts:291
votes: 0


Hi all,

I'm trying to scrape results from an AJAX-based site with cURL (or another tool). Is this even possible? I don't have a clue how to interact with it!

Thanks!

10:07 am on May 2, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 7, 2003
posts:1230
votes: 0


hi erikcw,

this is possible. ajax is based on http, so you can send the same requests from php that your browser sends on that ajax website. the 'only' thing you need to do is analyze the requests your browser sends to the server and pick out the right one.

this whole process is essentially network analysis, and you will need a tool like a sniffer or a proxy for it.

if you do not want to go that way, you can work the other way round: read the source code of the website and look into the javascript to find out which request is fired in there. for this method you need to be experienced with javascript and able to read other people's source.

these are the two methods i would try first; maybe some other, easier approaches are known in the javascript forum.
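
as a rough starting point for reading the javascript, a regex scan can list the urls passed to XMLHttpRequest.open(). this is only a sketch: the helper name find_xhr_urls() and the sample script are made up for illustration, and it assumes the url is written as a plain string literal. urls that are built by string concatenation or hidden inside a library will not show up, so you still have to read the code.

```php
<?php
// Hypothetical helper: list the method/URL pairs passed to .open() calls
// in a chunk of JavaScript source. A rough regex scan, not a JS parser.
function find_xhr_urls(string $js): array
{
    // Matches .open("METHOD", "url" ...) with single or double quotes.
    $pattern = '/\.open\(\s*["\'](GET|POST)["\']\s*,\s*["\']([^"\']+)["\']/i';
    preg_match_all($pattern, $js, $matches, PREG_SET_ORDER);

    $found = [];
    foreach ($matches as $m) {
        $found[] = ['method' => strtoupper($m[1]), 'url' => $m[2]];
    }
    return $found;
}

// Made-up sample script, only to show the call.
$sample = 'var xhr = new XMLHttpRequest(); '
        . 'xhr.open("GET", "/search.php?q=test", true); xhr.send(null);';
print_r(find_xhr_urls($sample));
```

every hit is a candidate request that you can then try to replay from php.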

for the php part: use curl or even fopen() to request the data.
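
a minimal sketch of the curl variant, under some assumptions: the function names build_ajax_options() and fetch_ajax() are made up here, the url in the example is a placeholder, and the X-Requested-With header is only something many javascript libraries send along, so it may or may not matter for the site you are after. copy whatever headers, cookies and parameters you actually see in the browser's request.

```php
<?php
// Hypothetical helper: build the curl options for replaying an ajax request.
function build_ajax_options(string $url, array $postFields = []): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects like a browser would
        // Assumption: the server expects the header many ajax libraries add.
        CURLOPT_HTTPHEADER     => ['X-Requested-With: XMLHttpRequest'],
    ];
    if ($postFields) {
        // Send the fields as a normal url-encoded form post.
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields);
    }
    return $options;
}

// Hypothetical helper: run the request and return the raw response body.
function fetch_ajax(string $url, array $postFields = []): string
{
    $ch = curl_init();
    curl_setopt_array($ch, build_ajax_options($url, $postFields));
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $body;
}

// Example call; the url is a placeholder, not the site from this thread.
// echo fetch_ajax('https://example.com/search.php?q=test');
```

the response is often xml, json or an html fragment rather than a full page, so the parsing step depends on what the site returns.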

--hakre

9:18 pm on May 2, 2006 (gmt 0)

Full Member

10+ Year Member

joined:Nov 19, 2003
posts:291
votes: 0


The site I'm trying to scrape uses https. Is this going to prevent me from using a packet sniffer to extract the needed information?

7:59 am on May 4, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Jan 7, 2003
posts:1230
votes: 0


https encrypts the traffic, so a plain packet sniffer will only show you ciphertext. you can do a man-in-the-middle attack on the ssl connection and then sniff it, but i don't know how important that is for you. i think i would analyze the source code first here before starting to sniff the ssl connection, unless you have a suitable setup at hand to do so.

--hakre