
PHP Server Side Scripting Forum

Scraping AJAX based sites?

 2:58 am on May 2, 2006 (gmt 0)

Hi all,

I'm trying to scrape results from an AJAX-based site with cURL (or another tool). Is this even possible? I don't have a clue how to interact with it!




 10:07 am on May 2, 2006 (gmt 0)

hi erikcw,

this is possible. ajax runs over http, so you can send the same requests from php that your browser sends on that ajax website. the 'only' thing you need to do is analyze the requests your browser sends to the server and pick out the right one.
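
As a rough sketch of replaying such a request from PHP with cURL: the endpoint URL, the parameter names and the exact header set below are assumptions for illustration only; the real values come from whatever request you observe the browser sending.

```php
<?php
// Headers a browser-side XMLHttpRequest commonly sends. Which of them
// the target server actually checks is an assumption; copy the ones
// you see in the captured request.
function ajax_headers(string $referer): array
{
    return [
        'X-Requested-With: XMLHttpRequest',
        'Accept: application/json, text/javascript, */*',
        'Referer: ' . $referer,
    ];
}

// POST $params to $url the way the page's JavaScript would and return
// the raw response body (JSON, XML or an HTML fragment).
function fetch_ajax(string $url, array $params, string $referer): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($params),
        CURLOPT_HTTPHEADER     => ajax_headers($referer),
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 30,
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $body;
}

// Hypothetical usage (example.com and the "q" parameter are placeholders):
// $raw = fetch_ajax('https://example.com/ajax/search.php', ['q' => 'widgets'], 'https://example.com/');
```

The response is usually not a full HTML page, so it still has to be parsed according to whatever format the endpoint returns.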

this whole process is essentially network analysis, and you will need a tool like a packet sniffer or a proxy for it.

if you do not want to go that way, you can analyze the source code of the website and look into the javascript to find out which request is fired in there. that's the other way round. for this method you need to be experienced with javascript and able to read other people's source.
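
A small helper can make a first pass over the javascript source. The sketch below assumes the site calls XMLHttpRequest's open() with a literal method and URL string; find_ajax_urls() is a made-up name, and URLs assembled through a framework or through string concatenation will be missed or only partly captured.

```php
<?php
// Scan JavaScript source for calls like xhr.open("GET", "/path", true)
// and return the method/URL pairs found. Only literal string arguments
// are matched; for a URL built by concatenation, only the leading
// literal part is captured.
function find_ajax_urls(string $js): array
{
    $pattern = '/\.open\(\s*["\'](GET|POST)["\']\s*,\s*["\']([^"\']+)["\']/i';
    preg_match_all($pattern, $js, $matches, PREG_SET_ORDER);

    $found = [];
    foreach ($matches as $m) {
        $found[] = ['method' => strtoupper($m[1]), 'url' => $m[2]];
    }
    return $found;
}

// Hypothetical usage on a script file saved from the site:
// print_r(find_ajax_urls(file_get_contents('site-script.js')));
```

This only narrows down where to look; you still have to read the surrounding code to see which parameters get attached to the request.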

these are the two methods i would try first; maybe some other, easier approaches are available in the javascript forum.

for the php part: use curl or even fopen() to request the data.
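
For the fopen() route, a minimal sketch using an http stream context might look like this. It assumes allow_url_fopen is enabled (and the openssl extension is loaded if the URL is https); the function names, URL and parameters are placeholders.

```php
<?php
// Build a stream context that makes fopen() send a form-encoded POST
// with the header many AJAX endpoints look for.
function ajax_post_context(array $params)
{
    return stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n"
                       . "X-Requested-With: XMLHttpRequest\r\n",
            'content' => http_build_query($params),
            'timeout' => 30,
        ],
    ]);
}

// Open the URL with that context and read the whole response body.
function fetch_with_fopen(string $url, array $params): string
{
    $handle = fopen($url, 'r', false, ajax_post_context($params));
    if ($handle === false) {
        throw new RuntimeException('Could not open ' . $url);
    }
    $body = stream_get_contents($handle);
    fclose($handle);
    if ($body === false) {
        throw new RuntimeException('Could not read from ' . $url);
    }
    return $body;
}

// Hypothetical usage:
// $raw = fetch_with_fopen('https://example.com/ajax/search.php', ['q' => 'widgets']);
```

cURL gives more control over cookies and redirects, so the stream version is mainly useful when the curl extension is not available.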



 9:18 pm on May 2, 2006 (gmt 0)

The site I'm trying to scrape uses HTTPS. Is this going to prevent me from using a packet sniffer to extract the needed information?


 7:59 am on May 4, 2006 (gmt 0)

you can do a man-in-the-middle attack on the ssl connection and then sniff it, but i don't know how important that is for you. i think i would analyze the source code first before starting to sniff the ssl connection, unless you have a suitable setup at hand to do so.


WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved