Forum Moderators: coopster


possible to use PHP to scrape data automatically?

         

dulldull

4:00 pm on Jun 26, 2008 (gmt 0)

10+ Year Member



I'm trying to do some data mining on a set of XML data. The data source gets over 200,000 entries every day and is updated every second, but its API only allows public users to access 500 entries per request. So I plan to write a small script that keeps my server synchronized with the data server every minute.

As I'm only familiar with PHP, I'm wondering whether it's possible to make a PHP script run automatically every minute?

RonPK

4:05 pm on Jun 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's probably easiest to have your server call the script every minute. On Linux and similar systems the magic word is crontab; on Windows the equivalent is Scheduled Tasks (the Task Scheduler).
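
To make that a bit more concrete, here is a minimal sketch of a script that cron could run once a minute. Everything specific in it is an assumption for illustration, not something from the actual API: the feed URL passed on the command line, the `<entry id="...">` element layout, the JSON file used as a local store, and the function names `mergeEntries` and `runSync`. The lock file is there so that a slow run is not overlapped by the next one.

```php
<?php
// sync.php: intended to be started by a crontab line such as
//   * * * * * /usr/bin/php /path/to/sync.php https://example.com/feed.xml
// (the paths and the URL are placeholders).

// Merge one XML batch into $store, keyed by entry id, so that entries
// already fetched in an earlier run are not stored twice.
// Assumes a hypothetical layout: <feed><entry id="...">text</entry></feed>
function mergeEntries(array $store, string $xml): array
{
    libxml_use_internal_errors(true); // collect parse errors quietly
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return $store; // unparsable response: keep what we already have
    }
    foreach ($doc->entry as $entry) {
        $id = (string) $entry['id'];
        if ($id !== '') {
            $store[$id] = (string) $entry;
        }
    }
    return $store;
}

// Run one sync pass. Returns false if the lock could not be taken,
// which usually means the previous run is still busy.
function runSync(string $feedUrl, string $storeFile): bool
{
    $lock = fopen($storeFile . '.lock', 'c');
    if ($lock === false) {
        return false;
    }
    if (!flock($lock, LOCK_EX | LOCK_NB)) {
        fclose($lock);
        return false;
    }
    $store = is_file($storeFile)
        ? (json_decode(file_get_contents($storeFile), true) ?: [])
        : [];
    $xml = @file_get_contents($feedUrl);
    if ($xml !== false) {
        $store = mergeEntries($store, $xml);
        file_put_contents($storeFile, json_encode($store));
    }
    flock($lock, LOCK_UN);
    fclose($lock);
    return true;
}

// Only do work when started from the command line with a feed URL.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    runSync($argv[1], __DIR__ . '/entries.json');
}
```

With 200,000 entries a day a real database would likely be a better store than a JSON file; the file just keeps the sketch self-contained.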

dulldull

8:11 pm on Jun 26, 2008 (gmt 0)

10+ Year Member



Thanks a lot RonPK, that's really useful!

janharders

8:25 pm on Jun 26, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you cannot set up cron jobs on your server, there are services that let you define schedules on which they request your script's URL and thereby invoke it at the specified times. An example (and I'm sure there are English-speaking ones as well) is <snip>
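
If the script is triggered over HTTP like this, anyone who finds the URL can trigger it too. A rough sketch of one way to guard against that, assuming the service can add a query parameter to the URL it requests: the `token` parameter, the `SYNC_TOKEN` environment variable, and the function name `isAuthorizedTrigger` are all made up for this example.

```php
<?php
// Returns true only when the request carries the expected secret token.
// hash_equals() compares the strings in constant time.
function isAuthorizedTrigger(array $query, string $secret): bool
{
    $token = $query['token'] ?? '';
    return is_string($token) && $secret !== '' && hash_equals($secret, $token);
}

// When served over HTTP, refuse any request without the right token.
if (PHP_SAPI !== 'cli') {
    if (!isAuthorizedTrigger($_GET, getenv('SYNC_TOKEN') ?: '')) {
        http_response_code(403);
        exit('Forbidden');
    }
    // ... run the actual sync work here ...
    echo 'OK';
}
```

The service would then be pointed at something like sync.php?token=your-secret instead of the bare script URL.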

[edited by: dreamcatcher at 10:25 pm (utc) on June 26, 2008]
[edit reason] No urls please! [/edit]

eelixduppy

6:04 am on Jun 27, 2008 (gmt 0)



Make sure that you aren't violating the Terms of Service of that company's API. Data mining in this manner, especially when you're saving the content to your own server, usually isn't in line with such policies. Check that first, before you cause trouble for yourself. You might need to get written consent before proceeding.

dulldull

7:19 pm on Jul 1, 2008 (gmt 0)

10+ Year Member



I think it's okay in my case, but they just told me that if I apply for authorization, they can let me access more entries of XML data per request. Anyway, thanks for your advice.

And I'm going to check what Snip is~ :>