Amazon AWS

dump to database

lorax

6:12 pm on Feb 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I've recently signed up and want to build an app that'll query the Amazon db via their XML API and dump the results into a MySQL db where I can do more complex queries and such. I'd like to automate the dump so that it happens every 12 hours.

Has anyone here done this yet? If so, any tips or gotchas to watch out for?

jatar_k

11:28 pm on Feb 12, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



I guess not. ;)

bird

11:41 pm on Feb 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Um, you want to dump the complete Amazon database every 12 hours? How big is your data center and how fat is your pipe? ;)

Or if you just want a subset, where are you going to draw the line?

lorax

1:48 am on Feb 13, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



>> the complete db?

Nope - I'm foolish but not that foolish. I want to pull select sections from the catalog. At the moment I'm rewriting some code that'll pull the data in XML format. I'm using the XML interface as opposed to the SOAP tools because it's lighter and more reliable - or so say the veterans on the AWS board.

I believe I'll be looking at about 2000-3000 records for what I'm after. It's all text (images are links to the Amazon site) so it should go relatively quickly, I would think.

Alternatively I may just hand-select a list of products that I want to monitor and pull those.

loke

2:44 am on Feb 14, 2004 (gmt 0)

10+ Year Member



You'll probably get a price lag, since prices are updated on an hourly basis. The rest of the Amazon database is updated daily. I guess for certain products price changes may be an issue, especially if you plan to have a remote shopping cart. Just something I read on the associates forum, so you might want to check it with an Amazon rep.

lorax

4:30 am on Feb 14, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks loke,
I did a bit of number crunching. I was hoping to pull about 20,000 records. I've calculated the average response time from when a query is sent to when I have a complete result set (10 items) to be about 3 seconds. So getting all 20K would mean 2,000 queries at 3 seconds each, or about 100 minutes. Not very practical for daily updates.
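For anyone wanting to redo that arithmetic with their own numbers, here's a rough sketch. The function name and defaults are just mine for illustration; the 10 items per query and 3 seconds per query are the figures measured above, and it assumes queries run one after another.

```python
import math


def estimate_dump_minutes(total_records, items_per_query=10, seconds_per_query=3.0):
    """Estimate how long a sequential dump takes, in minutes.

    Assumes one query at a time, each returning up to items_per_query
    records and taking seconds_per_query on average.
    """
    queries = math.ceil(total_records / items_per_query)
    return queries * seconds_per_query / 60


print(estimate_dump_minutes(20000))  # 20K records -> 100.0 minutes
print(estimate_dump_minutes(3000))   # the smaller 3,000-record subset -> 15.0 minutes
```

So the 2000-3000 record subset would fit comfortably in a 12-hour cycle, while the 20K pull is where it starts to hurt.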

However, I've been thinking that perhaps I should do the initial dump just once so I'll have the basic data (author, manufacturer, description, etc.) in a db and can work with it as I wish - because, really, all I'm after is more flexibility in finding and displaying products. Once I determine which products need to be shown, I can post a live query to Amazon and get the updated data for just those products.
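Roughly what I have in mind, as a sketch only. Big assumptions: the XML element names (`Item`, `Id`, `Title`, `Author`, `Price`) are made up and are not Amazon's real schema, SQLite stands in for MySQL so the example is self-contained, and `fetch_live` is a hypothetical callable that would wrap the real live query.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up sample response; the real API's element names will differ.
SAMPLE_XML = """<Results>
  <Item><Id>A1</Id><Title>First Book</Title><Author>Ann Author</Author><Price>12.50</Price></Item>
  <Item><Id>B2</Id><Title>Second Book</Title><Author>Bob Writer</Author><Price>8.00</Price></Item>
</Results>"""


def parse_items(xml_text):
    """Turn one XML result set into a list of row dicts."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": item.findtext("Id"),
            "title": item.findtext("Title"),
            "author": item.findtext("Author"),
            "price": float(item.findtext("Price")),
        }
        for item in root.iter("Item")
    ]


def open_db(path=":memory:"):
    """Open the local store (SQLite here, standing in for MySQL)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(id TEXT PRIMARY KEY, title TEXT, author TEXT, price REAL)"
    )
    return conn


def store_items(conn, items):
    """One-time (or occasional) dump: insert rows, replacing any with the same id."""
    conn.executemany(
        "INSERT OR REPLACE INTO products (id, title, author, price) "
        "VALUES (:id, :title, :author, :price)",
        items,
    )
    conn.commit()


def refresh_price(conn, product_id, fetch_live):
    """Update one product's price from a live lookup (fetch_live is hypothetical)."""
    item = fetch_live(product_id)
    conn.execute(
        "UPDATE products SET price = ? WHERE id = ?", (item["price"], product_id)
    )
    conn.commit()


conn = open_db()
store_items(conn, parse_items(SAMPLE_XML))
# Complex searching happens locally; only the products being shown get a live refresh.
refresh_price(conn, "A1", lambda pid: {"price": 9.99})
```

The point of the split is that the slow-changing fields get stored once, and only the hourly-changing price is fetched at display time.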