dylanz - 5:40 am on Jan 15, 2009 (gmt 0)
I'm getting a huge product XML file (3 GB) from Commission Junction. My current issues are the following, and I would love any insight or suggestions on any of them:

1. The file doesn't validate against its DTD. I'm using "xmllint" to verify, and it is indeed broken. Not the end of the world, as I'm running a couple of "tr" commands to strip out all kinds of funky characters and get the file into working order.

2. Getting the data into my database. Currently, I'm using libxml's SAX parser to read the file as a stream, which keeps the memory footprint low. However, for each node, I have to do a read in my database to see if that product exists, then update that record or create a new one if it doesn't. This approach is unfortunately going to take hours (it has been running for 5+ hours already).

Any suggestions? Could I do the process differently, or speed it up in any way? Any feedback will be appreciated! Thanks!
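
For reference, the import loop described in this post looks roughly like the sketch below. It is not the actual code: it uses Python's standard `xml.sax` and `sqlite3` modules in place of the libxml bindings and the real database, and the element names (`product`, `sku`, `name`), the `products` table, and the `load_products` helper are all hypothetical stand-ins, since the real feed layout isn't shown here.

```python
import io
import sqlite3
import xml.sax


class ProductHandler(xml.sax.ContentHandler):
    """Streams <product> elements and writes each one to the database.

    Element and column names are assumptions made for illustration.
    """

    def __init__(self, conn):
        super().__init__()
        self.conn = conn
        self.current = None  # fields of the product currently being read
        self.field = None    # child element whose text is being collected
        self.text = []

    def startElement(self, name, attrs):
        if name == "product":
            self.current = {}
        elif self.current is not None:
            self.field = name
            self.text = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect and join later.
        if self.field is not None:
            self.text.append(content)

    def endElement(self, name):
        if name == "product":
            self.save(self.current)
            self.current = None
        elif name == self.field:
            self.current[name] = "".join(self.text).strip()
            self.field = None

    def save(self, product):
        # One read per node, followed by an update or an insert.
        exists = self.conn.execute(
            "SELECT 1 FROM products WHERE sku = ?", (product["sku"],)
        ).fetchone()
        if exists:
            self.conn.execute(
                "UPDATE products SET name = ? WHERE sku = ?",
                (product["name"], product["sku"]),
            )
        else:
            self.conn.execute(
                "INSERT INTO products (sku, name) VALUES (?, ?)",
                (product["sku"], product["name"]),
            )


def load_products(stream, conn):
    """Parse the feed from a byte stream without loading it into memory."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products (sku TEXT PRIMARY KEY, name TEXT)"
    )
    xml.sax.parse(stream, ProductHandler(conn))
    conn.commit()
```

The `save` method is where the time goes: every product costs one lookup plus one write, so the number of database round trips grows with the number of nodes in the file.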