I have tested it on an xml doc of 300KB and on one of 1.5MB. But the live xml doc is going to be about 16MB, and that could take a while to process.
I am using the xpath module as well, so the whole xml doc needs to be stored in memory so it can be traversed using xpath. It therefore slows quite a bit when file size increases.
To speed things up I figured it might be worth breaking it down into chunks, which would mean less stored in memory. The trouble is, though, the xml wouldn't be well formed if I didn't process the whole lot at once.
Has anyone got any suggestions on how I can speed things up? This is going to chew up a load of processing power.
Thanks
Just did another run on the 1.5mb xml doc with the original script. Going on the fact that I had time to have my dinner while it processed it (and it hadn't finished when I got back), I think I might need to use twig when I go on to the 16mb document. Either that, or another 512mb of ram and an extra couple of days to sit and watch it!
If you only need to process some of the xml doc then twig should be handy, as it means you only have to load that chunk into memory. I think the authors are trying to make it so you can navigate this "twig" of the xml doc using an xpath subset.
However, if you need to process the whole lot then I don't think it will help - which I think makes sense, because unless you have everything stored in memory the processor wouldn't know how to navigate it.
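To make the chunk-at-a-time idea a bit more concrete, here is a rough sketch. It is in Python rather than Perl and uses the standard library's `iterparse`, so it is not twig's API, just the same general approach. The `record` and `name` element names, the function name and the sample data are all made up for illustration. Each record is handled as soon as its closing tag is read and then emptied, so only one record's subtree needs to be in memory at a time.

```python
import io
import xml.etree.ElementTree as ET

def iter_record_names(source, record_tag="record"):
    """Yield the <name> text of each record, one record at a time."""
    # iterparse reads the document incrementally and reports each closing tag
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            # the record's subtree is complete here, so it can be queried
            # with ElementTree's limited path syntax
            yield elem.findtext("name")
            # empty the record so its children can be freed
            elem.clear()

# hypothetical sample data standing in for the real file
sample = io.BytesIO(
    b"<records><record><name>a</name></record>"
    b"<record><name>b</name></record></records>"
)
names = list(iter_record_names(sample))
```

This only helps when each query fits inside a single record. Anything that needs to look across records still needs more of the document in memory, which is the limitation described above.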
I guess another option would be to effectively break one xml doc down into smaller valid xml docs - that shouldn't be hard due to the structured nature of xml.
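A rough sketch of that splitting idea, again in Python rather than Perl and purely as an illustration. It assumes the document is a flat list of repeating elements directly under the root. The `record` tag, the chunk size and the function name are hypothetical. Each chunk is wrapped in a copy of the root tag so it is well formed on its own and could be loaded separately by an xpath processor.

```python
import io
import xml.etree.ElementTree as ET

def split_into_chunks(source, record_tag="record", chunk_size=2):
    """Yield well-formed XML strings, each holding up to chunk_size records."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    chunk = ET.Element(root.tag)
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            chunk.append(elem)
            if len(chunk) == chunk_size:
                yield ET.tostring(chunk, encoding="unicode")
                chunk = ET.Element(root.tag)
                root.clear()  # release the records already written out
    if len(chunk):
        # final, possibly shorter, chunk
        yield ET.tostring(chunk, encoding="unicode")

# hypothetical sample data: three records split into chunks of two
sample = io.BytesIO(
    b"<records><record><name>a</name></record>"
    b"<record><name>b</name></record>"
    b"<record><name>c</name></record></records>"
)
chunks = list(split_into_chunks(sample))
```

In practice each chunk would be written to its own file and processed separately. As with the previous point, this only works if no query needs to span two chunks.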
I'll take a closer look at twig later, so I'll let you know if first impressions were wrong.