To set the record straight on how we actually scale compared to the competition: we use human editors to guide an editorial platform. The platform uses artificial intelligence algorithms that, in turn, "learn" what is relevant and what is not relevant on a page. Once the AI has learned from a statistical training set, the system can go to each subsequent page, as if a human editor had been there, and extract the relevant data from it. The system can handle millions of unique documents without trouble, and it runs as quickly as the underlying hardware allows.
This system is quite scalable and provides the best relevancy we've seen to date. We are working very hard on relevancy, because we think it's the key to success for marketers.
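
As a rough illustration of the learn-then-extract idea described above, here is a minimal sketch in Python. It is not our production platform: the naive Bayes classifier, the `RelevanceModel` class, the `extract_relevant` helper, and the sample page blocks are all assumptions made up for this example. It only shows the general shape: editors label a few blocks of text, a model learns from those labels, and the model then filters blocks on pages no editor has looked at.

```python
import math
from collections import Counter

LABELS = ("relevant", "not_relevant")


def tokenize(text):
    # Deliberately simple: lowercase and split on whitespace.
    return text.lower().split()


class RelevanceModel:
    """Toy naive Bayes classifier over page blocks (hypothetical stand-in)."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.block_counts = Counter()

    def train(self, labeled_blocks):
        # labeled_blocks: (text, label) pairs supplied by human editors.
        for text, label in labeled_blocks:
            self.block_counts[label] += 1
            self.word_counts[label].update(tokenize(text))

    def score(self, text, label):
        # Log prior plus log likelihood with add-one (Laplace) smoothing.
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_words = sum(self.word_counts[label].values())
        total_blocks = sum(self.block_counts.values())
        log_prob = math.log(self.block_counts[label] / total_blocks)
        for word in tokenize(text):
            count = self.word_counts[label][word]
            log_prob += math.log((count + 1) / (total_words + len(vocab)))
        return log_prob

    def is_relevant(self, text):
        return self.score(text, "relevant") > self.score(text, "not_relevant")


def extract_relevant(model, page_blocks):
    # Keep only the blocks the trained model judges relevant.
    return [block for block in page_blocks if model.is_relevant(block)]


# Made-up editor labels, for illustration only.
editor_labels = [
    ("acme widget price 19.99 in stock", "relevant"),
    ("product description durable steel widget", "relevant"),
    ("home about contact login", "not_relevant"),
    ("copyright all rights reserved privacy policy", "not_relevant"),
]

model = RelevanceModel()
model.train(editor_labels)

# A page no editor has seen: one content block, one navigation block.
new_page = ["steel widget price 24.99", "login contact privacy policy"]
kept = extract_relevant(model, new_page)
print(kept)
```

In this sketch the editors' work is paid for once, at training time; after that, each new page costs only a few arithmetic operations per word, which is why throughput would be limited by hardware rather than by editor hours.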