crobb305 - 4:31 pm on May 27, 2012 (gmt 0)
"But the question is: why does Penguin run intermittently if Google are convinced it yields better results? Presumably it must be because it is expensive or protracted to run. If this is so it should give us a strong clue as to its operation. I think this strongly points to Penguin performing a recursive algorithm on linking structures rather than on content. This may explain the strange disconnect between content and ranking."
I see what you're saying. I just don't think we can read anything into the iterative nature of Penguin. Most mathematical algorithms are re-run periodically as functions, parameters, and available data change. It doesn't make sense to feed old data into the algorithm, and it takes time for Googlebot to spider the web collecting new data. I'm finding that it's taking a month or more for sitewide link removals to be discovered (the removed links are still cached and still showing in WMT). Until those data are updated, the Penguin output can't change unless they modify the code or parameters. That's probably the biggest reason for the long lulls between iterations: the arduous data-collection process itself.
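For what it's worth, the kind of recursive link-structure computation being hypothesized looks something like a PageRank-style power iteration. This is purely illustrative, not Google's actual code; the tiny graph, damping factor, and convergence threshold below are all assumptions. The point it demonstrates is the one above: the math converges in a few passes once you have the data, so the slow part is collecting a fresh link graph, not running the algorithm.

```python
def power_iteration(links, damping=0.85, tol=1e-9, max_iters=200):
    """Iterate rank scores over a link graph until they stabilize.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict of page -> rank score (scores sum to 1).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with uniform scores
    for _ in range(max_iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            if outs:
                # each page passes its rank evenly to the pages it links to
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # dangling page with no outlinks: spread its rank evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
        if max(abs(new[p] - rank[p]) for p in pages) < tol:
            rank = new
            break
        rank = new
    return rank

# Hypothetical three-page graph: a links to b and c, b to c, c back to a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = power_iteration(graph)
print({p: round(r, 3) for p, r in ranks.items()})
```

Notice that if the input graph (`graph` here) hasn't changed, re-running this produces identical output, which is exactly why a Penguin refresh is pointless until the crawl data is updated.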