fred9989 - 2:42 pm on Apr 27, 2012 (gmt 0)
I broadly agree with you: semantic analysis is clearly at work, but I think it was switched on or "turned up" before Penguin. I noticed a sudden change in Google's behaviour a few weeks ago, when searches for "how to stop x" (where x is a personal problem) started producing results full of how to avoid x, how to overcome x, how to control x, how to prevent x (and vice versa).
Nonetheless, the degree to which search results now return pages that are only indirectly related to the query, and often irrelevant to it, suggests to me that this algo has introduced much greater use of semantic analysis.
At the same time I see no clear patterns, and little opportunity to analyze what is happening, which might imply that they have introduced a whole raft of changes in the Penguin update, though I don't think that is news.
My sense is that the following factors are implicated in Penguin:
1) There is a big attack on sites with affiliate links; some EMD and small sites which have survived appear to be ones without affiliate links
2) The top ten results seem somehow different in quality and nature to the ones on page 2 and further down, leading me to wonder if they are being treated differently in the final stages of ranking... and yes, I know that doesn't quite make sense, since they obviously are, or they wouldn't be in the top 10, but I can't articulate what it is that I am sensing.
3) Most of the sites I see in the top 10 results that have blog comment spam as their backlinks are relatively new sites
4) The sites which have hundreds or thousands of spammy backlinks from blog link networks, the ones that are still in the top 10 results, seem to be older sites which have recently risen to the top of the search results
5) Keyword stuffing does not seem to have affected sites' position in the search engine rankings as far as I can see
Your observations would be welcome.