It’s not trying to use ESP... it’s more a case of:
‘Statistically, we’d expect to see X# of inferences of ‘good phrases’.’
It’s more the algo having an ‘expectation’ and a ‘threshold’ for the phrases related to a given core term. So the scraper page would have an unusually high occurrence and co-occurrence rate for the core and related phrases.
Furthermore, pages aren’t ZAPPED – a page obtains added weighting for each of the ‘expected’ page scores it satisfies. It is not about what ‘SHOULD’ be present. Once again, it is not establishing the worth of a page, simply a statistical commonality among ALL pages in that themed group within the index.
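To make the ‘expectation’ and ‘threshold’ idea a bit more concrete, here is a rough Python sketch. It is only my illustration, not how any search engine actually does it: the phrases, the expected ranges, the weight and the names `EXPECTED`, `count_phrase` and `phrase_bonus` are all made up. A page picks up added weighting for each related phrase whose count lands inside the expected range; a page that blows past the threshold is not zapped, it just doesn’t earn that weighting.

```python
# Hypothetical expected ranges (min, max occurrences) for phrases related
# to one core term. Every phrase and number here is invented for illustration.
EXPECTED = {
    "blue widgets": (1, 4),
    "widget repair": (1, 3),
    "widget parts": (1, 2),
}


def count_phrase(text, phrase):
    """Count non-overlapping, case-insensitive occurrences of a phrase."""
    return text.lower().count(phrase.lower())


def phrase_bonus(text, expected=EXPECTED, weight=1.0):
    """Add `weight` for each related phrase whose count is within its expected range.

    Counts below or above the range earn nothing, but nothing is subtracted
    either: the page is not penalized, it just misses the added weighting.
    """
    bonus = 0.0
    for phrase, (low, high) in expected.items():
        if low <= count_phrase(text, phrase) <= high:
            bonus += weight
    return bonus
```

A page that mentions each phrase once would satisfy all three expectations, while a scraper page repeating each phrase twenty times would satisfy none of them.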
Also, 2 things:
1. Existing ranking methods are still in play (in reality or theoretically).
2. Links still play a large role in the ultimate ranking in the PaIR method; this has not been accounted for in your thinking.
As I have mentioned MORE than a few times, it is simply (if at all) a layering onto the existing infrastructure and algorithmic operations. I would personally be turning the dials slowly to make the PaIR methods more prominent over the next year, but who knows on that one...
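As a rough picture of what ‘layering’ and ‘turning the dials’ could look like, here is a small Python sketch. Again, this is purely an assumption on my part: the linear blend, the `dial` parameter and the function name `layered_score` are hypothetical, and `existing_score` just stands in for whatever the current methods (links included) already produce.

```python
def layered_score(existing_score, phrase_score, dial=0.1):
    """Blend an existing ranking score with a phrase-based score.

    `dial` is a hypothetical setting between 0 and 1: 0 means the existing
    methods decide everything, 1 means only the phrase-based score counts.
    """
    if not 0.0 <= dial <= 1.0:
        raise ValueError("dial must be between 0 and 1")
    return (1.0 - dial) * existing_score + dial * phrase_score
```

With the dial near 0, a page’s ranking is still driven almost entirely by the existing signals; nudging it upward over time would make the phrase-based layer more prominent without replacing anything.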
I know my SEO history back to 1995 and have been in the web biz game since 1998 – my indexing and retrieval studies go back to the early 90s – so I certainly have a good understanding of the road traveled up to this point.
This is actually the tail end of the PaIR discussions for me… I started digging into this and writing about it last fall…. My fishin head is spinning… he he..
I am always studying/researching something; ‘Personalized Search’ technologies recently caught my eye... (Wolfie did ‘Local Search’ to death recently)...
So yes, Marcia, my name is David... and I... am an ‘Algo-holic’