For instance, eHow has pages and pages on how to boil an egg. Each has a very slightly different title - so "how to boil the perfect hard-boiled egg", "how to boil the perfect soft-boiled egg", "how to hard boil an egg", "how to boil an egg in a microwave"... It costs G time and money to index these pages...
at the end of the day, this is what every algo change is about now, in my opinion. it's got very little to do with providing better quality serps, and more to do with reducing costs for google.
imagine how much it costs google to send out spiders and read billions and billions of pages every day, and store all the different permutations. it must be astronomical. and the web is getting bigger every day. the costs just spiral up and up.
so what do google do? they make suggestions as to what webmasters should do to get better rankings... which happen to save them a packet of money too.
think about it... what suggestions have google made in the last year or so?
1) speed of pages. they "suggested" that speed might play a part in the algo... so every webmaster immediately went out and reduced their page weight and offloaded loads of stuff. there was never any evidence that speed played a part when they announced it. and even now they are only saying that it will be used as a "tiebreaker". a tiebreaker? how many different ways are there to rank pages these days? there are hundreds of different onpage and offpage factors, so exact ties must be rare.
google is not dumb enough to demote a brilliant page ten places down the serps, just because it takes half a second more to load.
so the only effect that this had was on reducing google's costs -- both in spidering the web, and storing its pages.
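to show what i mean by "tiebreaker", here is a rough sketch in python. everything in it is made up for illustration (the page names, the scores, the load times, the `rank` function). it is not how google's algo actually works, just what a tiebreaker means in principle: speed only gets looked at when two pages have exactly the same score.

```python
# hypothetical pages: (url, relevance score, load time in seconds)
# all names and numbers are invented for illustration only
pages = [
    ("brilliant-but-slow.example", 0.92, 1.5),
    ("average-and-fast.example", 0.75, 0.4),
    ("tied-and-fast.example", 0.80, 0.5),
    ("tied-and-slow.example", 0.80, 1.0),
]

def rank(pages):
    # sort by score, highest first; load time only decides exact ties
    return sorted(pages, key=lambda p: (-p[1], p[2]))

ranking = [url for url, _, _ in rank(pages)]
print(ranking)
```

the brilliant but slow page still comes out on top. load time only changes the order of the two pages with identical scores.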
2) punishing "thin" pages. apparently the new rule is this: if your site has some thin pages then your entire site will get punished (unless your "trust" is so high that you can get away with it).
but there is no sense in demoting an entire site (which might otherwise be good) because 5% of it is thin. it makes no sense. the obvious thing would be to just demote those thin pages. but why would google demote the good ones as well? that is like throwing the baby out with the bath water.
for example, lots of blogs have tag pages. it's normal. but even if the actual posts and content are the best in the world, of pulitzer prize quality, and written by william shakespeare himself, google are still "suggesting" that they will demote your entire site because 5% of it is tag pages.
that just makes no sense whatsoever. nobody can suggest otherwise.
but, lo and behold, webmasters all around the world have been panicked into binning thousands upon thousands of pages from their sites and noindexing thousands upon thousands more.
...saving google money on spidering and storing the web.
that is what it is all about. we are just doing google's bidding.
and as for duplicate content... i think google have just given up trying to work out who wrote what first. and in a way i don't blame them.
if a good site nicks something from a rubbish one, why should they have to rank the rubbish one first? that is not what users want. they'd rather visit the better site, even if the material is second hand.
if google have determined that the second site is better, what do they actually get by ranking the original writer first? nothing. zip. the only people who care are the writers themselves. but google cares about the searchers, not the writers.
and remember that they are not legally obliged to rank the original first... and it costs them money to work out who wrote it first, because they have to store snapshots of the web from way back when. in my opinion they have just washed their hands of the problem to save themselves money.
that is what every algo change is about now... ways to save google money.