aristotle - 8:46 pm on Nov 19, 2012 (gmt 0)
A couple of my sites have some old articles that haven't been touched in any way for at least five years. Since I haven't touched them, they don't have these self-referencing canonical tags. Over the years most of them have been scraped and republished at least a dozen times. Some of the copies are on blogspot.com and wordpress.com, and a look at their page source shows that they declare themselves as the canonical version, evidently because the WordPress and Blogspot software inserts the tag automatically. Despite this, Google still gives the top rankings to my pages, apparently having long ago marked them as the originals. So in this case, the false canonical tags on the scraped copies didn't trick the Google algorithm.
Most likely there are tens of thousands of old articles on the web that don't have these canonical tags, but Google apparently realizes this and takes it into account when trying to determine which pages are originals and which are scraped copies.
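For anyone who wants to repeat the page-source check without reading the HTML by hand, here is a rough Python sketch using only the standard library. It only reports whether a page has no canonical tag, one pointing at itself, or one pointing elsewhere. It says nothing about how Google weighs the tag. The names `CanonicalFinder` and `canonical_status` are my own, and the trailing-slash handling is a simplifying assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, and may be absent.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonical = attr["href"]


def canonical_status(page_url, html):
    """Return 'missing', 'self', or 'other' for the page's canonical tag."""
    finder = CanonicalFinder()
    finder.feed(html)
    if finder.canonical is None:
        return "missing"
    # Resolve relative hrefs against the page URL and drop any #fragment.
    target = urldefrag(urljoin(page_url, finder.canonical)).url
    page = urldefrag(page_url).url
    # Assumption: a trailing slash difference is treated as the same page.
    return "self" if target.rstrip("/") == page.rstrip("/") else "other"
```

Feed it the URL and the HTML you fetched for that URL. An old untouched article should come back as 'missing', while a scraped copy on a hosted blog platform would come back as 'self' if the software added the tag as described above.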