In an interesting post about memes on Facebook, Lada Adamic, Thomas Lento, Eytan Adar and Pauline Ng show what data science can do when there is a lot of data available to analyse.
A meme is an idea that is readily transmitted from person to person. But we humans are not perfect transmitters. While sometimes we repeat the idea exactly, often we change the meme, either unintentionally, or to embellish or improve it.
Facebook Study: “The Evolution of Memes on Facebook” [facebook.com]
In September of 2009, over 470,000 Facebook users posted this exact statement as their status update. At some point someone created a variant by prepending “thinks that” (which would follow the individual's name, e.g., “Sam thinks that no one…”), which was copied 60,000 times. The third most popular variant inserted “We are only as strong as the weakest among us” in the middle. “The rest of the day” at one point (probably in the late evening hours) became “the next 24 hours”. Others abbreviated it to “24 hrs”, or extended it to “the rest of the week”.
In all, using anonymized data, we detected 121,605 different variants of this particular meme which appeared in 1.14 million status updates.
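The kinds of edits described above (a prefix added, a sentence inserted, a phrase swapped) are the sort of thing a word-level diff can surface. The excerpt does not say how the authors detected variants, so the following is only a rough sketch using Python's standard `difflib`. The function name `word_edits` and the base sentence are made up for illustration; the real status text is not reproduced here.

```python
from difflib import SequenceMatcher


def word_edits(base, variant):
    """Return the word-level edits that turn `base` into `variant`.

    Each edit is a tuple (operation, words_removed, words_added), where
    operation is one of "insert", "delete" or "replace".
    Hypothetical helper, not the method used in the Facebook study.
    """
    a, b = base.split(), variant.split()
    # autojunk=False keeps frequent words such as "the" in the comparison.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            edits.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return edits


# Made-up stand-in for the original status text.
base = "post this for the rest of the day"

print(word_edits(base, "thinks that post this for the rest of the day"))
# [('insert', '', 'thinks that')]

print(word_edits(base, "post this for the next 24 hours"))
# [('replace', 'rest of the day', 'next 24 hours')]
```

Comparing words rather than characters keeps the output readable: each edit corresponds to a phrase a person added, dropped or swapped. At the scale mentioned in the post, a pairwise diff like this would presumably only be run after some cheaper grouping of near-duplicate texts.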