We have thousands of short articles (usually 1-3 paragraphs) on our site, written by a dozen writers. We recently found some articles that are nearly identical to each other (perhaps three words different per paragraph).
Does anyone know of a tool that we could use to go through our database and find articles that are very similar to each other? We're using MS SQL, although we could probably port the data to MySQL for this project. Alternatively, we could export the articles to text files if the tool can't access the DB directly.
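
To make clear what we mean by "very similar", here is a rough sketch of the kind of comparison we have in mind, assuming the articles have already been exported into a Python dict of id to text. It compares overlapping three-word sequences (shingles) using Jaccard similarity. The function names, the shingle size, and the 0.5 threshold are all our own guesses for illustration, not something taken from an existing tool.

```python
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word sequences in the text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_near_duplicates(articles, threshold=0.5, k=3):
    """Compare every pair of articles and return those at or above threshold.

    articles: dict mapping an article id to its text (hypothetical export format).
    Returns a list of (id_a, id_b, score), highest score first.
    """
    sets = {aid: shingles(text, k) for aid, text in articles.items()}
    pairs = []
    for (id_a, sa), (id_b, sb) in combinations(sets.items(), 2):
        score = jaccard(sa, sb)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

This compares every pair, so the work grows quadratically with the number of articles. That is probably tolerable for a few thousand short articles, but we would welcome pointers to something that scales better or handles this more robustly.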