TheMadScientist - 12:17 pm on Mar 21, 2011 (gmt 0)
So, for a big site with hundreds of thousands of pages, is there any way to do this? Any third-party tool or service?
I'm sure a good coder could find a way to make use of one of these:
Of course, to do anything with either you basically have to write a bot and parse the pages it fetches from other sites. That may be beyond many people, but trying it would probably be very enlightening for as many or more, even if the only goal is to detect similar text you already know exists on another site. To do it reliably you really have to work out how to extract the main text from the surrounding template, and that's definitely a challenge...
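
To give a rough idea of what the parsing side involves, here is a minimal Python sketch using only the standard library. It is my own illustration, not how any particular tool or service works. It assumes you have already fetched the HTML, that a second page from the same site is available to stand in for the template, and that word shingles compared with Jaccard similarity are good enough as a similarity measure. The function names (text_blocks, main_text, shingles, similarity) are made up for the example.

```python
import re
from html.parser import HTMLParser


class _TextBlocks(HTMLParser):
    """Collect visible text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._skip = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if text and not self._skip:
            self.blocks.append(text)


def text_blocks(html):
    """Return the list of text chunks found between tags."""
    parser = _TextBlocks()
    parser.feed(html)
    parser.close()
    return parser.blocks


def main_text(page_html, sibling_html):
    """Crude template removal (an assumption, not a robust method):
    drop any text block that also appears on a sibling page of the same site."""
    template = set(text_blocks(sibling_html))
    return " ".join(b for b in text_blocks(page_html) if b not in template)


def shingles(text, k=5):
    """Return the set of k-word shingles, lowercased, punctuation dropped."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(text_a, text_b, k=5):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, k), shingles(text_b, k)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

You would run main_text() on your own page and on the suspect page (each with a sibling from its own site), then pass the two results to similarity(). Identical text scores 1.0 and text with no shared five-word run scores 0.0; where to put the "this is a copy" threshold in between is something you would have to tune by hand. Real templates are messier than this, which is exactly the challenge mentioned above.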