What I'd like is something I can use during development to spider my site and generate a report of how similar the pages are to one another. I have similar tools for checking broken links and reporting anchor text reputation, but nothing I can use to measure similarity.
Any suggestions?
I'm looking for a web spider that generates a page-by-page similarity matrix, both before and after stripping out the HTML tags.
In essence, it would replicate the Google duplicate-content filter and alert the webmaster if any pages don't have enough unique content to be indexed separately.
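
To make the idea concrete, here is a rough sketch of the comparison step in Python, using only the standard library. It assumes the spider has already fetched the pages into a dict of URL to HTML. It scores each pair of pages by Jaccard similarity over 3-word shingles, which is a common near-duplicate heuristic and only my stand-in, not whatever Google actually does. All the function names, the shingle size, and the 0.8 threshold are made up for illustration.

```python
from html.parser import HTMLParser
from itertools import combinations


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_tags(html):
    """Return only the visible text of an HTML document."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def shingles(text, k=3):
    """Set of k-word shingles (k=3 is an arbitrary choice)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_matrix(pages, strip=True, k=3):
    """Map each (url_a, url_b) pair to a score between 0.0 and 1.0.

    pages: dict of URL -> raw HTML (hypothetical input from the spider).
    strip: compare visible text only (True) or the raw markup (False).
    """
    urls = sorted(pages)
    sets = {u: shingles(strip_tags(pages[u]) if strip else pages[u], k)
            for u in urls}
    return {(u, v): jaccard(sets[u], sets[v])
            for u, v in combinations(urls, 2)}


def flag_duplicates(pages, threshold=0.8, strip=True, k=3):
    """List page pairs scoring at or above the (arbitrary) threshold."""
    scores = similarity_matrix(pages, strip, k)
    return sorted(pair for pair, s in scores.items() if s >= threshold)
```

Calling `similarity_matrix(pages, strip=False)` and then `similarity_matrix(pages, strip=True)` gives the "before" and "after" matrices, and `flag_duplicates(pages)` gives the list of pairs to warn about. Comparing every pair is quadratic in the number of pages, so this is only meant for small development sites.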