Forum Moderators: phranque

Identifying Duplicate Content Sites/Non-TLD Sites

Duplicate or Other TLD Site?

1:12 am on Nov 19, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Aug 30, 2002
posts: 2624
votes: 95

I've been working on a project categorising approximately 2 million .eu websites over the last few weeks. One of the final issues is determining whether a website is genuinely a .eu website or a site from another TLD being served on a .eu domain. The theory is that a site belonging purely to another TLD will have no .eu or site-relative <a href= links, whereas a genuine .eu site will potentially have .eu or site-relative anchors. (I've also used the link rel="canonical" element to identify some non-.eu sites, as the canonical element is supposed to be domain specific.)
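To make the theory concrete, here is a minimal sketch of the check in Python, using only the standard library. It is not the code used on the project: the names `LinkCollector`, `host_of` and `classify`, the result labels, and the `min_share` threshold of 0.5 are all assumptions made for illustration. It treats a canonical element pointing at another host as the strongest signal, then counts the hosts of the anchors, with site-relative anchors counting as the site's own host.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect <a href> targets and any <link rel="canonical"> target."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.hrefs.append(attrs["href"])
        elif tag == "link" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower().split()
            if "canonical" in rel:
                self.canonical = attrs["href"]


def host_of(url):
    """Lower-cased hostname with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify(page_url, html, min_share=0.5):
    """Return (label, host). The labels and min_share are illustrative only."""
    parser = LinkCollector()
    parser.feed(html)
    own_host = host_of(page_url)

    # A canonical element naming another host is treated as the strongest signal.
    if parser.canonical:
        canon_host = host_of(urljoin(page_url, parser.canonical))
        if canon_host and canon_host != own_host:
            return ("canonical-elsewhere", canon_host)

    # Count anchors per target host; relative anchors resolve to own_host.
    hosts = Counter()
    for href in parser.hrefs:
        target = urljoin(page_url, href)
        if urlparse(target).scheme in ("http", "https"):
            hosts[host_of(target)] += 1

    total = sum(hosts.values())
    if total == 0:
        return ("no-links", None)
    if hosts[own_host] > 0:
        return ("own-links", own_host)

    # No own-domain or relative anchors: check whether one other host dominates.
    top_host, count = hosts.most_common(1)[0]
    if count / total >= min_share:
        return ("likely-served-from", top_host)
    return ("undetermined", None)
```

The share threshold is there so that a handful of links to stats sites or social networks does not stop a dominant navigation host from being picked up. Note that a copied site which uses only relative anchors would be labelled "own-links" by this sketch, so it only covers the case described above.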

While some outbound links will point to stats sites or social media networks, does the following logic hold up: a site whose apparent navigation links all point to the same non-.eu website is actually a non-.eu site being served as a .eu site, and is therefore a duplicate content website? Or would it be necessary to compare these pages with the corresponding pages on the other TLD website to see whether they are identical and thus duplicate content?
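If the page comparison does turn out to be needed, one possible approach (a sketch only, not something already in use on the project) is to strip the markup from both pages and measure the overlap of word shingles with a Jaccard score. The names `TextExtractor`, `shingles` and `similarity` and the shingle size of 3 are assumptions for illustration. Fetching the two pages is left out, so the functions take the HTML as strings.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def shingles(html, size=3):
    """Set of overlapping word tuples taken from the page text."""
    extractor = TextExtractor()
    extractor.feed(html)
    words = re.findall(r"\w+", " ".join(extractor.parts).lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(html_a, html_b, size=3):
    """Jaccard similarity of the two pages' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(html_a, size), shingles(html_b, size)
    if not a or not b:
        return 0.0  # nothing to compare on at least one side
    return len(a & b) / len(a | b)
```

A score close to 1.0 would suggest the .eu page and the other TLD page carry the same text even if the markup differs. Where to draw the line for "duplicate" is a judgment call and would need testing against known cases.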