Duplicate content

         

Foxus

8:27 pm on Aug 6, 2005 (gmt 0)



Hello

I built a website on a customized CMS (PHP, URL rewriting, etc.).

I have a BIG duplicate content problem: about half of my website has recently lost its snippets in Google. The listings show no title and no description, only the URL, so those pages have no chance of coming up in Google search results for any query :( (duplicate content syndrome, I think)

I have read that when two pages of the same website are more than 70% similar, Googlebot considers them suspect, and the penalty is that the pages are put in the "duplicate content" category.

To test for duplicate content between two pages, I use this French tool (it works with any URL in the world):

[edit: url removed]

In the results you can see two different scores from two different algorithms: the Dice algorithm and the Jaccard algorithm (the Jaccard one is nicer, lol).
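
For reference, the two measures named above are standard set-similarity coefficients. Here is a minimal sketch in Python, assuming each page is reduced to a set of word shingles. The function names, the shingle size k and the whitespace tokenization are illustrative assumptions only; this is not what the tool or Google actually does.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased).

    Assumption: plain whitespace tokenization, no HTML stripping.
    """
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard coefficient: size of intersection / size of union."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def dice(a, b):
    """Dice coefficient: 2 * size of intersection / (size of a + size of b)."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Usage with single-word shingles on two short texts
page_a = shingles("the cat sat", k=1)
page_b = shingles("the cat ran", k=1)
print(jaccard(page_a, page_b))         # 0.5
print(round(dice(page_a, page_b), 3))  # 0.667
```

For the same pair of pages the Dice score is never lower than the Jaccard score (Dice = 2J / (1 + J)), so a fixed cutoff such as 70% means something different depending on which measure is reported.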

Two final questions:

1) Which algorithm does Google use to detect duplicate content? Do you have other links to tools that come closer to reality for testing this (a webmaster tool)?

2) Is more than 70% similarity between two pages the threshold for being treated as duplicate content?

Sorry if this is not the right category for my question, and thank you very much for reading my whole post despite my bad English ;-)

Laurent

[edited by: msgraph at 9:55 pm (utc) on Aug. 7, 2005]
[edit reason] let's leave out third party tools for this [/edit]

Marcia

6:52 am on Aug 7, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There are a lot of ways they can detect duplicate or near-duplicate content or replicated pages, mirrored hosts, etc., and about a dozen great papers and patents out there that tell exactly what they look for. It's a lot more than just duplications in text we have to watch out for.