From what I have read here, the duplicate filter, if it exists at all, kicks in when the page content is between 10% and 15% duplicated by another page. Having said that, I have had many pages stolen, and my original page and the duplicate seem to have lived happily side by side in the SERPs for many months before I forced the thief into submission. Perhaps someone else could chime in with a more accurate appraisal of duplicates.
If a page exists that links to X and Y, then X and Y are similar.
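To make that rule concrete, here is a minimal sketch of plain co-citation counting. It is only an illustration of the idea as stated above, not Google's actual method; the function names and the example domains are made up.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_counts(outlinks):
    """Count, for each pair of targets, how many source pages link to both."""
    counts = defaultdict(int)
    for source, targets in outlinks.items():
        # Deduplicate and sort so each pair is stored once, as (a, b) with a < b.
        for a, b in combinations(sorted(set(targets)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def similar_to(page, counts):
    """Return pages co-cited with `page`, most frequently co-cited first."""
    scores = {}
    for (a, b), n in counts.items():
        if a == page:
            scores[b] = n
        elif b == page:
            scores[a] = n
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical link data: source page -> pages it links to.
outlinks = {
    "portfolio.example": ["client-a.example", "client-b.example"],
    "directory.example": ["client-a.example", "client-b.example", "other.example"],
}

counts = cocitation_counts(outlinks)
result = similar_to("client-a.example", counts)
print(result)
```

Note that nothing here looks at page content: two pages come out as "similar" purely because the same third pages link to both.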
|From what I have read here, the duplicate filter, if it exists at all, kicks in when the page content is between 10% and 15% duplicated by another page. Having said that, I have had many pages stolen, and my original page and the duplicate seem to have lived happily side by side in the SERPs for many months before I forced the thief into submission. Perhaps someone else could chime in with a more accurate appraisal of duplicates. |
"Similar Pages" is totally unrelated to the content of pages.
I think this [google.com] is what he is asking about.
It's a little more complicated than that, Plasma. I submitted a story to the Google Weblog when MSN had just launched their preview search engine. I pointed out that MSNbot's FAQ was almost identical to Googlebot's, word for word. I got a link, and after that the PageRank for my blog shot through the roof, higher than the PR for the blogging community I'm on, and if you searched for similar pages to my blog, the FAQ and help files for the blogging community itself showed up. Google clearly got confused and thought I somehow had something to do with FAQs and help files, thereby deeming me "similar".
The context of the text around links plays a big part in similar pages.
Maybe we are talking about different things.
I thought we are talking about
My site is rather small, ~50 pages
From one page (our portfolio) we link to our customers.
All of the sites linked from that page (ours too) are "related" although they have absolutely nothing in common.
Again, thanks for the opinions.
This may help you visualize the concept of Google "Similar Pages":
[edited by: Marcia at 3:48 am (utc) on June 14, 2004]
[edit reason] Links need to be clickable. [/edit]
what does G actually look at with SERPS at
What are the criteria?
Is it matching on title, description, or keywords?
I think it's more a matter of sites with similar content being linked from the same places. If a number of high-PR documents link to sites A and B, then sites A and B must be similar.
I could be wrong.
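A rough sketch of that guess, in which each linking page contributes its own weight to the pair it co-cites. The `weight` values are hypothetical stand-ins for PR, and none of this is confirmed Google behaviour.

```python
from itertools import combinations


def weighted_cocitation(outlinks, weight):
    """Score each pair of targets by the summed weight of pages linking to both."""
    scores = {}
    for source, targets in outlinks.items():
        w = weight.get(source, 0.0)  # unknown sources contribute nothing
        for pair in combinations(sorted(set(targets)), 2):
            scores[pair] = scores.get(pair, 0.0) + w
    return scores


# Hypothetical data: one heavily weighted hub and one lightly weighted blog.
outlinks = {
    "big-hub.example": ["a.example", "b.example"],
    "small-blog.example": ["a.example", "c.example"],
}
weight = {"big-hub.example": 7.0, "small-blog.example": 2.0}

scores = weighted_cocitation(outlinks, weight)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, score)
```

Under this assumption, a single link from a heavily weighted page can outrank several links from lightly weighted ones.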
OK, I had to put my two cents in, because nobody knows what "Similar Pages" means.
However, I have observed the following, while trying to understand it:
One day I changed the layout of my home page to look like the #1 result: I just organized the tables and the HTML structure the same way #1 did. There was no change in the SERPs, but #1 showed up as a similar page to my home page soon after.
Then, as my website started coming up in the SERPs, the similar pages count fell from 31 (the maximum) to 27. Some of those pages link to me, but others don't, and they have nothing similar to my website.
And the weirdest thing: I have an inner page which contains a link to AdWords, and soon after I added it the page was showing as a similar page of AdWords. Now it's gone.
I have just one UFO-related site, non-commercial for now, and well regarded in that strange field. Google lists about 35 "Similar Pages" to my main page (/index.html) out of a field of 2.4 million UFO pages.
Taking a good hard look at those (I know the turf), my distinct impression is it's mostly a matter of third sites which link to both A + B ('A' being my site, 'B' the "Similar Pages"). I presume that the number and PR of the third parties must factor in.
As for similar content, I think it plays the lesser role if any. One "Similar Page" to mine is a completely unreferenced link to Altavista .. no mention of UFOs. Some of the other "UFO" pages in the 35 bear little resemblance to mine.
Google has made an algorithmic crusade to see who links to who, and quickly. I really doubt they would devote a black hole of artificial intelligence to determine which skate-boarding sites are intrinsically similar to others. The same would go for UFOs and any other field.