There seems to be a "secret sauce" factor involved: the only explanation given is that they "selected these sites by considering many factors".
I am a bit worried about this feature. An associate of mine never sees his page in the same SERPs as mine even though we don't link to each other. The pages are on similar topics and should show up in the same SERPs.
A related: search on his site shows many of the sites in my network, most of which are not on his topic. This leads us to think that Google has somehow merged his site and our site(s) in their index, and furthermore that they now consider them too similar to list in the same SERPs.
This is obviously a concern for those of us who maintain more than one site on a topic, even if we promote them independently. Somehow Google may connect them and render the second site impotent.
There are posts kicking around here somewhere also, that explain it as, "if A and B link to C, then A and B are similar pages", or something like that... might have that mixed up.
It was "if A links to B and C, then B and C are similar pages".
What are the returns for "Similar Pages" [webmasterworld.com]
Yep. For one of my sites, some of the pages showing up as "Similar Pages" to the home page make no sense, and I can't figure out why they are showing. Different languages, no common linkage I can spot, etc. My guess is very few Google users ever click on that, so getting it right isn't a Google priority.
For example, if I search for "keyword1 keyword2" then site A appears somewhere in the top 10. If I search for "keyword2 keyword1" site B appears.
The same 2 sites don't show up anywhere together for competitive keyword searches. One site is always relegated to the lower part of the top 100.
I have been trying to find an explanation, and I can only find two.
1) Google's algorithm can pick out similar content in a very sophisticated way, perhaps via Latent Semantic Indexing. This is a distinct possibility since our sites are similar in their on-page semantics; on the other hand, the prose is completely different, which would seem to rule out LSI. (A rough sketch of the LSI idea is below, after this list.)
2) They are using a map of the links between sites in the index to group sites into similar topics, and they prefer not to list similar sites close together in the SERPs for any particular query. This is also quite possible, as one of the sites affected is on a distantly related topic and uses semantics unrelated to the other sites, apart from the main keywords.
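For what it's worth, here's my rough sketch of the LSI idea from 1), with made-up example text and plain numpy. It's only meant to show the mechanics (term-document matrix, truncated SVD, cosine similarity in the reduced "concept" space), not to claim this is what Google runs.

import numpy as np

docs = [
    "widget reviews and widget buying guide",       # site A
    "guide to buying widgets, widget comparisons",  # site B: same topic, different prose
    "fly fishing tackle and river techniques",      # unrelated site
]

def tokenize(text):
    return [w.strip(",.").lower() for w in text.split()]

# Crude bag-of-words term-document matrix (terms as rows, documents as columns)
vocab = sorted({w for d in docs for w in tokenize(d)})
tdm = np.array([[tokenize(d).count(w) for d in docs] for w in vocab], dtype=float)

# Truncated SVD: keep only the top k "concepts"
U, S, Vt = np.linalg.svd(tdm, full_matrices=False)
k = 2
doc_vecs = (np.diag(S[:k]) @ Vt[:k]).T   # each row = one document in concept space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The two widget sites should score noticeably higher than widget vs. fishing,
# even though their prose only partially overlaps.
print(cosine(doc_vecs[0], doc_vecs[1]))
print(cosine(doc_vecs[0], doc_vecs[2]))

The link-map idea in 2) would work more like the co-citation sketch earlier in the thread, just applied at the site level rather than the page level.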
Have been thinking the same. But how could anything other than a human work that out? Spotting themes by algo is one thing, but what if the prose is completely different?
Do we know if LSI can reach this level of sophistication?
Or is it the impact of links in some way?
J
Johnser, IMO determining page topic without relying on KWs is the main goal in G's use of semantics, though I doubt we're anywhere near that yet...