Past:
Site A1 links to site B1 links to site C1. A1 and C1 are affiliated. C1 died in the SERPs.
Present:
A2 links to C1, and C2 links to A1.
A1 and C1 are no longer affiliated...and presumably G cannot see that they are affiliated. But they used to be.
Question:
Does G at some point "forget" that A1 and C1 were affiliated?
If so, then a site we thought was being filtered by the current algo for its affiliation to another site is instead being filtered for another reason. Perhaps a dup filter or some such thing. All the pages of the poorly performing site show good TBPR, FWIW.
In my view:
Site A1 links to site B1 links to site C1. A1 and C1 are affiliated. C1 died in the SERP's.
This can't be the cause. How would search engines work out from a linking pattern alone that A1 and C1 are related? They would have to keep track of the linking patterns of every site, which is unlikely, though not impossible.
What hilltop says is
------------------------
Two pages are affiliated conceptually if they are authored by authors from affiliated organizations. According to Hilltop, if site A1 and site B1 are affiliated (say, on the same C class IPs) and B1 is affiliated with C1 (maybe because the rightmost non-generic token is the same, e.g. widgets.com and widgets.uk), then A1 and C1 are treated as affiliated too.
Hilltop doc says
In practice some non-affiliated hosts may be classified as affiliated, but that is acceptable since this relation is intended to be conservative.
In short, linking patterns alone are not enough to find out whether two sites are related (unless you have not been smart about it). I wrote an article on this issue some time back: how can search engines track relations between two sites?
They do
1) Using Same C class IP
2) Related:
3) Through whois
Never leave a pattern with links; use the drunken man's path to success.
Have fun
AjiNIMC
Here is a thread I started a looong time ago on the subject: [webmasterworld.com...]
A1 and B1 are on different IP's, but A1 and C1 were on the same C block.
For reference from Hilltop paper (edited out non-relevant points):
We define two hosts as affiliated if one or both of the following is true:
* They share the same first 3 octets of the IP address.
The affiliation relation is transitive: if A and B are affiliated and B and C are affiliated then we take A and C to be affiliated even if there is no direct evidence of the fact. In practice some non-affiliated hosts may be classified as affiliated, but that is acceptable since this relation is intended to be conservative.
In a preprocessing step we construct a host-affiliation lookup. Using a union-find algorithm we group hosts, that either share the same rightmost non-generic suffix or have an IP address in common, into sets. Every set is given a unique identifier (e.g., the host with the lexicographically lowest hostname). The host-affiliation lookup maps every host to its set identifier or to itself (when there is no set). This is used to compare hosts. If the lookup maps two hosts to the same value then they are affiliated; otherwise they are non-affiliated.
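To make the quoted preprocessing step concrete, here is a rough sketch of how a union-find host-affiliation lookup could work. It is only my reading of the quoted text, not anything known about what G actually runs. The GENERIC_TOKENS list, the function names, and the example hostnames and IPs are all made up for illustration, and it groups on the first three octets (per the definition quoted above) rather than on a full IP match.

```python
# Hypothetical list of "generic" hostname tokens; the real list is not given in the thread.
GENERIC_TOKENS = {"com", "net", "org", "edu", "gov", "co", "uk", "www"}


def ip_prefix(ip):
    """Return the first three octets of a dotted IPv4 address (the "C block")."""
    return ".".join(ip.split(".")[:3])


def rightmost_non_generic_token(hostname):
    """Return the rightmost hostname token that is not in GENERIC_TOKENS."""
    for token in reversed(hostname.lower().split(".")):
        if token not in GENERIC_TOKENS:
            return token
    return hostname.lower()


class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Keep the lexicographically lowest hostname as the set identifier.
            if rb < ra:
                ra, rb = rb, ra
            self.parent[rb] = ra


def build_affiliation_lookup(hosts):
    """Map each hostname to its set identifier. `hosts` maps hostname -> IP."""
    uf = UnionFind()
    first_seen = {}  # grouping key -> first hostname seen with that key
    for hostname, ip in hosts.items():
        uf.find(hostname)  # register the host even if it ends up alone
        keys = (("ip", ip_prefix(ip)), ("name", rightmost_non_generic_token(hostname)))
        for key in keys:
            if key in first_seen:
                uf.union(hostname, first_seen[key])
            else:
                first_seen[key] = hostname
    return {h: uf.find(h) for h in hosts}


# Made-up hosts: A1 and C1 share a C block, B1 sits elsewhere.
hosts = {
    "www.a1-widgets.com": "192.0.2.10",
    "www.b1-gadgets.com": "198.51.100.7",
    "www.c1-gizmos.com": "192.0.2.200",
}
lookup = build_affiliation_lookup(hosts)
affiliated = lookup["www.a1-widgets.com"] == lookup["www.c1-gizmos.com"]
```

Note that links never enter into it in this sketch: A1 and C1 land in the same set purely because of the shared C block, and B1 stays on its own no matter who it links to. Transitivity falls out of the union step, since any host that shares a key with one member of a set joins the whole set.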
Perhaps my first mistake was to read this too literally. Because A1 and B1 are not on the same C block, they should not be seen as affiliated. Thus in my reading of the above, they can't connect A1 to C1, since the connection travels thru B1. But if they compare sites that are two generations away instead of one, then the connection between A1 and C1 becomes more apparent, since they are on the same C block and connected by two degrees of separation.
Anyway, we moved C1 to another host many months ago, when it occurred to us that C1 may have been affiliated to A1. Still, no rebound for C1. We don't really see any issues in dup (content, templates, WHOIS, etc.). But there is about 60% overlap of the products/services being offered. So we're left wondering: IF they made the connection and suppressed C1, are they still remembering?
We would have thought that since this is related to algorithmic activity, once the issue is cleared and new calcs are performed, the C1 site rebounds. Not so. Leaving us wondering if C1 continues to be remembered as being affiliated with A1.
We also have other, more subtle (internal pages), evidence that G has become more aggressive than ever at nixing out sites/pages that they deem too similar to one another.
I have no issue with this when the pages are largely dups in terms of what they offer, but when there is only 60% overlap, and the targets are completely different, common ownership alone seems to be a very aggressive definition of what constitutes spam, at least IMHO.
Google has related some of our sites that are not even linked together. I don't know how they did it.
Slydog - Do you have the Toolbar installed with PR & Category switched on?
My guess is that you and other people from your organisation visit these websites every day from the same IP. My opinion is that this could be enough for Google to relate them, as you are sending information to them about your browsing.
IMHO, 60% is high if it's with the same kw's, to the same audience, since that really is just candy coated dup content (even though many advocate multiple sites that basically offer the same thing in order to reduce risk...and that's fine with me).
;-)
However, I don't think 60% is high if you are reaching different groups of people on different searches. Vague analogy: The major cereal makers offer *many* different brands. Is it OK that one offers corn flakes modified and repackaged into about seven different brands, each with its own line extensions, when all those brands represent relatively minor tweaks to one core product, and in fact they are about 85% the same? I think so; they each appeal to different tastes and audiences, and hey, it's a free market. (Of course, many Europeans find our cereal aisles absurd, but that's another argument.)
But we go OT here a bit. I'll start another one on this perhaps.
Total Paranoia, yes I do believe the TB can play a role, FWIW, but we actually are SO paranoid that we don't use it. On your other comment, whether or not a site "deserves" to be shown in G's SERP's is of course for G to decide. I really don't know what they would think if they did a manual inspection of these two sites.
Does G at some point "forget" that A1 and C1 were affiliated?
Unless Google is maintaining an index of "affiliations", yes, they would be forgotten.
I can't see them maintaining that database, because I'm not sure it provides any meaningful information to them. Not that they don't have the ability, but just that I don't think it would be useful.
My vote is yes, it would be forgotten with the way things are now.
60% is high
In my experience, it's high either way, caveman, regardless of the similarity or difference of sectors/audience.
Jake, when you say 60% is high, I presume you mean with respect to what G is likely to dislike?
MB, you may well be right, especially since it looks like I can't write this off to the algo anymore.
However, a word of caution to any/all paying attention: When I refer to "60% overlap of products/services being offered," I'm talking about the goods/services themselves (like two electronics retailers offering overlapping inventory to the extent of about 60%). Our codes are unique, we don't use feeds or straight manufacturer blurbs, etc, so in order for a SE to pick this up and tag it as duplicate content, they would literally have to compare inventory SKU's, which seems unlikely to me.
There's about a 20% dup in some parts of the code sitewide, but in the center to bottom of the HTML, so that also seemed unlikely to me.
OTOH, I'm sure it's more than just bad luck. ;-)
Hmmmm ...
If this were true, surely Google would consider that everyone here was affiliated with Webmaster World, since most of us visit every day.