|New Google Theory|
How Google reads incoming links
I've been doing some research on this latest update and how it treated some of the sites in the categories I target, and I have a theory that I'm hoping some of you can either support or tear apart, or both. It involves how Google counts incoming links to a web site.
In the past, you could take a 300-page web site (Site A) and place a link on every page to another site (Site B), and as a result Site B would receive a huge boost. My theory is, now, no matter how many links a site may have from another web site, they will only count as one vote. So the 300 links from Site A to Site B now only count as 1, even though they still display all of them on the backlinks page for Site B (perhaps to confuse us). I have seen A LOT of evidence that supports this... and it could explain why some sites have been crushed while others haven't moved much at all. Obviously there were some other changes involved in this update, but could this be one of them?
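A minimal sketch of the difference between the two counting schemes described above. This is only an illustration of the theory, not anything Google has confirmed; the function names and the example.com-style hosts are made up for the example.

```python
from urllib.parse import urlparse


def host(url):
    """Return the hostname part of a URL (hypothetical helper)."""
    return urlparse(url).hostname


def votes_per_link(links, target_host):
    """Old behaviour as described in the post: every link counts as a vote."""
    return sum(1 for src, dst in links if host(dst) == target_host)


def votes_per_site(links, target_host):
    """The theory: each linking site counts once, however many links it has."""
    return len({host(src) for src, dst in links if host(dst) == target_host})


# Site A links to Site B from each of its 300 pages; Site C links once.
links = [
    (f"http://site-a.example/page{i}.html", "http://site-b.example/")
    for i in range(300)
]
links.append(("http://site-c.example/links.html", "http://site-b.example/"))

raw_votes = votes_per_link(links, "site-b.example")   # 301
site_votes = votes_per_site(links, "site-b.example")  # 2
```

Under the per-site count, Site A's 300 links and Site C's single link carry the same weight, which is the effect the theory predicts.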
We have seen that too. I think you are right, that is one of the things they may have changed. No idea why they did it, anyhow. I can't see what problem they are trying to fix with that. Any ideas?
To me it sort of makes sense if they did this.
It would seem that you would have an "inflated" PageRank if every page from the same domain counted. That would mean a site with 10,000 pages could create 10,000 "votes" for a site and have every page count towards another site's PageRank.
It seems logical that all those pages belong to that one domain, so that should count as one vote.
I can see where there might be exceptions to this rule, but I think that might answer Marcos's question about what "problem" there was to fix.
Just my two cents.
Yes, it does make sense... which is why I was kind of shocked that it was working on Google as recently as last month. It could be tied in with less emphasis on anchor text, but I think there may be something more here, as my theory explains.
Interesting find. It would make a lot of sense for them to do that, based on their history of penalizing cross linked domains.
>That would mean a site with 10,000 pages could create 10,000 "votes"
No. Not all pages had the same "vote" power, and the more "votes" a page placed, the less individual power each vote was awarded. That part was working pretty well. Besides, a 10,000-page site must have had "some" more weight than a 10-page site. Think of Geocities.
Anyhow, that "some" weight seems to have been reduced, somehow.
[edited by: Marcos at 12:46 am (utc) on Oct. 3, 2002]
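The point that a page's vote gets weaker the more links it places matches the publicly described PageRank formula, PR(A) = (1 - d) + d * sum(PR(T) / C(T)), where C(T) is the number of outbound links on page T. Below is a small sketch of that published formula only. The graph, the page names and the iteration count are assumptions for illustration, and nothing here claims to be what Google actually ran at the time.

```python
def pagerank(graph, d=0.85, iterations=50):
    """Iterate the published PageRank formula on a small link graph.

    graph maps each page to the list of pages it links to.
    """
    pages = list(graph)
    pr = {p: 1.0 for p in pages}
    for _ in range(iterations):
        new_pr = {}
        for p in pages:
            # Each linking page q passes on its rank divided by its link count.
            inbound = sum(pr[q] / len(graph[q]) for q in pages if p in graph[q])
            new_pr[p] = (1 - d) + d * inbound
        pr = new_pr
    return pr


# "hub" splits its vote three ways; "solo" gives its whole vote to "x".
graph = {
    "hub": ["x", "y", "z"],
    "solo": ["x"],
    "x": ["hub"],
    "y": ["hub"],
    "z": ["hub"],
}
ranks = pagerank(graph)
```

Here "x" ends up ahead of "y" and "z" because it also receives the undivided vote from "solo", while "hub" has to split its vote across three links.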
It's quite possible that Google has done as you say, Jay, but it's not the whole story. My site dropped from 7 to 13 (not a huge drop, but inconvenient). I can't say for sure how many of the links to my site are duplicates, but the number would be a fairly trivial percentage -- I'd guess around 6 or 7%.
Of course you weren't claiming that consolidating the votes from one site was the only change that had taken place, so I'm not contradicting you. I'd just like to know what the heck the other changes were. My site has way more content and relevance than any of the 12 sites above it.
I don't agree. It's not right.
A 10-page site placing one link to you has the same PageRank-passing power as most of the hundreds of /~user/doejoe personal pages of any regional ISP that may be linking to you. The voting power of /~user/student/JSmith is worthless now if he is "voting" for the same site as any other user. You may be the most popular guy around Geocities/Angelfire/Terra personal pages and get no credit for that. That is not right. And what about the content generators? LibraryOfCongress.gov (fake) pointing 10,000 times to Shakespeare.org (fake) has the same value as one mention of TheTurnerDiaries.nazi (fake). Plain wrong. Not right. CNN mentioning Microsoft 10,000 times has the same value as CNN.COM mentioning My15MinutsOfGlory.com (fake) one time. Not right either.
OK, this is what we found, and it is the most complete explanation we have.
Well, this is what we have found so far. Mr. Vitaplease's comments here [webmasterworld.com] seem to be mostly right. In order to fight “Googlebombing” and “PageRank for sale”, they may have downgraded results when the keyword is not in some important part of the on-page text (to stop Googlebombing), and the weight of anchor text in ranking may have been tuned down (to stop PageRank monetization), especially if the linking pages do not have a good PageRank to begin with. Internal links, and links from interlinked pages, may have been tuned down also.
Still, we have seen as many as 200 competitive regional cats easily dominated by unscrupulous Dmoz editors. We have done some testing on that.
To test whether we really are facing a Dmoz-dominated update, we have set up a small Aspseek-based search engine, a GNU search engine with a crude PageRank-like ranking system. We have indexed around 1,500,000 pages, using as the starting point 700 very competitive Dmoz cats, including up to 250 pages per site, following up to 10 outside links, with up to 100 pages per outside link.
What we found was that 59% of our top 20 results on the 100 cat-related competitive keywords were also top 20 in Google's new index, and 26% of our top 10 were also in Google's top 10.
But we must also say that we have not been able to find such a compelling relationship using non-competitive categories. A 2,000,000-page index with non-competitive regional cats, using non-competitive keywords, showed a very small correlation between top 10, top 20, and even top 50 results.
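For anyone wanting to repeat this kind of comparison, here is a rough sketch of how a top-N overlap figure like the 59% and 26% above could be computed. The post does not say exactly how the numbers were produced, so the function names, the averaging across keywords and the sample data are all assumptions.

```python
def top_n_overlap(ours, theirs, n):
    """Fraction of our top-n URLs that also appear in the other engine's top-n."""
    ours_top = ours[:n]
    theirs_top = set(theirs[:n])
    if not ours_top:
        return 0.0
    return sum(1 for url in ours_top if url in theirs_top) / len(ours_top)


def mean_overlap(results_by_keyword, n):
    """Average top-n overlap across keywords; each value is an (ours, theirs) pair."""
    scores = [
        top_n_overlap(ours, theirs, n)
        for ours, theirs in results_by_keyword.values()
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up ranked result lists for two keywords (test engine first, Google second).
sample = {
    "keyword one": (
        ["a.example", "b.example", "c.example", "d.example"],
        ["b.example", "a.example", "x.example", "y.example"],
    ),
    "keyword two": (
        ["e.example", "f.example", "g.example", "h.example"],
        ["p.example", "q.example", "e.example", "r.example"],
    ),
}

overlap_top2 = mean_overlap(sample, 2)  # (1.0 + 0.0) / 2 = 0.5
overlap_top4 = mean_overlap(sample, 4)  # (0.5 + 0.25) / 2 = 0.375
```

Note that this measures set overlap only; it ignores the order of results within the top N.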
So, our working theory right now is: yes, as Don Vitaplease and others are pointing out, small changes, probably made in order to fight both Googlebombing and PageRank commoditization, have affected the index accuracy in many different ways.
We think the index is unbalanced, or at least much more unbalanced than the last one, and, as a result, the weight of some previously not-so-important characteristics is souped up, opening the door for abuse.
In our case, souped-up Dmoz weight is the main factor now, prompting my initial, inaccurate claim about Dmoz empowerment. It does not seem to be the case. Google has not chosen a drastic reduction of popular linking, as I voiced in a previous post. That may be the final effect of the algo change in some of the most competitive cats, but probably not the desired effect. So much for pro-AdWords, anti-webmaster/SEO conspiracy theories, and my apologies to the Googleserfs for so vehemently suggesting that (just in case they cared :) ). I hope also Herr Googleguy can now stop laughing at me :) :)
But we do think this update and the changes made are, to say the least, unbalanced, and the new algo is rampantly open to easy abuse. Let's hope Google's good old PhD common sense returns soon, and a new, improved update takes place as soon as possible. In the meantime, I guess we can spend our time pointing out spammers to Google; they are easy to find now: usually sitting at a keyword near you, between #1 and #10. ;)
[edited by: ciml at 11:56 am (utc) on Oct. 3, 2002]
For what it's worth, my sites had very few cross links and most pages went up or stayed about the same.
If page rank is delivered to the "page" linked to, rather than the "domain/site" as a whole, which is generally accepted, how would this work?
I think it's a nice theory, but is that the hole in it? If another site points to different pages on another site, how does Google decide which of those pages should get the PR benefit? OK, I guess they could simply confer it on the home page, but surely that would reduce the index's ability to find inner pages, which Google says is one of its strengths?
I just searched Google for "weather", and # 2 was the 404 page of the # 1 site. Could this be a clue as to what's going on with the algo?
Similar to chiyo’s concerns about how the theory would determine which pages get PageRank, I am also wondering which pages will give PageRank in a new theory that minimizes PageRank flow from a site that has multiple links to another site.
As a simple example, assume Site C has two pages (C1 and C2) and Site D has two pages (D1 and D2). If C1 links to D1 and C2 links to D2, then which pages give PageRank and which pages get PageRank, and how could it be done in a way that transferred less total PageRank than the traditional PageRank theory would indicate?
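One purely hypothetical answer to the C1/C2/D1/D2 question: treat each pair of sites as having a single vote and share it equally among the individual links between them. The sketch below only counts votes and ignores the PageRank of the linking pages. The helper names and the c.example and d.example hosts are invented for the example, and nobody in this thread has evidence that Google works this way.

```python
from collections import defaultdict
from urllib.parse import urlparse


def site_of(url):
    """Treat the hostname as the 'site' (an assumption for this sketch)."""
    return urlparse(url).hostname


def split_site_votes(links):
    """Give each (source site, target site) pair one vote, split across its links."""
    pair_counts = defaultdict(int)
    for src, dst in links:
        pair_counts[(site_of(src), site_of(dst))] += 1

    credit = defaultdict(float)
    for src, dst in links:
        # Each link carries an equal share of the single site-to-site vote.
        credit[dst] += 1.0 / pair_counts[(site_of(src), site_of(dst))]
    return dict(credit)


links = [
    ("http://c.example/c1.html", "http://d.example/d1.html"),
    ("http://c.example/c2.html", "http://d.example/d2.html"),
]
credit = split_site_votes(links)
# D1 and D2 each receive half a vote; Site D as a whole receives one vote.
```

With this scheme inner pages still get credit, but the total passed from Site C to Site D never exceeds one vote.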
NotSure. That page has a PR of 9, and lots of inbounds with the word "snip" in their title tags and are generally "snip" themed, among other things.
Voluntarily snipped by martinibuster.
Oh, one thing that hasn't been mentioned is the distribution of PR within a site's own pages. I tend to think that it's still working, but with diminished force, and with greater emphasis on title tags. Here is why I think this:
I no longer ranked for my keywordphrase. Then I noticed in my logs that a surfer found me on Google using "keywordphrase web page" instead of what I optimized for, "keywordphrase web site".
I thought it was weird, until I noticed that I had changed the title of one inner page to "keywordphrase page." It was an experiment I tried, just on a lark, to have an unoptimized keyword sitting in my title tag.
I don't have "keywordphrase web page" anywhere else, apart from in that title tag, and "web page" twice in the body text of that particular page.
Based on that, I'm inclined to believe that even if PR isn't circulating within one's own site, the general theme, especially as hinted at by the title tags of those inbounds, is having a stronger effect.
|My theory is, now, no matter how many links a site may have from another web site, they will only count as one vote. So the 300 links from Site A to Site B now only count as 1,.. |
You put up an interesting thought – good thinking!
Personally I could possibly relate some of my ranking changes to some similar effect.
1. However, if that were the case, people having ten DMOZ listings could see a dramatic drop in Pagerank/rankings. As DMOZ is one site, it would count as one vote.
My main site has several DMOZ listings and I have not yet seen such an effect.
Nor do the frequent postings here at WebmasterWorld conclude that DMOZ counts less, rather the contrary.
Then again Google could decide to leave DMOZ out of such a discount. ;)
Same goes for multiple independent website hostings such as Geocities and others.
Also, Google could decide to put in some kind of neutralisation effect if this happens from one site to the other after a certain threshold.
Let's say you are a webmaster signing with a link towards your site from every page of all your clients' sites. Google could say, above X links from a site towards one page of the receiving site we just do not count the extra effect anymore.
2. It would take rather intricate PageRank distribution calculations to distribute 300 links (votes) as one vote to 300 other separate pages on another site – proportionally. On the other hand, maybe Google just values each individual link as the average of the 300 links, PageRank/ranking wise.
I would say Google adds a separate properties file to every site, adding general site to site linking properties.
3. I had always hoped Google would neutralise reciprocal crosslinking between multiple sites over multiple links towards each other (multiple site interlinking) [webmasterworld.com]. My observations say they have not done this yet on a one-link-to-one-site basis, other than for the over-blatantly clear linkfarm advertising setups (the January PR0 surprise).
Your algo tweak suggestions would take care of a good part of this effect.
Should the neutralisation of “one-link-to-many-sites-all-linking-back-with-one-link” (but making sure to also link out to other independent sites) happen in a future update, SERPS should look even more dramatically different than after this update.
4. The foremost question is, does Google not cherish every single vote as a vote (if possible)? I would say yes, if these votes are cast from a webpage for which other independent voters (external sites) have voted.
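The threshold idea from point 1 above (ignore anything beyond X links from one site) can be sketched as a simple cap. The cap values, the function name and the per-site link counts below are made up; this only shows how such a rule would behave, not that Google applies one.

```python
def effective_links(links_per_site, cap):
    """Count inbound links, ignoring anything beyond `cap` links from one site."""
    return sum(min(count, cap) for count in links_per_site.values())


# Hypothetical inbound link counts towards one page, grouped by linking site.
inbound = {
    "client-1.example": 400,  # site-wide "signature" link on every page
    "client-2.example": 3,
    "directory.example": 10,
}

raw_total = sum(inbound.values())              # 413
capped_total = effective_links(inbound, 10)    # 10 + 3 + 10 = 23
one_vote_total = effective_links(inbound, 1)   # 3
```

Setting the cap to 1 reproduces the "one vote per site" theory from the start of the thread, while a larger cap would leave a handful of multiple listings (the DMOZ case) untouched.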
|In the past, you could take a 300-page web site (Site A) and place a link on every page to another site (Site B), and as a result Site B would receive a huge boost. |
I had a site which had incoming links from about 5 other sites with 900 to 3,000 pages each. A link on each page. "Powered by ..." kinda thing.
Didn't notice any huge boosts. Had PR5 the whole time. Just like now.
But did you not always vote towards the same page with your "Powered by ..." kinda thing?
Also did you not always use the same anchor text?
Was your signature not always at the bottom of the page?
I think Google took care of that type of signing long before.
vitaplease, it's not always the same and most of the time not at the bottom, but I get your point.
But I don't see a point for Google to do that.
People will just start using hosts instead of directories.
I manage a 150 page web site that is doing amazingly well in Google (only thanks to free advice in Webmasterworld and other forums :) ).
The site suddenly got 100+ extra links in the April update from one single web site. That site placed us in a template with the best possible words in the link text. I didn't even know they existed! You can just imagine the excitement checking the link: in Google and finding that a site with page rank 8 has decided to put over 100 links to us. Whoohoo!
Regarding the topic of this thread: I have not noticed the slightest indication that Google now counts links differently from before. The site that decided to link to us actually has 25% of all our incoming links. Of course it is not enough to judge from only one site, but at least it brings some extra info to the discussion.
By the way, I just checked some of the 100 pages that link to us and I get a 404 on every single one of them and the home page is broken.
Susanne: >>The site suddenly got 100+ extra links in the April update from one single web site. <<
Is the site's theme related to your site's theme?
Well, not really. Our site is about scuba diving and the other site is a huge travel site covering a large chunk of Africa and with plenty of topics. I suppose they mention diving here and there but it would be fairly diluted because of the other content.
Yes, your theory would help to explain results that I have seen. But, so would increasing the importance of the themes being related for the linking and linked sites. I am leaning towards this latter explanation... less complicated.
If the theory about links from one domain being counted as one is true, how do you think subdomains are treated now? Would links from sub1.domain.com and sub2.domain.com be counted as one "vote"?
Anybody's guess on that one. The data I am working with does not have any subdomains in it.
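To make the subdomain question concrete, here is a naive sketch of the two possible groupings: by full hostname (each subdomain is its own voter) or by the last two labels of the hostname (all subdomains collapse into one voter). The helper names are invented, and the "last two labels" rule is a crude assumption that breaks for suffixes such as co.uk. How Google groups hosts, if at all, is unknown.

```python
def naive_site_key(hostname, labels=2):
    """Collapse a hostname to its last `labels` dot-separated labels (crude)."""
    parts = hostname.lower().rstrip(".").split(".")
    return ".".join(parts[-labels:])


def distinct_voters(hostnames, labels=None):
    """Count voters by full hostname, or by collapsed key when `labels` is given."""
    if labels is None:
        return len({h.lower() for h in hostnames})
    return len({naive_site_key(h, labels) for h in hostnames})


voters = ["sub1.domain.com", "sub2.domain.com", "www.other.example"]

by_host = distinct_voters(voters)       # each hostname votes separately
by_domain = distinct_voters(voters, 2)  # sub1 and sub2 collapse into domain.com
```

If links were grouped by hostname, moving pages from directories to separate hosts would restore the extra votes, which is the workaround suggested earlier in the thread.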