
Google News Archive Forum

    
Penalty for more links than average?
Deviating from average could be dangerous.
IITian




msg:214282
 10:04 pm on Dec 19, 2003 (gmt 0)

Google knows which sites are commercial and which are not (by looking at the URLs in the business/commercial sections of the major directories).

For business and non-business sites it can compute statistics - median number of links, average PR of links, standard deviation of these values, percentile values, and so on - as a function of site age. (Age being, approximately, the time since Google first learned of the site's existence.)
------------------------------
Example:
Assume:
For a site that is 1.5 years old:
Median number of links for non-commercials: 12
Median for commercials: 140

Let's say it finds a site with 2200 incoming links. Its chart tells Google that the chance of that happening for a typical commercial site is 0.5%. That means either the site is very Good/Popular (think of the links a Lord of the Rings site attracts), or it has done very aggressive SEO.

Now it looks at the median PR transfer for that site, which turns out to be 0.5 (normalized PR divided by the number of outgoing links, e.g. a PR4 page with 1000 outgoing links), while the median for commercial sites is 1.2. That places it in the bottom 1% in terms of median PR transfer, confirming that it is not a Good/Popular site but rather one that has been scraping the bottom of the barrel, picking up extra PR from low-PR pages with many outgoing links.

Result: show the external PR as the same as before - say 7 - but use a value of 2 for internal computation purposes.

Conclusion: Don't be too aggressive in getting links.

(All speculation, of course. :-))
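
For what it's worth, here is a minimal Python sketch of the kind of outlier check speculated about above. The category samples, the percentile thresholds, and the flag_link_profile name are all made up for illustration; nothing here is a known Google method.

    # Hypothetical reference data: backlink counts and PR-transfer values
    # observed for commercial sites of roughly the same age (~1.5 years).
    commercial_link_counts = [90, 120, 140, 150, 210, 300, 450, 800]   # made up
    commercial_pr_transfer = [0.8, 1.0, 1.2, 1.3, 1.5, 1.7, 2.0, 2.4]  # made up

    def percentile_rank(value, sample):
        """Fraction of the sample that is <= value (a rough percentile)."""
        return sum(1 for x in sample if x <= value) / len(sample)

    def flag_link_profile(link_count, pr_transfer,
                          counts=commercial_link_counts,
                          transfers=commercial_pr_transfer):
        """True if the site looks like an aggressive link builder:
        far more links than its peers, but far less PR per link."""
        too_many_links = percentile_rank(link_count, counts) > 0.99
        too_little_pr = percentile_rank(pr_transfer, transfers) < 0.01
        return too_many_links and too_little_pr

    # The example from the post: 2200 backlinks, median PR transfer 0.5
    # (e.g. a PR4 page with 1000 outgoing links passes very little per link).
    print(flag_link_profile(2200, 0.5))  # True under these made-up numbers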

 

ThomasB




msg:214283
 8:08 pm on Jan 25, 2004 (gmt 0)

The more popular a site is, the worse its quality. Is that what you want to say?

nileshkurhade




msg:214284
 8:14 pm on Jan 25, 2004 (gmt 0)

The theory doesn't seem to work; it has many loopholes. To start with:
What if the site is not listed in any directory?

theitboy




msg:214285
 8:31 pm on Jan 25, 2004 (gmt 0)

I think your basic idea is correct, i.e. the over-optimization penalty is probably related to spikes that lie at the extremes of a bell curve, but certainly there are many more factors in assessing this than just link popularity.

nileshkurhade




msg:214286
 8:49 pm on Jan 25, 2004 (gmt 0)

Google is basically software that is constantly updated by extremely intelligent human beings. Being software, any rule it applies has to hold across the entire cross-section of web sites. If any of the above held true, there could not be a single site that defies the law of averages. The problem for Google, therefore, is that it cannot apply such generalised rules, or else an entire part of the internet may go missing from its database.

But this doesn't mean there are no filters, or that penalties are never applied. When filters are applied, a lot of innocent people also get affected. For example, if links.htm pages are blacklisted, then every webmaster with even one backlink from a links.htm page gets affected, whether they are exchanging links or not.

agent10




msg:214287
 8:57 pm on Jan 25, 2004 (gmt 0)

Well, certainly if you look at some travel sectors tonight, a high percentage of page-1 sites have very few backlinks, generally with the exception of PR8 sites.
So maybe this is another filter, or just a coincidence during the dance.
Looking at the very low backlink counts of high-performing sites as of tonight, many highly regarded sites with good backlink numbers but lower than PR8 may be lost!

IITian




msg:214288
 9:31 pm on Jan 25, 2004 (gmt 0)

The more popular a site is, the worse its quality. Is that what you want to say?

No, I didn't say that. However, if the number of links is abnormally high, and the quality of links (in terms of PR) is abnormally low, one has to think again.

To give a "real-life" analogy:
A car company drastically increases its sales - say by 100% - over last year's. Things look very good and the sales manager gets a fat bonus, until someone finds out that the average profit per car sold has gone down by 90% compared to the figure for last year. It means there must have been factors other than the natural high quality of that car attracting buyers. A later investigation finds that the higher sales were mostly due to a $3000 incentive and to selling cars to people who couldn't afford to buy any car. [On the other hand, another company increased its sales by 30% and increased its profit per car by 20%.]

(PS: A future scenario: those people default on their loans and the car company declares bankruptcy. The sales manager vanishes. ;) )

A site may be more popular, but why should that lead to a lower quality of links?

bird




msg:214289
 9:53 pm on Jan 25, 2004 (gmt 0)

if the number of links is abnormally high, and the quality of links (in terms of PR) is abnormally low, one has to think again.

Then why does Yahoo rank so well in Google?

It has an abnormally high number of backlinks.
Probably more than 99.9 % of those links come from pages with almost no PageRank (most sites linking to Yahoo are of the "me and my dog" variety).

IITian




msg:214290
 10:51 pm on Jan 25, 2004 (gmt 0)

Then why does Yahoo rank so well in Google?

I think we have to compare Yahoo with similar sites. It is difficult to find such sites but just to give an example:

Consider the web hosting business. Yahoo is quite interested in it. Its home page mentions it at least twice, plus a few related terms like domain and personal web site, and the link leads to a PR8 web hosting page. No SE will mistake that page for anything but a web hosting page. Yet in the serps for that phrase its competitors - a few with PR8 pages, but many with only PR6-PR7 - manage to do better. (Yahoo seems to appear on the 4th page.) Other factors like anchor text are involved too, but part of the reason could be that the average PR of the backlinks to those sites is likely higher than the average PR of the backlinks to Yahoo.

1milehgh80210




msg:214291
 11:03 pm on Jan 25, 2004 (gmt 0)

the number of links is abnormally high, and the quality of links (in terms of PR) is abnormally low, one has to think again.

Well, doesn't this make it even harder for new sites to EVER get going?
- unless the content is so fantastic it (somehow?) gets found anyway.
- unless the site is put up by someone with high-PR (PR5+) sites they can direct links from.

bird




msg:214292
 12:28 am on Jan 26, 2004 (gmt 0)

I think we have to compare Yahoo with similar sites.

And how does Google determine which sites are "similar"?
By counting which sites have similar numbers and quality of backlinks?

As you noticed, it's all about statistics. If you use statistics to group sites into categories, then you can't use the same statistics to determine "unusual" items within each group. And statistics are pretty much the only thing Google has to figure out anything meaningful, which makes your suggestion extremely difficult to implement. I'd even venture to say it's not possible with reasonable effort.

Time to put yet another conspiracy theory to rest... ;)

IITian




msg:214293
 1:01 am on Jan 26, 2004 (gmt 0)

As you noticed, it's all about statistics. If you use statistics to group sites into categories, then you can't use the same statistics to determine "unusual" items within each group.

I didn't mention how Google determines "similar." Let me illustrate with a simpler example. Let's say Google looks at the incoming anchor texts and puts pages into a cluster it calls "travel hawaii" if over 10% of their links contain those two words. This creates a class of "similar" pages for determining the serps for the keyphrase "travel Hawaii."

Now it looks at those pages and their keyphrase densities for that phrase and finds a normal distribution - say with a median kw density of 7% - and about 20 of the 5000 sites have values above 30%.

It is possible that some of these sites are genuinely not spamming and the high kw density arose from other factors; Google has other criteria to check for this and gives 5 of them the benefit of the doubt, but the remaining 15 are found "guilty" and penalized. Some innocent sites are going to get penalized and some spammers are going to do fine - that is a fact of life when statistics are applied automatically. Things need not be 100% correct, and most surfers won't notice.

So Google can use one set of criteria to determine which sites are "similar" and another to find the sites in that group that deviate from the norm.

Of course, Yahoo's case is more complex, but Google has years of data with which to make path-dependent decisions, and it can afford to be wrong many times.
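
A rough Python sketch of the two-step process described above: first cluster pages by incoming anchor text, then flag the density outliers. The 10% anchor-text share and the 30% density cutoff come from the hypothetical example; the page structure, function names, and sample data are made up.

    def in_cluster(page, phrase="travel hawaii", share=0.10):
        """Step 1: put a page in the "travel hawaii" cluster if over 10%
        of its incoming anchor texts contain the phrase."""
        anchors = page["incoming_anchor_texts"]
        hits = sum(1 for a in anchors if phrase in a.lower())
        return bool(anchors) and hits / len(anchors) > share

    def kw_density(page, phrase="travel hawaii"):
        """Step 2: keyword density = words belonging to the phrase
        relative to total words on the page (a crude approximation)."""
        text = page["body_text"].lower()
        words = text.split()
        occurrences = text.count(phrase)
        return len(phrase.split()) * occurrences / max(len(words), 1)

    def suspicious(pages, cutoff=0.30):
        """Flag cluster members whose density sits far above the norm;
        some other (unspecified) criteria would then clear the innocent."""
        cluster = [p for p in pages if in_cluster(p)]
        return [p["url"] for p in cluster if kw_density(p) > cutoff]

    # Tiny made-up example:
    pages = [
        {"url": "a.example", "incoming_anchor_texts": ["travel hawaii deals"] * 5,
         "body_text": "travel hawaii " * 40 + "beaches " * 60},
        {"url": "b.example", "incoming_anchor_texts": ["nice blog"] * 5,
         "body_text": "my dog and me " * 50},
    ]
    print(suspicious(pages))  # ['a.example'] under these made-up numbers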

nakulgoyal




msg:214294
 6:51 am on Jan 26, 2004 (gmt 0)

The old saying.....comes into the picture: SLOW and STEADY wins the RACE. :-)

PeterD




msg:214295
 7:57 am on Jan 26, 2004 (gmt 0)

IITian, would the clustering and kw or backlink analysis be done on the fly, or during the monthly update? Either way, it seems expensive in terms of computation or storage.

Casey_Eno




msg:214296
 9:34 pm on Jan 26, 2004 (gmt 0)

I'm not buying into this... at least not yet, though I wish Google would penalize it!

In my niche I have noticed several webmasters create half a dozen or so sites and crosslink to every one of them from every single page. The result? Each site has 500+ links, all from PR 4, 5, or 6 pages.

What does that mean? It means this group of sites sits atop the SERPs for numerous highly competitive terms!

It's obvious to me that there is currently no penalty for over-linking. This should be easy to fix - why do they permit it to exist?

Easy_Coder




msg:214297
 10:16 pm on Jan 26, 2004 (gmt 0)

I am tracking the top 15 results for a popular search phrase on a daily basis, and the link counts for those top 15 on any given day range anywhere from 0 to 2200 depending on the site, averaging over 400 for the group of 15.

bird




msg:214298
 1:01 pm on Jan 27, 2004 (gmt 0)

It's obvious to me that there is currently no penalty for over-linking. This should be easy to fix - why do they permit it to exist?

If it ain't broken, don't fix it.
As far as I know, it's not a crime to have more links to your site than the competition.

Casey_Eno




msg:214299
 1:07 pm on Jan 27, 2004 (gmt 0)

"As far as I know, it's not a crime to have more links to your site than the competition. "

WELL if all those links are coming from half a dozen sites then IT darn well should be. heh!

Sites should not get an extra boost for having 50+ links coming from one site. This excessive crosslinking is what I was talking about. It is something Google clearly states should not be done in their webmaster guidelines!

Don't participate in linking schemes designed to increase your ranking or PageRank.

dwilson




msg:214300
 1:14 pm on Jan 27, 2004 (gmt 0)

When recommending a penalty like this, think of Amazon.

Amazon has a huge number of links coming into it. It has lots from little sites -- sites that hope to sell just a few items/quarter. It has lots of links coming in from many of those sites -- links to individual items advertised.

And there are other examples, I'm sure.

bird




msg:214301
 8:14 pm on Jan 27, 2004 (gmt 0)

>>"As far as I know, it's not a crime to have more links to your site than the competition. "

WELL if all those links are coming from half a dozen sites then IT darn well should be. heh!

So this whole thread is essentially just about wishful thinking? ;)

Sites should not get an extra boost for having 50+ links coming from one site. This excessive crosslinking is what I was talking about. It is something Google clearly states should not be done in their webmaster guidelines!

This may be news to some, but more than a year ago we had heated discussions here about "bad neighbourhoods". It was recommended by GG not to link *into* such neighbourhoods, or your page would be put into the same category. Links out of a bad neighbourhood are still likely ignored today.

So what is a bad neighbourhood? The most convincing explanation is about groups of pages (pages, not sites!), which define closed circular linking patterns. Once such a group is identified, which technically is relatively simple, the result depends on whether and by how much those internal links outnumber incoming links from outside the group. The fewer external links there are, the less value is given to the internal ones. Note that this is not about the number of links, just about the relation between different types of links. And it has nothing at all to do with "similarity" of any kind.
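
A minimal sketch of that mechanism, under the assumption that the "closed circular linking patterns" can be approximated by strongly connected components of the page-level link graph. The weighting formula is made up, and the networkx library is assumed to be available; this is not bird's or Google's actual method.

    import networkx as nx

    def internal_link_weights(edges):
        """For each closed circular group (strongly connected component with
        more than one page), weight its internal links by how many links the
        group receives from outside: external / (external + internal), so a
        group with no outside links gets weight 0."""
        G = nx.DiGraph(edges)
        weights = {}
        for scc in nx.strongly_connected_components(G):
            if len(scc) < 2:
                continue  # a single page cannot form a circular pattern
            internal = sum(1 for u, v in G.edges() if u in scc and v in scc)
            external = sum(1 for u, v in G.edges() if u not in scc and v in scc)
            weights[frozenset(scc)] = external / (external + internal)
        return weights

    # Made-up example: pages a, b, c link in a circle; one outside page x links in.
    edges = [("a", "b"), ("b", "c"), ("c", "a"), ("x", "a")]
    print(internal_link_weights(edges))
    # {frozenset({'a', 'b', 'c'}): 0.25} -> the 3 internal links count for little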

Casey_Eno




msg:214302
 10:45 pm on Jan 27, 2004 (gmt 0)

Thank you bird :)

I've been seeing a lot of these niche neighborhoods doing very well. In fact, a couple of the instances that provoked my post appear to be operated by the same individual, e.g. one person operating 6 sites, all interlinked with one another from every single page.

All six sites are very, very well represented in the SERPs, with dozens of #1s for highly competitive terms.

This isn't about singling out those sites; it's about bringing attention to the issue of "linking schemes."

Again thanks for understanding the view.

IITian




msg:214303
 12:04 am on Jan 28, 2004 (gmt 0)

It is likely that Google is not penalizing excessive links now, but what about the future? Consider it similar to kw density. Why is 70% density considered too high and 0.1% too low? It all boils down to whether it is normal or not, i.e. what the probability is of an organic page on a given kw topic having a kw density of x. If that probability is too low, a red flag is raised, especially when the kw density is on the high side. That probability is determined by statistics.
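
A tiny sketch of that probability argument, assuming the kw densities of organic pages on a topic roughly follow a normal distribution; the 7% mean, 4% standard deviation, and 0.1% tail cutoff are made-up numbers.

    from statistics import NormalDist

    # Made-up distribution of kw density for organic pages on some topic.
    organic = NormalDist(mu=0.07, sigma=0.04)

    def red_flag(density, threshold=0.001):
        """Flag a page if the chance of an organic page having a density at
        least this high is below the threshold (one-sided test, high end)."""
        tail = 1 - organic.cdf(density)
        return tail < threshold

    print(red_flag(0.70))  # True  - 70% density is wildly improbable
    print(red_flag(0.08))  # False - 8% is perfectly ordinary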

One mistake we make is thinking that we are free to make mistakes and that, if things don't turn out well, we can always correct them. Unfortunately this is not true in many SEO cases, first because it is quite difficult to undo certain things (we can't ask everyone to remove their links to us) and second, perhaps more importantly, because the history is recorded and can be used against us whenever Google has the resources to put on it.

In another thread I gave the example of people renaming their links pages from links.* to something entirely different. Once Google has the data-mining capabilities, it is going to mark those sites forever, whereas the sites that didn't rename their links pages - mostly the non-SEO ones, like .gov and .edu sites - are going to be marked as non-SEO and in due time will be rewarded.

Of course everything is hypothetical and speculative here, but given the competition and the main impediment to good search - SPAM - I won't be surprised if sophisticated algorithms are implemented within a couple of years. (Currently, I think their algorithm is quite primitive and has really caused lots of uncertainty for many.)

No amount of clothing is going to "cover" the fact that Google has seen us naked.

allanp73




msg:214304
 12:17 am on Jan 28, 2004 (gmt 0)

Google is obviously increasing the coverage of its filters.
In the real estate field it used to be only "city real estate" and "city real estate + any term" that were filtered; with Austin this has expanded to "city homes" and "city homes + any term".
Other terms are not being filtered, and SEO there is the same old same old. I am not too worried. I figure Google will just go overboard with filters and users will start using MSN.
For those who wonder how they know which sites to filter: it's actually fairly easy. Directories and commercial sites look very different. Directories generally link out to many relevant sites, whereas commercial sites generally avoid linking heavily to direct competitors. Outgoing links in a directory are spread throughout the site, whereas commercial sites place them predominantly on a links page. The new ranking system seems to be based more on the quality of a directory's links than on the quality of its content. Annoyingly, this means a lot of the top sites are just directories where my sites are listed. We have seen the end of "content is king"; it has been replaced by "linking to content is king."
I imagine the future is bleak for those who do not adapt and switch to a directory-style site. I have tested the theory and recovered rankings. Personally, I feel it discourages developing content.
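
A rough sketch of that directory-vs-commercial distinction; the looks_like_directory function, the 50% concentration cutoff, and the sample numbers are all made up, purely to illustrate the "spread throughout vs. piled on one links page" idea.

    def looks_like_directory(outlinks_per_page, cutoff=0.5):
        """outlinks_per_page: external outgoing-link counts, one per page.
        If one page holds most of the site's outgoing links, it looks like a
        commercial site with a links page; if the links are spread across
        many pages, it looks more like a directory."""
        total = sum(outlinks_per_page)
        if total == 0:
            return False
        return max(outlinks_per_page) / total < cutoff

    # Made-up examples:
    print(looks_like_directory([30, 25, 40, 35, 20]))  # True  - links spread out
    print(looks_like_directory([2, 1, 0, 120, 3]))     # False - one big links page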
