
Google Algorithm - What are the 200 Variables?

     
12:54 pm on Nov 23, 2009 (gmt 0)

10+ Year Member



At PubCon, Matt Cutts mentioned that there were over 200 variables in the Google Algorithm.

I thought I’d start a list... (a toy scoring sketch follows the list)

Domain
- Age of Domain
- History of domain
- KWs in domain name
- Sub domain or root domain?
- TLD of Domain
- IP address of domain
- Location of IP address / Server

Architecture
- HTML structure
- Use of Headers tags
- URL path
- Use of external CSS / JS files

Content
- Keyword density of page
- Keyword in Title Tag
- Keyword in Meta Description (Not Meta Keywords)
- Keyword in header tags (H1, H2, etc.)
- Keyword in body text
- Freshness of Content

Per Inbound Link
- Quality of website linking in
- Quality of web page linking in
- Age of website
- Age of web page
- Relevancy of page’s content
- Location of link (Footer, Navigation, Body text)
- Anchor text of link
- Title attribute of link
- Alt text of linking images
- Country specific TLD domain
- Authority TLD (.edu, .gov)
- Location of server
- Authority Link (CNN, BBC, etc)

Cluster of Links
- Uniqueness of Class C address.

Internal Cross Linking
- Number of internal links to page
- Location of link on page
- Anchor text of FIRST text link (Bruce Clay’s point at PubCon)

Penalties
- Over Optimisation
- Purchasing Links
- Selling Links
- Comment Spamming
- Cloaking
- Hidden Text
- Duplicate Content
- Keyword stuffing
- Manual penalties
- Sandbox effect (Probably the same as age of domain)

Miscellaneous
- JavaScript Links
- No Follow Links

Pending
- Performance / Load of a website
- Speed of JS

Misconceptions
- XML Sitemap (Aids the crawler but doesn’t help rankings)
- PageRank (General Indicator of page’s performance)
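
To make the combining of factors above concrete, here's a toy sketch. Every factor name, weight, and the linear form itself are invented for illustration - nobody outside Google knows the real function:

```python
# Purely illustrative: a handful of the factors listed above combined
# into one score. Factor names, weights, and the linear form are all
# hypothetical - the real function and weights are unknown.

FACTOR_WEIGHTS = {
    "domain_age": 0.05,
    "keyword_in_title": 0.15,
    "inbound_link_quality": 0.40,
    "content_freshness": 0.10,
    "internal_link_count": 0.05,
    "anchor_text_relevance": 0.25,
}

def toy_score(page_factors):
    """page_factors maps factor name -> value normalized to [0, 1]."""
    return sum(weight * page_factors.get(name, 0.0)
               for name, weight in FACTOR_WEIGHTS.items())

example = {"domain_age": 0.8, "keyword_in_title": 1.0,
           "inbound_link_quality": 0.6, "content_freshness": 0.3}
print(round(toy_score(example), 3))  # 0.46
```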

4:05 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I wonder how many of the factors we can actually do something about. I also wonder if the 80:20 rule applies. If these two things are combined, there are probably 20 or 30 factors that are worth trying to do something about.

Cheers

Sid

4:08 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Have seen the same

Maybe I'm misunderstanding...

In the "Penalties for the domain" list, I'd add CSS processing (excuse my english) for "display:none" CSS

I understood the preceding to mean using display:none via CSS is a penalty, and I know it's not, because I've seen it used on well-ranking websites and have used it myself. Now you're saying you've seen the same thing, which means setting the display to none with CSS is not a penalty.

A factor in rankings at all?
Possibly.

4:14 pm on Nov 24, 2009 (gmt 0)

10+ Year Member



now you're saying you've seen the same thing, which means setting the display to none with CSS is not a penalty.

I am saying it's a penalty if it is done to hide textual elements or other important "ranking" elements. And Google won't catch everyone who deploys this technique.

4:47 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There are a few DHTML/JavaScript effects that rely on manipulating display. It would be a shame if these were outlawed by some algorithmically applied penalty.

Cheers

Sid

5:10 pm on Nov 24, 2009 (gmt 0)

10+ Year Member



I have to chime in again. Currently I don't see evidence of keyword in H tags as a ranking factor. Those days are behind us, I think - probably too much abuse. It's still a good idea, but the ranking magic seems to be gone.

Is it worth a lot when everything is taken into account? No, tedster... I agree.

The question is, "Is it worth something?"

If we limit the factors to strictly on-page factors, I still think they have meaningful value. Title first, headers second. Most troublesome are heading tags that are not related to the page topic.

"On-page Factors" may be worth it's own thread.

6:17 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Administrator ergophobe is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



.edu links have been so heavily gamed and spammed and bought from student accounts, I don't think the TLD itself has much value.

I have the ability to build as many pages as I want on a .edu (two different .edu domains in fact, and one at a supercomputing center).

I could never use these for unapproved purposes without violating all sorts of agreements, but for perfectly legitimate reasons I of course have content on pages there.

Often, buried as they are within a giant multi-headed hydra domain, I have trouble even getting them indexed, let alone ranking for very specific phrases on the pages themselves. With the thousands, possibly millions of pages on that domain, I have trouble even bringing it up in a site: search for very specific terms (and I'm talking simple, unstyled, 1995-era basic HTML pages).

So I tend to believe Matt Cutts is right. Most of the magic of .edu has to do with acquired authority. I don't doubt that a link from the home page would be valuable, but it has a PR of 8, 267K links in Majestic, 2 million links to the domain in Yahoo and 676 listings in the Yahoo directory.

7:53 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member bwnbwn is a WebmasterWorld Top Contributor of All Time 5+ Year Member



One of the 200 should be a manual review score. Most if not all websites on the first page probably get some type of manual review and are scored. How often, who knows; depending on the number of terms a site ranks for, it could get many manual reviews.

I mean the 1st page for competitive one-word or 2-3 word terms; I don't feel the long tails get much attention from manual reviews.

8:09 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Administrator buckworks is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Link Rot vs All Links Functional

While the algo seems to be merciful about occasional broken links, there must certainly be a threshold where too much link rot would start to be weighted as a negative "signal of quality" against the site.
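
A minimal sketch of that threshold idea, assuming a made-up 10% tolerance (the real cutoff, if one exists, is unknown):

```python
# Sketch of the threshold idea: occasional broken links are forgiven,
# but past some tolerance the ratio turns into a negative signal.
# The 10% tolerance is invented for illustration.

def link_rot_signal(broken_links, total_links, tolerance=0.10):
    if total_links == 0:
        return 0.0
    ratio = broken_links / total_links
    return 0.0 if ratio <= tolerance else -(ratio - tolerance)

print(link_rot_signal(2, 100))   # 0.0, occasional rot is forgiven
print(link_rot_signal(30, 100))  # -0.2, heavy rot weighs against the site
```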

8:14 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Speaking of links...

Churn, age, and 'freshness' (which is slightly different from overall age) don't seem to have been mentioned yet...

[edited by: TheMadScientist at 8:22 pm (utc) on Nov. 24, 2009]

8:22 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From tedster above:
>>I have to chime in again. Currently I don't see evidence of keyword in H tags as a ranking factor. <<

I'm near certain I heard Bruce Clay say this same thing at PubCon.

8:25 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Administrator ergophobe is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



>>Most if not all websites on the first page probably get some type of manual review

But bwnbwn, the first page of what? Given that a large percentage of searches have never been seen before, I can't see this being true.

First page of results for "buy laptop" maybe, first page of results for "Farel exile 1538", I don't think so (and that's not one of the obscure searches I would do, that would be a "major" search term).

8:40 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



first page of what?

First page of where more like. In the quaint oldy worldy countries across the water we don't add up to enough bucks for anyone to take any notice unless you get reported. Then they have a look to see if they can find anything to apply inside the real world of the 51 states.

9:00 pm on Nov 24, 2009 (gmt 0)

5+ Year Member



Where the USA has 50 states and the rest of the world is the 51st state? ; )

9:16 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member bwnbwn is a WebmasterWorld Top Contributor of All Time 5+ Year Member



ergophobe, true about:

Given that a large percentage of searches have never been seen before

I would consider these long-tail searches and, as I said, not part of the manual review process. But I know without a doubt Google has stats on repeated searches, and these are most likely the ones that kick in manual reviews of sites on the 1st page.

9:39 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Uh, human editorial input may be part of the results, but...

This thread is about Google's Algorithm, which by definition is a mathematical equation and rules out 'human review', so whether some searches are reviewed by hand or not is really a different discussion, IMO.

10:01 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member bwnbwn is a WebmasterWorld Top Contributor of All Time 5+ Year Member



TheMadScientist, it is all part of the algo, bro, and figured into the algo.

Human reviews have been a part of Google for almost 5 years now; I find it hard to believe they aren't added into the algo. IMO

10:16 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Well, bro, judging from the following Matt Cutts quote, I would say there is a distinction between human input and the algo. But feel free to think as you like, regardless of what an algo (which, as used by Google, is technically a heuristic) actually is...

You do know the difference between an algorithm and a heuristic, and why it's an important distinction to draw when attempting to understand search results, right?

But in our group, we vastly rely on algorithms. We try to write new techniques and algorithms. But if someone writes in and says I typed in “Rob Hof” and got #*$!, they’re really unhappy if the reply is well, we think we’ll have a new algorithm to deal with that in about six to nine months, so check back and the #*$! may be gone maybe by the end of the year. So we’ll take action. Even then, we try to do it in a scalable way.

Source Linked Here:
[webmasterworld.com...]

They may incorporate the input from human review into the heuristics they use in the future, but I think it's a stretch to say human input becomes part of the mathematical equation, on-the-fly. It may be used to remove a site, but not score a site initially... To me there's a difference.

<ot>
Let me try to be more clear: I think the underlying algo has 200+ variables, and there is also the manual ability to 'filter' the results of the 'algo' ('algo' is used for convenience, although that's not what it really is) in certain areas we have absolutely no control over. A 'filter' can be used to make an on-the-fly adjustment and then be incorporated into the underlying mechanism at a later date, which could (would) explain how some sites are 'unfairly filtered' and then 'pop' back into the results with no significant changes... (rough sketch below)

Those sites did not do anything out of accordance with the 'algo' or 'underlying mechanism', but were rather hit by a 'broad filter' to 'clean up' a set of results for a period of time, until the underlying mechanism is adjusted to interpret all scoring differently. I draw a distinction between the two where maybe you don't... In my mind a filter applied manually would only apply to a segment of the searches (sites, results), where the algo or underlying mechanism applies to all searches (sites, results)... There's a big difference in my mind.
</ot>
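
If it helps, here's a rough sketch of the distinction, with hypothetical site and query names - the underlying algo scores everything, while a manual filter touches only a segment of searches:

```python
# Hypothetical sketch of the algo-vs-filter distinction: the underlying
# algo scores every result, while a manual "filter" applies only to a
# segment of searches and can be lifted without the site changing.
# All names and values here are made up.

FILTERED_QUERY_SEGMENTS = {"payday loans", "cheap meds"}
MANUALLY_FILTERED_SITES = {"example-spam.com"}

def rank_results(query, scored_results):
    """scored_results: list of (site, score) pairs from the underlying algo."""
    if query in FILTERED_QUERY_SEGMENTS:
        # Broad, on-the-fly filter applied to this segment only.
        scored_results = [(site, score) for site, score in scored_results
                          if site not in MANUALLY_FILTERED_SITES]
    return sorted(scored_results, key=lambda pair: pair[1], reverse=True)

results = [("example-spam.com", 0.94), ("good-site.example", 0.90)]
print(rank_results("payday loans", results))  # spam site filtered out
print(rank_results("blue widgets", results))  # untouched by the filter
```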

There I go again tedster... I don't know what it is about this forum, but I do purposely stay out of the 'new to web development' forum, if that means anything :) lol.

10:59 pm on Nov 24, 2009 (gmt 0)

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Let me get back on-topic:

Uniqueness of Content

12:24 am on Nov 25, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I think it's a stretch to say human input becomes part of the mathematical equation, on-the-fly

reference:

For each web page/site identified as favored and non-favored, the editors may determine an editorial opinion parameter for that site... For each web page in the result set that is associated with one of the web sites in the set of affected web sites, the server may determine an updated score using an editorial opinion parameter for that web site.

US Patent 7096214 [patft.uspto.gov]

Not to split hairs too much here - but after human raters establish an editorial parameter, then it can become either a plus or minus factor in the ranking calculation. And the human raters are simply given a set of results to rate - not necessarily a currently live set. They may well be evaluating an experimental algo tweak before it goes live, right?

So whether you call this factor part of an algorithm or a heuristic, it's very much a factor, and Google is still hiring these folks.
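
One way to read that patent language, sketched as code - the multiplier form and the values are my guesses, not the patent's actual math:

```python
# A guess at how the patent's "editorial opinion parameter" might fold
# into scoring: favored sites get a boost, non-favored a demotion.
# The multiplier form and the numbers are assumptions, not the patent's.

EDITORIAL_OPINION = {
    "trusted-news.example": 1.2,    # rated favored
    "thin-affiliate.example": 0.7,  # rated non-favored
}

def updated_score(site, base_score):
    # Unrated sites pass through with a neutral 1.0.
    return base_score * EDITORIAL_OPINION.get(site, 1.0)

print(updated_score("trusted-news.example", 50))  # 60.0
print(updated_score("unrated-site.example", 50))  # 50.0
```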

Most troublesome are heading tags that are not related to the page topic.

That's one reason Google can't depend on them. <h1>Hurry, these deals won't last</h1> and the like - I see it all the time.

12:42 am on Nov 25, 2009 (gmt 0)

WebmasterWorld Senior Member themadscientist is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



Not to split hairs too much here - but after human raters establish an editorial parameter, then it can become either a plus or minus factor in the ranking calculation. And the human raters are simply given a set of results to rate - not necessarily a currently live set. They may well be evaluating an experimental algo tweak before it goes live, right?

Son of a Motherless Goat... I think you might have me!

It doesn't happen very often, and Cutts' comments seem to suggest the 'human factor' doesn't make an immediate change in the algo (underlying mechanism), but maybe other segments of the overall algo have a 'human factor' that does?

@ bwnbwn: My apologies... I didn't think humans had an immediate effect on the overall ranking of sites, but it appears they might.

1:20 am on Nov 25, 2009 (gmt 0)

WebmasterWorld Senior Member bwnbwn is a WebmasterWorld Top Contributor of All Time 5+ Year Member



TheMadScientist, no problem, I didn't take it personally. That is why this board is so good: great posts bring out great information for all to use. It is good to have opinions that don't always see eye to eye, so we can all learn.

I remember the post [webmasterworld.com...] from 3 years ago about this subject.

2:23 am on Nov 25, 2009 (gmt 0)

WebmasterWorld Administrator robert_charlton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Excuse my ignorance, but isn't this discussion about PageRank and the factors that determine it? And I don't mean "toolbar PageRank" (a factor I pay zero attention to), but the algorithm Google uses to rank a page in the SERPs.

No. PageRank is a measure of relative link "citation importance" that's used by Google. The name can be confusing. PageRank is not the same as the rank or ranking of a page in the serps, and it is not the algorithm. You might want to call it a sub-algorithm, and it's one of the 200 factors we're discussing that make up the overall ranking algorithm.

While it was once the core measure of inbound link weight, it's now one of many likely inbound linking factors that Google uses to assess the quality and importance of inbound links. Other linking measures involve trust, anchor text influence and other relevance factors, IP relationships, link age, and link quality as might be assigned by page/site history, etc. PageRank itself is query independent.

Here's one of the original PageRank papers, written by Sergey Brin and Larry Page, the founders of Google. It's said that the "Page" in "PageRank" refers to Larry, not to the page that PageRank numerically scores....

The Anatomy of a Large-Scale Hypertextual Web Search Engine
[infolab.stanford.edu...]
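
For anyone curious, the core idea from that paper fits in a few lines. A toy power-iteration sketch (heavily simplified - production PageRank handles scale, dangling links, and much else far more carefully):

```python
# Toy PageRank via power iteration - a sketch of the idea in the
# Brin/Page paper linked above, not Google's production implementation.
# graph maps each node to the list of nodes it links to.

def pagerank(graph, damping=0.85, iterations=50):
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, outlinks in graph.items():
            if outlinks:
                share = damping * rank[node] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank

links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
print(pagerank(links))  # C ranks highest: it collects "votes" from A and B
```

Note that, as robert_charlton says, nothing in this calculation looks at the query - PageRank is query independent.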

3:46 am on Nov 25, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Most if not all websites on the first page probably get some type of manual review and scored

That would be impossible due to so many esoteric queries. However, I would think that every day it is pretty easy for Google to determine what the top searches were for the previous day. So it would not surprise me at all if they have a team of, for example, 100 people who each examine 10 first pages of results. That would cover the top 1,000 searches @ 10 results/page = 10,000 sites (a load of 100 sites per examiner in my example).
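
Spelling that capacity math out:

```python
# The capacity math from the paragraph above, spelled out.
reviewers = 100
first_pages_each = 10        # first pages of results examined per reviewer
results_per_page = 10

searches_covered = reviewers * first_pages_each       # 1,000 top searches
sites_reviewed = searches_covered * results_per_page  # 10,000 sites
sites_per_reviewer = sites_reviewed // reviewers      # 100 sites each

print(searches_covered, sites_reviewed, sites_per_reviewer)  # 1000 10000 100
```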

Many of those top searches are the same from day to day and many of the sites listed in the SERPs are also the same, but new ones will show up, so if there's anything to this theory, those new ones would get an immediate manual review (again, only when they show up on the "top searches" list).

Thus, manual review could be a factor in the most competitive SERPs, but I would think it would rarely if ever play into the ultra niche stuff.


1:28 pm on Nov 25, 2009 (gmt 0)

10+ Year Member



Here's one of the original PageRank papers, written by Sergey Brin and Larry Page, the founders of Google. It's said that the "Page" in "PageRank" refers to Larry, not to the page that PageRank numerically scores....

You learn something new every day.

3:34 pm on Nov 25, 2009 (gmt 0)

10+ Year Member



Excuse my ignorance, but isn't this discussion about PageRank and the factors that determine it? And I don't mean "toolbar PageRank" (a factor I pay zero attention to), but the algorithm Google uses to rank a page in the SERPs.

PageRank is not what we are talking about here. PageRank only refers to the relative importance of a page based on the links coming to it.

It is one of the weightier of the 200 or so factors.

This thread is about the 200 factors that determine where a page shows up in the SERPs relative to a keyword string.

5:19 pm on Nov 25, 2009 (gmt 0)

WebmasterWorld Senior Member bwnbwn is a WebmasterWorld Top Contributor of All Time 5+ Year Member



An interesting post by an SEO company just last month gives a little more insight into my observations:

The post mentioned recent Google comments that ranking is only "largely" determined by algorithms and that human reviewers are involved in the ranking process. This is surprising information for many internet users. According to a Google spokesman, around 10,000 human reviewers are currently working for them.

This has been going on for several years now, but that number helps with the math on how many sites (search terms) can be reviewed in a day.

I myself figure the human process has gained more weight in the algo as Google has refined the equation and ranking factors.

[edited by: tedster at 7:56 pm (utc) on Nov. 25, 2009]
[edit reason] I paraphrased the quote [/edit]

5:30 pm on Nov 25, 2009 (gmt 0)

WebmasterWorld Administrator ergophobe is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



bwnbwn and tedster - interesting. Not that they could be considered algo factors, but I wonder what factors other than the number of searches trigger that effect. In other words, I would think the # of spam reports for a given search would indicate a need to hand-check related searches.

Also, do you have a sense of whether the editorial opinion factor can be a positive (i.e. raise you above a site that has not been human-checked) or only negative (penalties applied to low-quality sites)?

5:59 pm on Nov 25, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member



Also, do you have a sense of whether the editorial opinion factor can be a positive

For ultra-competitive SERPs, I'm pretty sure that the top 3-5 are exclusively reviewed; non-reviewed sites are not eligible.

7:38 pm on Nov 25, 2009 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The patent definitely allows for both positive and negative influence from the human raters. The quote I posted above begins "For each web page/site identified as favored and non-favored..."

I've also heard from a trusted source at a conference that no single reviewer has this kind of power. It takes independent agreement among all raters assigned a given SERP, and there can be no contrary assessment before an editorially determined factor gets added for a URL.

I would think # of spam reports for a given search would indicate a need to hand check related searches.

Yes, I'm sure. However, as we've noticed, spam reports rarely result in a quick removal of the URL, even in some very egregious cases. It is usually the case (and Google reps say this too) that the reports are used for improvements to the algo.

9:52 pm on Nov 25, 2009 (gmt 0)

5+ Year Member



So... from the human perspective, the closer this thread gets to the truth about the 200 factors in the Google algorithm, the higher the thread's PageRank should get, right? I mean, they know if all this is nonsense or if it gets warmer and warmer by the hour.

Who then do you think should manually review it? Should it be a half-robot or a half-genius from the human network over there? Haha, what if Matt Cutts has his own "I like" button that's part of the "Larry" PageRank? And when he pushes the button he sighs and thinks, aah, when will they build an intelligent artificial being like myself to push this button for me?

In other words :) I also agree that we can leave aside the humanity of the manual human reviewer army doing the work of not-yet-born robots.
