I thought I’d start a list...
Domain
- Age of Domain
- History of domain
- KWs in domain name
- Subdomain or root domain?
- TLD of Domain
- IP address of domain
- Location of IP address / Server
Architecture
- HTML structure
- Use of header tags
- URL path
- Use of external CSS / JS files
Content
- Keyword density of page (see the density sketch after this list)
- Keyword in Title Tag
- Keyword in Meta Description (Not Meta Keywords)
- Keyword in header tags (H1, H2, etc.)
- Keyword in body text
- Freshness of Content
Per Inbound Link
- Quality of website linking in
- Quality of web page linking in
- Age of website
- Age of web page
- Relevancy of page’s content
- Location of link (Footer, Navigation, Body text)
- Anchor text of link
- Title attribute of link
- Alt attribute of linking images
- Country specific TLD domain
- Authority TLD (.edu, .gov)
- Location of server
- Authority Link (CNN, BBC, etc)
Cluster of Links
- Uniqueness of Class C address (see the Class C sketch after this list)
Internal Cross Linking
- No. of internal links to page
- Location of link on page
- Anchor text of FIRST text link (Bruce Clay’s point at PubCon)
Penalties
- Over Optimisation
- Purchasing Links
- Selling Links
- Comment Spamming
- Cloaking
- Hidden Text
- Duplicate Content
- Keyword stuffing
- Manual penalties
- Sandbox effect (Probably the same as age of domain)
Miscellaneous
- JavaScript Links
- No Follow Links
Pending
- Performance / Load of a website
- Speed of JS
Misconceptions
- XML Sitemap (Aids the crawler but doesn’t help rankings)
- PageRank (General Indicator of page’s performance)
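Since keyword density is the first content factor listed, here's a minimal sketch of how it's usually calculated (phrase occurrences over total words). The tokenization and the function name are my own assumptions for illustration, not anything Google has published:

```python
import re

def keyword_density(text, keyword):
    """Rough keyword density: matched phrase tokens / total words.

    A simplistic illustration only -- real indexers tokenize far
    more carefully (markup stripping, stemming, and so on).
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    kw = keyword.lower().split()
    n = len(kw)
    hits = sum(1 for i in range(len(words) - n + 1)
               if words[i:i + n] == kw)
    return hits * n / len(words)

# A 100-word page using "blue widgets" five times scores 0.10,
# i.e. 10% density.
```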
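On "Uniqueness of Class C address": the idea is that inbound links from IPs sharing the same /24 block look like a single source and may be discounted. A sketch of counting unique Class C blocks among linking IPs (the data and function are hypothetical):

```python
def unique_class_c(ip_addresses):
    """Count distinct Class C (/24) blocks among linking IPs.

    IPs sharing the first three octets likely sit on the same
    network, so their links may be treated as one source.
    """
    return len({ip.rsplit(".", 1)[0] for ip in ip_addresses})

# Three linking IPs, but only two unique /24 blocks:
print(unique_class_c(["208.77.188.166", "208.77.188.20",
                      "93.184.216.34"]))  # -> 2
```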
Have seen the same
Maybe I'm misunderstanding...
In the "Penalties for the domain" list, I'd add CSS processing (excuse my english) for "display:none" CSS
I understood the preceding to mean that using display:none via CSS incurs a penalty, and I know it doesn't: I've seen it used on well-ranking websites and have used it myself. Now you're saying you've seen the same thing, which means setting display to none with CSS is not a penalty.
A factor in rankings at all?
Possibly.
I have to chime in again. Currently I don't see evidence of keyword in H tags as a ranking factor. Those days are behind us, I think - probably too much abuse. It's still a good idea, but the ranking magic seems to be gone.
Is it worth a lot when everything is taken into account? No, tedster... I agree.
The question is, "Is it worth something?"
If we limit the factors to strictly on-page factors, I still think they have meaningful value. Title first, headers second. Most troublesome are heading tags that are not related to the page topic.
"On-page Factors" may be worth it's own thread.
I have the ability to build as many pages as I want on a .edu (two different .edu domains in fact, and one at a supercomputing center).
I could never use these for unapproved purposes without violating all sorts of agreements, but for perfectly legitimate reasons I of course have content on pages there.
Often, buried as they are within a giant multi-headed hydra domain, I have trouble even getting them indexed, let alone ranking for very specific phrases on the pages themselves. With the thousands, possibly millions, of pages on that domain, I have trouble even bringing it up in a site: search for very specific terms (and I'm talking simple, unstyled, 1995-era basic HTML pages).
So I tend to believe Matt Cutts is right. Most of the magic of .edu has to do with acquired authority. I don't doubt that a link from the home page would be valuable: it has a PR of 8, 267K links in Majestic, 2 million links to the domain in Yahoo, and 676 listings in the Yahoo directory.
First page for competitive one-word or two-to-three-word terms. I don't feel the long tails get much attention from manual reviews.
But bwnbwn, the first page of what? Given that a large percentage of searches have never been seen before, I can't see this being true.
First page of results for "buy laptop" maybe, first page of results for "Farel exile 1538", I don't think so (and that's not one of the obscure searches I would do, that would be a "major" search term).
Given that a large percentage of searches have never been seen before
I would consider these long-tail searches and, as I said, not part of the manual review process.
You do know the difference between an algorithm and a heuristic, and why it's an important distinction to draw when attempting to understand search results, right?
But in our group, we vastly rely on algorithms. We try to write new techniques and algorithms. But if someone writes in and says I typed in “Rob Hof” and got #*$!, they’re really unhappy if the reply is well, we think we’ll have a new algorithm to deal with that in about six to nine months, so check back and the #*$! may be gone maybe by the end of the year. So we’ll take action. Even then, we try to do it in a scalable way.
Source Linked Here:
[webmasterworld.com...]
They may incorporate the input from human review into the heuristics they use in the future, but I think it's a stretch to say human input becomes part of the mathematical equation, on-the-fly. It may be used to remove a site, but not score a site initially... To me there's a difference.
<ot>
Let me try to be more clear: I think the underlying algo has 200+ variables, and there is the manual ability to 'filter' the results of the 'algo' (algo is used for convenience, although that's not what it really is) for certain areas we have absolutely no control over. A 'filter' can be used to make an on-the-fly adjustment and then be incorporated into the underlying mechanism at a later date, which could (would) explain how some sites are 'unfairly filtered' and then 'pop' back into the results with no significant changes...
They did not do anything out of accordance with the 'algo' or 'underlying mechanism', but were rather hit by a 'broad filter' to 'clean up' a set of results for a period of time, until the underlying mechanism is adjusted to interpret all scoring differently. I draw a distinction between the two where maybe you don't... In my mind, a filter applied manually would only apply to a segment of the searches (sites, results), where the algo or underlying mechanism applies to all searches (sites, results)... There's a big difference in my mind.
</ot>
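A conceptual sketch of the distinction drawn above (the names and structure are invented for illustration, not Google's): the underlying scoring applies to every result, while a manual filter demotes only a flagged segment and can be lifted without any score changing:

```python
def rank(results, score, filtered=frozenset()):
    """Conceptual only: the scoring ('algo') applies to ALL
    results; `filtered` is a manual, temporary overlay that
    demotes just a flagged segment of sites.
    """
    ranked = sorted(results, key=score, reverse=True)
    return ([r for r in ranked if r not in filtered] +
            [r for r in ranked if r in filtered])

# A site in `filtered` 'pops back' to its scored position the
# moment the flag is lifted, with no change to its actual score.
```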
There I go again tedster... I don't know what it is about this forum, but I do purposely stay out of the 'new to web development' forum, if that means anything :) lol.
I think it's a stretch to say human input becomes part of the mathematical equation, on-the-fly
reference:
For each web page/site identified as favored and non-favored, the editors may determine an editorial opinion parameter for that site... For each web page in the result set that is associated with one of the web sites in the set of affected web sites, the server may determine an updated score using an editorial opinion parameter for that web site.
US Patent 7096214 [patft.uspto.gov]
Not to split hairs too much here - but after human raters establish an editorial parameter, then it can become either a plus or minus factor in the ranking calculation. And the human raters are simply given a set of results to rate - not necessarily a currently live set. They may well be evaluating an experimental algo tweak before it goes live, right?
So call this factor an algorithm or a heuristic, it's very much a factor and Google is still hiring these folks.
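Reading the patent's wording literally, the update step could be as simple as the sketch below. This is my own guess at a concrete form; the multiplicative shape and the names are assumptions, since US Patent 7096214 doesn't commit to a formula this simple:

```python
def updated_score(base_score, editorial_opinion=1.0):
    """Fold a rater's 'editorial opinion parameter' into an
    algorithmic score, per the wording of US Patent 7096214.

    > 1.0 boosts a favored site, < 1.0 demotes a non-favored
    one, and 1.0 leaves unrated sites untouched. The
    multiplicative form is an assumption.
    """
    return base_score * editorial_opinion

# Hypothetical: a favored site (rated 1.3) overtakes an unrated
# competitor that scored slightly higher algorithmically.
print(updated_score(0.72, 1.3))  # 0.936
print(updated_score(0.80))       # 0.8
```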
Most troublesome are heading tags that are not related to the page topic.
That's one reason Google can't depend on them. <h1>Hurry, these deals won't last</h1> and the like - I see it all the time.
Not to split hairs too much here - but after human raters establish an editorial parameter, then it can become either a plus or minus factor in the ranking calculation. And the human raters are simply given a set of results to rate - not necessarily a currently live set. They may well be evaluating an experimental algo tweak before it goes live, right?
Son of a Motherless Goat... I think you might have me!
It doesn't happen very often, and Cutts' comments seem to suggest the 'human factor' doesn't make an immediate change in the algo (underlying mechanism), but maybe other segments of the overall algo have a 'human factor' that does?
@ bwnbwn: My apologies... I didn't think humans had an immediate effect on the overall ranking of sites, but it appears they might.
Excuse my ignorance, but isn't this discussion about PageRank and the factors that determine it? And I don't mean "toolbar PageRank" (a factor I pay zero attention to), but the algorithm Google uses to rank a page in the SERPs.
No. PageRank is a measure of relative link "citation importance" that's used by Google. The name can be confusing. PageRank is not the same as the rank or ranking of a page in the serps, and it is not the algorithm. You might want to call it a sub-algorithm, and it's one of the 200 factors we're discussing that make up the overall ranking algorithm.
While it was once the core measure of inbound link weight, it's now one of many likely inbound linking factors that Google uses to assess the quality and importance of inbound links. Other linking measures involve trust, anchor text influence and other relevance factors, IP relationships, link age, and link quality as might be assigned by page/site history, etc. PageRank itself is query independent.
Here's one of the original PageRank papers, written by Sergey Brin and Larry Page, the founders of Google. It's said that the "Page" in "PageRank" refers to Larry, not to the page that PageRank numerically scores...
The Anatomy of a Large-Scale Hypertextual Web Search Engine
[infolab.stanford.edu...]
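For reference, the formula from that paper is PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)), where d is the damping factor and C(T) is the number of outlinks on page T. A minimal power-iteration sketch (the three-page graph is made up):

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T)/C(T)) over the
    pages T linking to A, as in the Brin/Page paper.
    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {p for outs in links.values() for p in outs}
    pr = dict.fromkeys(pages, 1.0)
    for _ in range(iterations):
        pr = {page: (1 - d) + d * sum(
                  pr[src] / len(outs)
                  for src, outs in links.items() if page in outs)
              for page in pages}
    return pr

# Made-up web: A and B both link to C; C links back to A.
print(pagerank({"A": ["C"], "B": ["C"], "C": ["A"]}))
```

Note that this score is query independent, as mentioned above: it's computed from the link graph alone, before any search is run.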
Most if not all websites on the first page probably get some type of manual review and scoring
Many of those top searches are the same from day to day and many of the sites listed in the SERPs are also the same, but new ones will show up, so if there's anything to this theory, those new ones would get an immediate manual review (again, only when they show up on the "top searches" list).
Thus, manual review could be a factor in the most competitive SERPs, but I would think it would rarely if ever play into the ultra niche stuff.
Excuse my ignorance, but isn't this discussion about PageRank and the factors that determine it? And I don't mean "toolbar PageRank" (a factor I pay zero attention to), but the algorithm Google uses to rank a page in the SERPs.
PageRank is not what we are talking about here. PageRank only refers to the relative importance of a page based on the links coming to it.
It is one of the weightier of the 200 or so factors.
This thread is about the 200 factors that determine where a page shows up in the SERPs relative to a keyword string.
The post mentioned recent Google comments that ranking is only "largely" determined by algorithms, and that human reviewers are involved in the ranking process. This is surprising information for many internet users. According to a Google spokesman, around 10,000 human reviewers are currently working for them.
This has been going on for several years now, but that figure helps with the math on how many sites (search terms) can be reviewed in a day.
I myself figure the human process has gained more weight in the algo as Google has refined the equation and ranking factors.
[edited by: tedster at 7:56 pm (utc) on Nov. 25, 2009]
[edit reason] I paraphrased the quote [/edit]
Also, do you have a sense of whether the editorial opinion factor can be a positive (i.e. raise you above a site that has not been human checked), or is it only negative (penalties applied to low-quality sites)?
I've also heard from a trusted source at a conference that no single reviewer has this kind of power. It takes independent agreement among all who are assigned a given SERP, with no contrary assessment, before an editorially determined factor gets added for a URL.
I would think # of spam reports for a given search would indicate a need to hand check related searches.
Yes, I'm sure. However, as we've noticed, spam reports rarely result in a quick removal of the URL, even in some very egregious cases. It is usually the case (and Google reps say this too) that the reports are used for improvements to the algo.
Who then do you think should manually review it? Should it be a half-robot or a half-genius from the human network over there? Haha, what if Matt Cutts has his own "I like" button that's part of the "larry" PageRank? And when he pushes the button he sighs and thinks, aah, when will they build an intelligent artificial being like myself to push this button for me?
In other words :), I also agree that we can leave aside the humanity of the manual human reviewer army doing the work of not-yet-born robots.