I thought I’d start a list...
- Age of Domain
- History of domain
- KWs in domain name
- Sub domain or root domain?
- TLD of Domain
- IP address of domain
- Location of IP address / Server
- HTML structure
- Use of header tags
- URL path
- Use of external CSS / JS files
- Keyword density of page
- Keyword in Title Tag
- Keyword in Meta Description (Not Meta Keywords)
- Keyword in header tags (H1, H2, etc.)
- Keyword in body text
- Freshness of Content
Per Inbound Link
- Quality of website linking in
- Quality of web page linking in
- Age of website
- Age of web page
- Relevancy of page’s content
- Location of link (Footer, Navigation, Body text)
- Anchor text of link
- Title attribute of link
- Alt tag of images linking
- Country specific TLD domain
- Authority TLD (.edu, .gov)
- Location of server
- Authority Link (CNN, BBC, etc)
Cluster of Links
- Uniqueness of Class C address.
Internal Cross Linking
- No of internal links to page
- Location of link on page
- Anchor text of FIRST text link (Bruce Clay’s point at PubCon)
- Over Optimisation
- Purchasing Links
- Selling Links
- Comment Spamming
- Hidden Text
- Duplicate Content
- Keyword stuffing
- Manual penalties
- Sandbox effect (Probably the same as age of domain)
- No Follow Links
- Performance / Load of a website
- Speed of JS
- XML Sitemap (Aids the crawler but doesn’t help rankings)
- PageRank (General Indicator of page’s performance)
Welcome to WebmasterWorld!
The human editorial portion of the rankings really doesn't have anything to do with PageRank. PageRank is about links and the human editorial input is basically a + or - regarding the general usefulness of a result in a set of results.
I knew this was going on (and I'm actually glad, because IMO a hand review could easily benefit me and my sites). The discussion we were having RE human input was whether it's part of the ranking algo directly, or another dimension of the overall results with a less immediate impact than the underlying mechanism. It appears it might have a more immediate impact than I originally thought.
As far as Matt and his button, his actually says 'Spam', while Amit Singhal has the 'Like' or 'Not Spam' button in his office... Personally, I believe they have button wars which are ultimately settled over a game of asteroids, but this is my opinion only and might be a slightly controversial view of how rankings are actually determined by Google. Most of the other people I've talked to think they play interoffice, team Battleship to keep the SERPs more fair and balanced...
The one voting point was split among all the links (whether on-domain or off-domain). The voting point was retained, and to it was added all the other part-points from all inbound links. Each page now had 1+X voting points.
The process was repeated, but now some pages had more (possibly much more) voting power than others.
More iterations occurred, until the sum of all the "points" on all the "pages" totalled the pre-defined limit.
Some dampening occurs, with less voting power arriving from a link than left the voting page. Indeed, the dampening and the upper limit are mathematically connected. Internal links are dampened more than external links.
This is not quite an accurate depiction of PageRank, but it is sufficient for my point, which is
PageRank does not, will not, never has, never will have anything to do with the content of a page
It does not speak of
- Correctness/Truth ("are we getting closer to getting the 200 factors")
- Authority ("is this subject something the site is renowned for")
- Trust ("is this site a penalty magnet")
- Content value of any kind
- Markup structure in any way
- Load Speed
- Server latency
- Page size in KB
- Keyword Density
- Anchor Text
- ANYTHING OTHER THAN A NUMERICAL REPRESENTATION OF HOW LIKELY A RANDOM SURFER IS TO LAND ON A GIVEN PAGE
Many, many other factors now piggyback on the PR mechanism (seed proximity, semantics, good/bad neighbourhood), but they are not PR. And I'm talking actual PR, not the crappy Toolbar kind.
Apologies for the slight OT, but the PageRank vs ranking factor confusion has been made explicitly at least twice, and I dare say many more times by those who read without posting, or post without commenting on it.
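For anyone who wants to see the voting process above as arithmetic, here is a minimal sketch of the textbook power-iteration form of PageRank. It is an illustration under assumptions, not Google's production code: the function name `pagerank`, the 0.85 damping value, the fixed iteration count and the three-page example graph are all made up for the example. It also uses a single damping factor for every link, so it does not model the "internal links are dampened more" idea.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by link structure alone.

    links maps each page name to the list of pages it links to.
    Returns a dict of page -> score; the scores sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # every page starts with an equal vote

    for _ in range(iterations):
        # The (1 - damping) share models the random surfer jumping anywhere.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's damped vote among all of its outlinks.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its vote over every page.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical three-page site: a links to b and c, b links to c, c links to a.
example = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = pagerank(example)
```

Note that the function only ever sees page names and who links to whom. There is no title, no body text and no anchor text anywhere in the input, which is the whole point being made above.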
I bet a brewery that analyses its beer measures 30 or 40 factors. Bacterial count, carbon dioxide content, age, tannins etc etc. What the customer cares about is that it looks nice with a nice head, tastes good and makes you pi$$ed after a few pints. It is similar with Google. My advice to all you desperate website owners is concentrate on 20 things that you can do something about and do them well. There are loads of free tools out there to help you.
there are probably 20 or 30 that you are both able to affect and are worth the effort.
Absolutely, spot on. I just checked some of my #1 ranked pages, and there are 20-30 key elements I focus upon for every page and then let Google take its course :-)
Of course, as has been started in another thread, there are the things one should not do as well!
Intent behind query ? Informational / Transactional / Navigational ?
Personalization turned on ?
Query IP ?
Keyword position in title ?
Keyword in filename
Image quality/resolution ( image search )
Tags, comments, (video search )
Social signals reinforcement ( RT’s, diggs, etc )
user interaction / satisfaction with results ?
You don't win the prize unless you get all 200 in correct order.... happy!
I have seen authority sites (think BBC, CNN etc) get away with things that would be considered bad for small sites. I believe that TrustRank is an early decision branch which, if it evaluates true, skips some of the factors that concern small sites.
One of the complexities with historical factors is that for certain queries the same factor (e.g. backlink growth) can work as either a negative or a positive.
Other factors not yet mentioned:
1. the number of IMPRESSIONS a domain receives in Google SERPs overall. Especially when that number spikes, it's been mentioned in patents as a possible spam signal.
2. the quality of advertisers that a site runs.
Google test new search page, featuring sidebar [webmasterworld.com]
It doesn't have the text in inbound links.
It doesn't have the text in the title.
It doesn't have the text in a heading.
It doesn't have the text in a link out.
The overall topic of the surrounding pages doesn't have anything to do with one of the subjects it ranks for.
All it really has is 'authority' without good content regarding the subject, without good relevant links on the subject, without a picture or ten on the page. It really has nothing to do with the topic, other than mentioning the words in small text on the page.
How many variables are people saying are probably actually important? I think you could probably narrow it down to 3 judging by the preceding... PageRank, TrustRank, Text on the Page.
...without good content regarding the subject, without good relevant links on the subject, without a picture or ten on the page. It really has nothing to do with the topic...
I think you could probably narrow it down to 3 judging by the preceding... PageRank, TrustRank, Text on the Page.
Pretty much all the factors fall into those three areas - but each one has a lot of detail to it.
The work Google did this past year with their so-called "intention engine" is still a frustration for many. Certain query terms are simply classified as a certain type of user intention. If a website type (according to Google's taxonomies) doesn't match up with that intention, then you can pretty much forget ranking that site on that query.
Some queries seem to have a "diverse" intention attached, and then there's more of a hope. I've been considering this intention engine obstacle more lately. It seems to me that when Google gets user intention right, then maybe that traffic wouldn't really do much good for the excluded types of sites even if they would send it.
It seems to me that when Google gets user intention right, then maybe that traffic wouldn't really do much good for the excluded types of sites even if they would send it.
This part is really interesting, because it might depend on the definition of 'do any good'. For instance: If it's an informational site that includes links to (possibly ads for) products and the 'intention' is regarded as shopping the traffic might very well have done them some good, even if it did not match the 'perceived intent' of the search...
The 'intent factor' might provide better results, but that doesn't really help the webmaster who's used to making a living off a site with a different 'intent' or alternate focus when compared with 'search intent'. I don't think it's 'wrong' of Google to try to get the search results right this way, but do think webmasters should prepare for some changes in traffic patterns and might very well need to make some adjustments if they plan to stay in the race.
Also: I guess what I was trying to say WRT only TrustRank, PR and text on the page is... Throw out the title, h1, page name, URL, and a bunch of other ish as 'super important', because obviously they're not. The original number 1 had a page with exactly the same structure as the 'wrong page' ranking and was in the same directory of the same site and only one click away. The non-ranking page had a better title, URL, page name, text on the page, and overall content for the search, but did not rank.
Edited: Clarification, additions.
About the former number one page:
338 = Lines of CSS. (On Page)
365 = Line number of the <body>
1315 = Lines of Source Code.
7 = Words in the Title. (2 From the Search)
32 = Words in Headings.
164 = Words in plain Body Text.
439 = Words in Links. (Includes URLs)
644 = Total Words on the Page.
(644 Includes Source Code. 196 (or so) Viewable Words)
0 = Keywords, Description, Words from the Search in the Domain Name.
1 = Canonical Tag, <h1>, <iframe>
<h1> = 7 Words. Words from the Search 0.
2 = <h3>s, <h4>s, Google Ad Blocks (Small)
<h3>s = 7 Words Total. Words from the Search 2 of 4.
<h4>s = 4 Words Total. Words from the Search 0.
3 = <strong> (4 unique words; 1 from the 4 word search;)
4 = <p>s
5 = <h2>s (15 Total Words. Words From the Search 0.)
27 = <script>s
91 = <a>s
121 = <div>s
56 = Number of Images.
17.36k = File Size According to FireFox.
2 of 4 = Words from the search in the <title>
2 of 4 = Words from the search anywhere in the URL
2 = Links to the page you really wanted to find RE the search. (2 Out of 91)
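If you want to run the same kind of tally on your own pages, something like the sketch below would get you most of the way. It is only an assumption about how such counts could be produced, not how the numbers above were gathered. The names `PageStats` and `page_stats` and the sample markup are invented for the example, and it uses nothing beyond Python's standard `html.parser`.

```python
from collections import Counter
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
SKIPPED = {"script", "style"}  # contents are code, not viewable words


class PageStats(HTMLParser):
    """Counts opening tags, and words grouped by where they appear."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()    # e.g. tags["a"], tags["div"]
        self.words = Counter()   # keys: title, heading, link, body
        self._open = Counter()   # how many of each context are currently open

    @staticmethod
    def _context(tag):
        if tag in HEADINGS:
            return "heading"
        if tag in SKIPPED:
            return "skipped"
        if tag == "a":
            return "link"
        if tag == "title":
            return "title"
        return None

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1
        ctx = self._context(tag)
        if ctx:
            self._open[ctx] += 1

    def handle_endtag(self, tag):
        ctx = self._context(tag)
        if ctx and self._open[ctx] > 0:
            self._open[ctx] -= 1

    def handle_data(self, data):
        count = len(data.split())
        if count == 0 or self._open["skipped"]:
            return
        # Title wins over heading, heading wins over link, the rest is body.
        for ctx in ("title", "heading", "link"):
            if self._open[ctx]:
                self.words[ctx] += count
                return
        self.words["body"] += count


def page_stats(html):
    """Return (tag counts, word counts by context) for an HTML string."""
    parser = PageStats()
    parser.feed(html)
    parser.close()
    return parser.tags, parser.words


# Made-up sample page, purely for illustration.
sample = (
    "<html><head><title>Blue widget guide</title>"
    "<style>p { color: red }</style></head>"
    "<body><h1>Widgets</h1>"
    "<p>Two words <a href='/x'>Newer Post</a></p>"
    "<img src='a.png'></body></html>"
)
tags, words = page_stats(sample)
```

Here `tags` gives the per-element counts (the `<a>`s, `<div>`s, `<script>`s and so on) and `words` splits the viewable words into title, heading, link and body text. Counting "words from the search" in each bucket would be a small extra step on top.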
About the links you wanted to find:
1 = Picture (1st Link to the correct page. Line 536)
1 = Text: 'Newer Post' (Line 907)
At Least 1 = Number of people who think it does a better job of satisfying the search and is a much better result for the search than the current number 1. (If you happen to own the page (site) this post is about sorry about your ranking. I didn't do it on purpose and think yours is a way better choice than the one it was replaced by. I actually had no idea it happened until someone else pointed it out...)
For some terms you just need to get one or two things right and you will rank well which is why folks who bother to research long tail terms find it so easy to rank for those terms.
In some areas you have to get the mix just right and do a couple of things better than the competition AND hope they don't analyse what you have done and spot it so they can continue the arms race.
The site had a similar page, with the main difference being a keyword which was actually in the search as the focus, making the non-ranking page much more appropriate for the search, yet it was not returned in the results. (IOW: The page ranking had 2 of the 4 keywords in the search, the not ranking page had 3 and the 3rd word changed the entire focus of the page, like the difference between sun and rain, cat and mouse, etc. The different word completely changed the information on the page to what the search was actually about.)
So, in this case you can forget about the #1 and #2 result comparison for a minute and compare the Page in the results to the Page Not in the results to see where the differences are (which must be factors other than the above since the pages were very similar, yet unique in the information they presented) and by doing this, IMO, you can eliminate some of the factors I listed above from the 'super important' list.
If all the factors listed above actually carried significant ranking importance the correct page from the site would have ranked rather than one which simply linked to it with poor anchor text and an image, because the correct page 'had all the right words in all the right places', so to speak...
Like I have said before, you can have a copy of the on-page algo and you're not going to rank very well without good backlinks. You spend a week working on your website and I will spend a week getting links, and I will rank very far ahead of you. Everybody wants to worry about on-page factors because that is easy to do. Link building is hard work.
You really only need to work on the basics. Don't waste a lot of time trying to follow some on-page formula. Build a website and have lots of good content with good titles. Check analytics a lot to come up with new title and content ideas. Then build links until you pass out.