page rank
keyword in title
keyword in H1
keyword in URL
keyword density (total word count being considered)
keyword in links to page (anchor text)
keyword in bold / strong, etc.
keyword in other parts (full text, alt, title, meta description)
keyword proximity (if search for 2+ keywords)
keyword order (whether the order on the page matches the order in the query)
keyword prominence (how early in page/tag)
URL length
more guessing:
==============
absence of competitive keywords other than search term
[edited by: Brett_Tabke at 12:09 am (utc) on Aug. 30, 2002]
the argument about the relationship between file size and keyword density makes a lot of sense. that is what i meant by the bracketed text "total word count being considered".
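To make the file size point concrete, here is a minimal sketch of keyword density as "hits divided by total word count". The function name and the sample pages are made up for illustration, and nothing here is known to match how Google actually counts words.

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Share of the words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w == keyword.lower())
    return hits / len(words)

# Hypothetical pages: same two keyword hits, but the second page is much longer.
short_page = "widgets for sale cheap widgets"
long_page = short_page + " " + "filler " * 20

print(keyword_density(short_page, "widgets"))  # 2 hits in 5 words
print(keyword_density(long_page, "widgets"))   # 2 hits in 25 words
```

The same number of hits gives a much lower density on the bigger page, which is one way file size could appear to matter without being a factor in its own right.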
the fact alone that a page is bigger than another one IMO doesn't affect ranking. otherwise the argument would have to be: google wants to educate us to design smaller pages while sacrificing search quality. as only very few people consider SEO at all, this would not have much effect on file sizes. i interpret brett's statistics as reflecting the distribution of file sizes across the entire web. so i add your theory to the "guesses" ;-)
updated list:
=============
page rank
keyword in title
keyword in H1 and H2
keyword in URL
keyword density (total word count being considered)
keyword in links to page (anchor text)
keyword in bold / strong, etc.
keyword in other parts (full text, alt, title, meta description)
keyword proximity (if search for 2+ keywords)
keyword order (whether the order on the page matches the order in the query)
keyword prominence (how early in page/tag)
URL length
more guessing:
==============
absence of competitive keywords other than search term (muesli)
file size (brotherhood of LAN)
acceleration of LP / relation between LP and age (beachboy, slud)
themes
Definitely. If you search for something fairly competitive then you should see a change. For example "hotel in yourcountry" is different from "hotel yourcountry" is different from "yourcountry hotel".
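As a rough illustration of how keyword order and proximity could be measured for a query like "hotel yourcountry", here is a small sketch. The function names and the sample page are hypothetical, it only looks at the first occurrence of each query word, and it assumes the query words are all different. It is a guess at the kind of check involved, not Google's method.

```python
import re

def tokens(text):
    """Lowercase word tokens of a string."""
    return re.findall(r"[a-z0-9]+", text.lower())

def order_and_proximity(page_text, query):
    """Return (same_order, max_gap) based on the first occurrence of each query word.

    same_order: True if the query words first appear on the page in query order.
    max_gap: largest number of words sitting between neighbouring query words,
             or None if a query word is missing from the page.
    """
    words = tokens(page_text)
    positions = []
    for q in tokens(query):
        if q not in words:
            return False, None
        positions.append(words.index(q))
    ordered = sorted(positions)
    same_order = positions == ordered
    gaps = [b - a - 1 for a, b in zip(ordered, ordered[1:])]
    return same_order, max(gaps, default=0)

page = "cheap hotel in yourcountry near the beach"  # made-up page text
print(order_and_proximity(page, "hotel yourcountry"))
print(order_and_proximity(page, "yourcountry hotel"))
```

For this page the two queries get the same gap but differ on order, which is the sort of difference that could explain why the three searches give different results.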
There's a strong element that I can see, to some extent, but have never tracked down. I guess it's OK to post this example, given that it's Google's page anyway. Everything I know about Google tells me that this page [google.com] should be much higher up in a search for "solutions". Density, title, PageRank and inbound link text (from a bunch of the highest PR pages on the Web) all point to it being ranked very well, yet it's not in the top 150.
It might be an element of old-fashioned link popularity (i.e. how many domains link to you). Maybe it helps to get the right link text from more than one domain. I just don't know.
the search doesn't bring up google's page in the first 15 pages. the top results have everything to a lesser extent than google's page: keyword density, PR, etc. what they DO have is plenty of anchor text, all containing the word "solutions".
so in a later step, when we might add percentages to the list, anchor text should reflect the important role it seems to play.
muesli
And yet Google have a whole bunch of PR9 and even PR10 pages linking to it with that word. So is it a case that more links (or even links from more domains) with matching anchor text help, irrespective of their PageRank?
> So is it a case that more links (or even links from more domains) with matching anchor text help, irrespective of their PageRank?
i'd say so, yes. PR certainly has its role but is maybe overestimated. many links on decent PR pages from many different domains IMO do the trick. maybe the "time span" of the links plays a role, too. very old links and very young ones containing the KW are a perfect sign of relevance and continuity. for the #1 ranking the time span is definitely immense.
muesli
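The "links from more domains" idea can be sketched as a count of distinct linking hosts whose anchor text contains the keyword. The link data below is invented, the function name is hypothetical, and treating each hostname as its own "domain" is a simplifying assumption (www.example.com and example.com would count separately).

```python
from urllib.parse import urlparse

def domains_with_keyword(inbound_links, keyword):
    """Set of hostnames that link in with `keyword` as a word of the anchor text.

    inbound_links: iterable of (source_url, anchor_text) pairs.
    """
    keyword = keyword.lower()
    domains = set()
    for source_url, anchor_text in inbound_links:
        if keyword in anchor_text.lower().split():
            domains.add(urlparse(source_url).hostname)
    return domains

# Made-up inbound links: four links, but only two distinct hosts use the keyword.
links = [
    ("http://www.example.com/a.html", "Business Solutions"),
    ("http://www.example.com/b.html", "solutions"),
    ("http://www.example.org/", "enterprise solutions"),
    ("http://www.example.net/", "click here"),
]
print(len(domains_with_keyword(links, "solutions")))
```

Counting hosts rather than links means many links from one high PR site would count once, which fits the observation about the "solutions" search.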
[google.com]
That may be just because the one word phrase is more competitive; even with the lack of weight for solutions that the others have, it's so overwhelmingly optimised (including backlinks) for the two word phrase that it gets over the 'barrier' (whatever that is).
updated list:
=============
page rank
keyword in title
keyword in H1 and H2
keyword in URL
keyword density (total word count being considered)
keyword in links to page (anchor text)
keyword in bold / strong, etc.
keyword in other parts (full text, alt, title, meta description)
keyword proximity (if search for 2+ keywords)
keyword order (whether the order on the page matches the order in the query)
keyword prominence (how early in page/tag)
URL length
more guessing:
==============
absence of competitive keywords other than search term (muesli)
file size (brotherhood of LAN)
acceleration of LP / relation between LP and age (beachboy, slud)
themes
number of different domains where keyword appears in anchor text (ciml)
"age span" of inbound anchor text where keyword appears (muesli)
anything we should add? does anybody have experience with any of the guesses, so we can delete them or move them to the first list?
"Keyword in links to page" really means "keyword near links to page", so how about "keyword density and HTML title of the pages that link."?
That might be quite easy for Google to implement compared with contextual PageRank?
Now to get futuristic...
* Keyword density and HTML title of the pages that are linked from the pages that link to the page in question. (In other words, do the 'Similar pages' match the phrase?)
* Keyword density and HTML title of the pages that link to the pages that link to the page in question. (A very low-tech attempt at contextual PageRank).
* Title of the ODP category the page is in (and maybe the parent categories too).
* Title of the ODP category of the pages that link to the page (and maybe their parent categories too).
* Full Contextual PageRank (i.e. calculate PageRank across the Web for each phrase. Not likely any time soon.)
* 'Topic Sensitive PageRank' (i.e. calculate PageRank across the Web for a few topics (eg. ODP), then match one of those topical PageRanks to the search phrase instead of the general PR - see Haveliwala's paper of that name.)
* Build the list of phrase hits using PageRank and on-page factors (like Google does) then use those pages to find the most authoritative (similar to Bharat and Mihaila's Hilltop).
There are so many things for Google to try, but presumably the key is not just how well they work but how practical they are.
As for scale/technology/finance reasons, I think that we can be fairly sure that "Title of the ODP category the page is in" would have minimal implications while "Full Contextual PageRank" could not be done on a monthly cycle with current technology.
I don't know how long it takes Google to iterate PageRank, but I'd guess not too long (since they spend most of the cycle spidering and the indices must take a lot of crunching). But how many potential search phrases are there? It's currently unthinkable to iterate PageRank for each word combination or even each word, IMO.
Which is less expensive to calculate, Topic Sensitive PageRank or Hilltop? I'm sure that Google will have investigated both.
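To give a feel for the cost question, here is a minimal sketch of the textbook PageRank power iteration with an optional biased "random jump", which is the basic idea behind a topic-flavoured PageRank. The tiny graph, the page names, and the parameter values are all made up, and this is not a claim about how Google implements or schedules the calculation.

```python
def pagerank(links, teleport=None, damping=0.85, iterations=50):
    """Power-iteration PageRank on a small link graph.

    links: dict mapping each page to the list of pages it links to
           (every link target must also be a key of `links`).
    teleport: dict of page -> probability for the random jump; uniform if None.
              Concentrating it on pages about one topic biases the result
              towards that topic.
    """
    pages = list(links)
    if teleport is None:
        teleport = {p: 1.0 / len(pages) for p in pages}
    rank = {p: teleport.get(p, 0.0) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) * teleport.get(p, 0.0) for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Page without outlinks: hand its rank back via the jump vector.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport.get(p, 0.0)
        rank = new_rank
    return rank

# Hypothetical four-page web.
links = {
    "widgets-hub": ["widget-shop", "news"],
    "widget-shop": ["widgets-hub"],
    "news": ["widgets-hub", "sports"],
    "sports": ["news"],
}
general = pagerank(links)
topical = pagerank(links, teleport={"widgets-hub": 1.0})
print(general)
print(topical)
```

Each extra topic means another full run like `topical`, so a handful of ODP-style topics multiplies the work by a handful, while one run per search phrase multiplies it by the number of phrases.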
-Squared
BHoLAN, are you sure some of the guesses aren't part of the algo already? would we notice?
I was thinking much of what ciml mentioned would already be in action or in the pipeline.
I think people are expecting google to put more emphasis on a theme approach alongside current PR and on-page rankings.
Even if they were in there and we weren't 100% sure, if we list a variable and ask why it could be of use and how effective that would be, then well, we've as good a chance as google at putting the jigsaw together :)
I've never studied G like others have, though I'd hope that whichever way they process their PR, it will not be normalised into one simple value. There must be stages where values can be used again in the algo to help get the results.
ie. all the things that ciml mentions off-page. DMOZ is noted for its 'neutrality' and the fact that everything is human edited. This will no doubt be taken into account...and could be an additional factor in itself further along the algo if themes were to be implemented in a more mainstream way.
muesli, the way I see it, there are only a finite number of ways they can make the SERPs relevant...after that it's throwing away the ones they won't be using and figuring out how they use the rest :)
If Google decides to approach themes by using domains, then I'll agree with you 100% squared. My assumption has been that they won't, and that instead they'll use the link graph (as PageRank does). So a page about widgets would do well if it has links from authorities about widgets, not just because it's on a domain about widgets.
My concern about ODP related methods is that a) it will favour only the exact URLs in the ODP (something makes me twitchy about that but it doesn't sound too problematic) or that b) the ODP entry would apply to the domain (that would be a major backward step IMO).
brotherhood:
> already be in action or in the pipeline
The pipeline possibility is probably why so many of us think in terms of theme pyramids, and not just PageRank and link text. Up until now, most of the approaches I've tried on the Web have ended up being useful for Google ranking some time afterwards. I want to keep it that way.
> absence of competitive keywords other than search term
I am really, truly, sincerely doubting that Google cares about what SEOs think are "competitive keywords".
I'm also kinda doubting that there's a good metric for "competitive". Some phrases aren't competitive, they're just popular. What's Google going to do, knock down any widgets page that happens to mention George Bush? That would be weird (and hugely disappointing to any widget company that gets endorsed by George Bush).
Anyway, what is a thread of complete guesses to accomplish, besides creating more urban legends and misinformed newbies?
part of the google algo:
========================
more guessing:
==============
i dropped "absence of other competitive keywords" as mbauser convinced me that google wouldn't be able to distinguish between competitive and popular. please let me know if you have anything to add in the first list.
1. Slight preference for edu/gov domains
2. Already mentioned, but having the exact keyword phrase in inbound links. I.e. if you are trying to optimize a page for "swimming suits", then having all inbound links with the keyword phrase "swimming suits" will have a better effect than "best swimming suits site on the web" (the importance of the keyword phrase will be diluted by the other words in the link).
3. I'm guessing that Google judges pages based on the outgoing links, especially for pages where there are lots of outgoing links, to make sure that they are of a similar theme, and penalizes sites that have lots of outgoing links to disparate sites ... could be measured by similarly high keyword density in the sites you are linking to ... maybe that's a bit far fetched.
4. Possible slight preference for one word domain names that are exactly the same as the keyword. E.g. if you are searching for "xyz" then google will give preference to www.xyz.com over www.xyzstuff.com
5. Links from sites on the same IP address have less weight and may be penalized if there are too many cross links between different domains on the same IP.
In my very humble opinion, the most important factor is having the keyword phrase in inbound links.
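The dilution idea from point 2 can be put into numbers with a small sketch: the share of the anchor's words that the exact phrase takes up. The function name and the scoring rule are assumptions for illustration only, and nobody outside Google knows whether anything like this is used.

```python
def anchor_phrase_share(anchor_text, phrase):
    """Fraction of the anchor's words taken up by an exact occurrence of `phrase`.

    Returns 0.0 when the phrase does not appear as a consecutive run of words.
    """
    anchor_words = anchor_text.lower().split()
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    if n == 0 or not anchor_words:
        return 0.0
    for i in range(len(anchor_words) - n + 1):
        if anchor_words[i:i + n] == phrase_words:
            return n / len(anchor_words)
    return 0.0

print(anchor_phrase_share("swimming suits", "swimming suits"))
print(anchor_phrase_share("best swimming suits site on the web", "swimming suits"))
```

Under this rule the bare phrase takes up the whole anchor, while the longer anchor gives it only two words out of seven.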
If they are using the 'flavour' of "expert documents" ie DMOZ (and other places perhaps, like high PR pages) then it seems google will have more than 100 variables to play with...I just wonder which are the most efficient AND effective ones.
Whole lotta "iteratin" if you ask me :)
Does anyone think there is more significance to DMOZ than simply having a "bonus" from a link in their directory? I've always thought "number of clicks away from home page of DMOZ" is something that could be a measure of scale and importance...either way you'd think that google's reliance on DMOZ will be fairly heavy.
/sidenote
muesli, any chance you can separate the compiled list into "on the page" - "PR related" and maybe even "theme related" ? :)
factors on page ("search solutions": 35%, "solutions": 25%)
================
1. density in body text (total word count being considered),
2. keyword in URL, URL length,
3. keyword in title, in ALT tag, in meta description,
4. keyword font size, in H1/H2 tags, bold/strong/italic tags,
5. two or more keywords:
- keyword proximity in body text
- keyword order (kw1 ... kw2 or kw2 ... kw1)
6. keyword prominence (how early in page/tag)
factors on pages that link to page in question ("search solutions": 35%, "solutions": 40%)
==============================================
1. keyword in anchor text of links,
3. Open Directory related measures
5. keyword in body text near to link
6. keyword density and HTML title of the page
7. number of different domains/IPs where keyword appears in anchor text (i think all agree on this one)
PageRank ("search solutions": 20%, "solutions": 25%)
========
- link popularity (PR formula)
guesses ("search solutions": 10%, "solutions": 10%)
========
(see posts above)
those are first guesstimates of mine. what are yours?
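For anyone who wants to play with their own numbers, here is a minimal sketch that combines the four groups as a weighted sum, using the guesstimate percentages above. The group keys, the 0-to-1 scores, and the two sample pages are all invented, and a plain weighted sum is an assumption, not a known property of the algo.

```python
# Guesstimated weights per query, taken from the percentages in the post.
WEIGHTS = {
    "search solutions": {"on_page": 0.35, "linking_pages": 0.35, "pagerank": 0.20, "guesses": 0.10},
    "solutions": {"on_page": 0.25, "linking_pages": 0.40, "pagerank": 0.25, "guesses": 0.10},
}

def combined_score(query, group_scores):
    """Weighted sum of per-group scores (each assumed to be between 0 and 1)."""
    weights = WEIGHTS[query]
    return sum(weights[group] * group_scores.get(group, 0.0) for group in weights)

# Hypothetical pages: A is strong on page and in PR, B is strong in anchor text.
page_a = {"on_page": 0.9, "linking_pages": 0.3, "pagerank": 0.9, "guesses": 0.5}
page_b = {"on_page": 0.5, "linking_pages": 0.9, "pagerank": 0.5, "guesses": 0.5}

for query in WEIGHTS:
    print(query, combined_score(query, page_a), combined_score(query, page_b))
```

With these made-up scores page A comes out ahead for "search solutions" and page B for "solutions", which mirrors the pattern discussed earlier in the thread. Change the weights and the order flips easily, so treat it as a toy.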