Google's Florida Update - a fresh look


john316 - 7:09 pm on Dec 16, 2003 (gmt 0)


How semantics play with algorithms.

Semantics (the short story): Categorization of a document into a category (or set of categories).

For the purpose of illustration, I'm assuming two very broad categories: COMMERCE and NONCOMMERCE. It would be fairly trivial to identify words, phrases and patterns such as "buy", "purchase", "check out", "We accept all major" and "toll free", and use them to define a category called COMMERCE, with NONCOMMERCE as the default. You then assign each of your pages to a category.
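A rough sketch of that kind of categorizer, in Python. The five marker phrases are the ones listed above; the function name, the whole-word matching and the threshold of one hit are my own assumptions for illustration, not anything a search engine is known to use.

```python
import re

# Marker phrases taken from the post; a real list would be much longer.
COMMERCE_MARKERS = ("buy", "purchase", "check out", "we accept all major", "toll free")


def categorize(text, threshold=1):
    """Return "COMMERCE" if enough marker phrases appear, else "NONCOMMERCE".

    `threshold` is a hypothetical knob: the number of distinct marker
    phrases that must be present before a page counts as commercial.
    """
    lowered = text.lower()
    hits = sum(
        1
        for marker in COMMERCE_MARKERS
        # Whole-word match so "buy" does not fire on "buyer" or "buying".
        if re.search(r"\b" + re.escape(marker) + r"\b", lowered)
    )
    return "COMMERCE" if hits >= threshold else "NONCOMMERCE"


print(categorize("Buy blue widgets today. We accept all major credit cards."))
print(categorize("A history of the widget, 1850 to 1900."))
```

The first page hits two markers and lands in COMMERCE; the second hits none and falls through to the NONCOMMERCE default.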

Crude (very crude) algo.

# calculate score
weighting -- body<200> emph<500> title<1000> citation<500>

So, if a page had one instance of "widget" in its body with emphasis and the word "widget" was in the title and the page had one citation (inbound link) it would score 2200 for the term "widget".
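The arithmetic can be sketched in a few lines of Python. The weights are the ones from the weighting line; treating the score as a simple sum of weight times count is my assumption (the example only covers the one-of-each case), and the names `WEIGHTS` and `term_score` are made up for illustration.

```python
# Weights from the crude algo above.
WEIGHTS = {"body": 200, "emph": 500, "title": 1000, "citation": 500}


def term_score(counts, weights=WEIGHTS):
    """Sum weight * occurrences for each signal (assumed linear scoring)."""
    return sum(weights[signal] * n for signal, n in counts.items())


# One emphasized body instance, one title instance, one inbound link.
widget_page = {"body": 1, "emph": 1, "title": 1, "citation": 1}
print(term_score(widget_page))  # 200 + 500 + 1000 + 500 = 2200
```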

Now throw in a category score:

# calculate score
weighting -- body<200> emph<500> title<1000> citation<500> commerce<0> noncommerce<400>

With the category weights thrown in, if the widget page isn't in the "commerce" category it now scores 2600 for the term "widget" (2200 + 400). If the widget page is in the "commerce" category it still scores 2200.
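Here is the same idea as a small Python sketch, with the category weight added on top of the on-page score. The numbers come from the weighting line above; the function and dictionary names are hypothetical, and adding the category weight once per page is my reading of the example, not a known implementation.

```python
# On-page weights and category weights from the crude algo above.
WEIGHTS = {"body": 200, "emph": 500, "title": 1000, "citation": 500}
CATEGORY_WEIGHTS = {"commerce": 0, "noncommerce": 400}


def category_score(counts, category):
    """On-page score plus a flat, once-per-page category weight (assumed)."""
    base = sum(WEIGHTS[signal] * n for signal, n in counts.items())
    return base + CATEGORY_WEIGHTS[category]


widget_page = {"body": 1, "emph": 1, "title": 1, "citation": 1}
print(category_score(widget_page, "noncommerce"))  # 2200 + 400 = 2600
print(category_score(widget_page, "commerce"))     # 2200 + 0 = 2200
```

Two otherwise identical widget pages end up 400 points apart purely on category, which is the whole point of the illustration.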

Pretty basic stuff.

[edited by: john316 at 7:45 pm (utc) on Dec. 16, 2003]


Thread source: http://www.webmasterworld.com/google_archive/20566.htm
Brought to you by WebmasterWorld: http://www.webmasterworld.com