
Google News Archive Forum

Google's 2 rankings & you
New patent means new way of ranking
claus




msg:48605
 12:15 am on Jul 8, 2003 (gmt 0)

Continued from earlier thread: [webmasterworld.com...]

The new Google patent

In this thread of a similar name [webmasterworld.com] zafile pointed me towards a news.com article on a new Google patent. The link in the news.com article turned out to be wrong, but i found the patent anyway. Specifically, it is this one:

6,526,440 Ranking search results by reranking the results based on local inter-connectivity

You can find the details at the US Patent & Trademark Office, Patent Full-Text and Image Database [patft.uspto.gov] - search for 6,526,440. Text from this patent - published to the public domain by USPTO - is also quoted in this post, where necessary to make a point. There are no copyrights on patent texts, but i have tried to keep quotes to a minimum anyway.

As the title suggests, this is a patent that deals with ranking and re-ranking results.

The patent was granted on February 25, 2003, and filed January 30, 2001 - so Google researchers have known about it for at least two years already. Still, a patent grant means that the source description is published. This is the reason for (as well as the "Google News" of) this post.

I have spent a few hours studying it, and it clearly has implications for users of this forum. I'll get to the nitty-gritty of it, but let me point out the major points first.

It's not an easy read. And there are 7 unknowns as well as some scope for flexibility and judgement (either by trial-and-error or by manual or automated processes). It's really interesting though.


What is it?

It's a patent. Nothing more and nothing less. A description of some procedure for doing something. This does not mean that it will ever be put to use, as lots of patents are granted and never used. Patents don't come with a release date, but some elements of the confusion we are seeing now could be explained by this.

Chances are, however, that this one will be put to use. Having spent a few hours on it, i must say that it makes some sense. It is intended to provide better and more relevant results for users of the Google SE, and at the same time (i quote the patent text here) :

... to prevent any single author of web content from having too much of an impact on the ranking value.

Sounds serious, especially for the SEO community. And it probably is, too. But don't panic. Notice that it says "too much of an" and not "any". It's still a ranking derived from links, not a random Google rank.


What does it do?

We know about the Page Rank algorithm. This is the tool that Google uses to make sure that the pages it has indexed are displayed to the user with the most important pages first. Without being too specific, it simply means that for each and every page, Google calculates some value that ranks that page relative to the other pages in the index.

This is something else. Rephrase: This is the same thing plus something else. It is, essentially, a new way to order the top results for any query.


The ultra-brief three-step version:

What the new patent implies is a ranking, then a reranking, then a weighting, and then a display. It goes something like this:

1) The usual pagerank algo (or another suitable method) finds the top-ranking (e.g.) 1000 pages. The term for this is: the OldScore.

2) Each page in this set then goes through a new ranking procedure, resulting in the LocalScore for that page.

3) Finally, for each page, the LocalScore and the OldScore are normalized, assigned a weight, and then multiplied in order to yield the NewScore for that page.

In this process there will actually be "two ranks", or rather, there will be three: The initially ranked documents (OldScore ~ Page Rank), and the reranked documents (LocalScore ~ Local Rank). The serps will still show only one set of documents, but this set will be ranked according to the "NewScore ~ New Rank", which is (sort of) a weighted average of PR and LR.


Confused?

Don't be confused by the fancy words. It's more straightforward than it seems. In other words, this is what happens:

a) you search for a keyword or phrase - as usual
b) pagerank finds the top 1000 results (or so) - as usual
c) localrank calculates a new rank for each page - this, and the rest is new

d) each page now has two ranks (PR and LR)

e) the two sets of ranks are multiplied using some weights.
f) the multiplication gives a third rank.

g) each page now has one rank; the NewRank (sort of equal to PR times LR)

h) pages are sorted according to the NewRank
i) and finally displayed with the best "NewRanking" ones on top.

- better?
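
For those who prefer code to words, here is a minimal Python sketch of steps a) to i). The names (rerank, old_score, local_score) are made up for illustration only - the real OldScore/LocalScore calculations are Google's and are not public as code - only the order of the steps follows the patent as summarised above.

# Minimal sketch of steps a) to i) - not Google's code, just the flow.
# old_score and local_score are placeholder callables supplied by the caller.
def rerank(candidates, old_score, local_score, top_n=1000, a=1.0, b=1.0):
    # b) the usual ranking picks the top ~1000 pages for the query
    top_set = sorted(candidates, key=old_score, reverse=True)[:top_n]
    if not top_set:
        return []

    # c)-d) every page in that set now gets a second, local rank
    os_ = {p: old_score(p) for p in top_set}
    ls = {p: local_score(p, top_set) for p in top_set}

    max_os = max(os_.values()) or 1.0   # MaxOS
    max_ls = max(ls.values()) or 1.0    # MaxLS (or a threshold, if too small)

    # e)-g) the two ranks are combined into one NewRank per page
    new = {p: (a + ls[p] / max_ls) * (b + os_[p] / max_os) for p in top_set}

    # h)-i) pages are returned sorted with the best NewRank first
    return sorted(top_set, key=new.get, reverse=True)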


What does it mean to me then?

Well, if you are one of the few who knows all about chea.. hrm... optimizing for Google, then it means that the world just got a bit tougher. And then again, perhaps not, there are still tricks in the bag for the very, say, experienced. Nuff said, just a feeling, i will not elaborate on that.

It will become harder, it seems. If for no other reason, then because you now have to pass not just one but two independent ranking filters. Instead of optimizing for PR you will now have to optimize for both PR and LR.

Let's assume, as a very simple example only, that values 0,1,2 are the only values for both PR and LR: If You get a PR of 2 and a LR of 0, then the NewRank will be 0. If you get a PR of 0 you will not even make it to the top set of 1000 that will ever get a LR calculated. On the other hand, if you get a PR and a LR of 1 then you're better off than the guy having a top PR but no LR.


Got it - what's that LR thing then?

It's a device constructed to yield better results and (repeat quote):

... prevent any single author of web content from having too much of an impact on the ranking value.

I have been looking at the patent for a while and this intention could very well be enforced by it. That is, if "authorship" is equal to "domain ownership", or "some unspecified network of affiliated authors".

Here goes:

The LocalScore, or Local Rank, is both a filter and a ranking mechanism. It only considers pages among the 1000 or so selected by the PR.

a) The first step in calculating Local Rank for a page is to locate all pages that have outbound links to this page. All pages among the top 1000 that is.

b) Next, all pages that are from the same host as this page, or from "similar or affiliated hosts", get thrown away. Yes. By comparing any two documents within the set, the one having the smallest PR will always be thrown away, until there is only one document left from the (quote) "same host or similar" as the document that is currently being ranked.

Here, "same host" refers to three octets of the IP. That means the first three quarters of it. In other words, these IPs are the same host:

111.111.111.0
111.111.111.255
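
A tiny sketch of that "same host" test, assuming it really is just the first three octets (the mirror/affiliation check is a separate matter):

def same_host(ip_a, ip_b):
    # "same host" when the first three octets match (the patent's IP example)
    return ip_a.split(".")[:3] == ip_b.split(".")[:3]

print(same_host("111.111.111.0", "111.111.111.255"))   # True  - same host
print(same_host("111.111.111.0", "111.111.112.0"))     # False - different host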

"Similar or affiliated hosts" refers to mirrors or other kinds of pages that (quote) "contain the same or nearly the same documents". This could be (quote) "determined through a manual search or by an automated web search that compares the contents at different hosts". Here's another patent number for the curious: 5,913,208 (June 15, 1999)

That is: Your on-site link structure means zero to LR. Linking to and from others on the same IP means zero. Near-duplicate pages mean zero. Only one page from your "neighborhood", the single most relevant page, will be taken into account.

c) Now, the exact same procedure is repeated for each "host" in the set, until each "host/neighborhood" has only one page left in the set.

d) After this (tough) filtering comes another. Each of the remaining pages has a PR value, and they are sorted according to it. The top k pages pass; the rest get thrown away. Here "k" is (quote) "a predetermined number (e.g., 20)."

So, although you positively know that you have 1,234 inbound links, only the top "k" of those that are not from "your neighborhood" - or part of the same neighborhood as each other - will count.

e) The remaining pages are called the "BackSet". Only at this stage can the LR be calculated. It's pretty straightforward, but then again, the filtering is tough (quoted, not quite verbatim, but the deviations only adapt it to the current context):

LocalRank = SUM(i=1 to k) PR(BackSet(i))^m

Again, the m is one of those annoying unknowns (quote): "the appropriate value at which m should be set varies based on the nature of the OldScore values" (OldScore being PR). It is stated, however, that (quote) "Typical values for m are, for example, one through three".

That's it. Really, it is. There's nothing more to the Local Rank than this.
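
Put together, steps a) to e) might look roughly like the Python sketch below. This is only one reading of the patent text: the neighborhood test is reduced to a host key (e.g. the three-octet IP), mirror/affiliation detection is left out, and - following the correction in the follow-up post further down - all pages from the same neighborhood as the ranked page are dropped. Treat it as a sketch, not the actual algorithm.

def local_rank(page, top_set, inbound_links, pr, host_of, k=20, m=2):
    # Sketch of the LocalScore calculation as read from the patent.
    # inbound_links: dict page -> set of pages linking to it
    # pr: dict page -> PageRank-like score; host_of: page -> neighborhood key

    # a) only pages in the top set that link to this page are considered
    voters = [p for p in top_set if page in inbound_links.get(p, set())]

    # b) throw away pages from the same neighborhood as the ranked page
    voters = [p for p in voters if host_of(p) != host_of(page)]

    # c) keep only the highest-PR page from each remaining neighborhood
    best_per_host = {}
    for p in voters:
        h = host_of(p)
        if h not in best_per_host or pr[p] > pr[best_per_host[h]]:
            best_per_host[h] = p

    # d) of those, only the top k by PR survive ...
    back_set = sorted(best_per_host.values(), key=pr.get, reverse=True)[:k]

    # e) ... and LocalRank is the sum of their PR values raised to the power m
    return sum(pr[p] ** m for p in back_set)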


What about the New Rank then?

This is getting to be a very long post, you know... Well, luckily it's simple. The formula is here, it's as public as the rest - you can't have a patent that's also a secret (quote):

NewScore(x) = (a+LocalScore(x)/MaxLS)(b+OldScore(x)/MaxOS)

x being your page
a being some weight *
b being some weight *
MaxLS being the maximum of the LocalScore values, or some threshold value if this is too small
MaxOS being maximum PR for the original set (the PR set)

* Isn't this just beautiful (quote): "The a and b values are constants, and, may be, for example, each equal to one"
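
A quick worked example of the formula with a = b = 1 (all numbers invented):

# toy numbers only - a quick sanity check of the NewScore formula
a, b = 1.0, 1.0
LocalScore, MaxLS = 300.0, 1000.0   # this page's LR vs. the best LR in the set
OldScore, MaxOS = 7.0, 10.0         # this page's PR vs. the best PR in the set
NewScore = (a + LocalScore / MaxLS) * (b + OldScore / MaxOS)
print(NewScore)   # (1 + 0.3) * (1 + 0.7) = 2.21

Note that with a = b = 1, a page with a LocalScore of zero still keeps a NewScore of 1 * (1 + 0.7) = 1.7 rather than dropping to zero - a point that comes up again further down the thread.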


Wrap up

Inbound links are still important. Very much so. But not just any inbound links, rather: It is important to have inbound links spread across a variety of unrelated sources.

It could be that blogger sites on blogger.com, tripod sites, web rings, and the like will see less impact in the serps from crosslinking. Mirror sites, and some other types of affiliate programs, e.g. the SearchKing type, will probably also suffer.

My primary advice from the algebra right now is to seek incoming links from "quality unrelated sites" that are still within the same subject area. Unrelated means: sites that do not share the first three octets of an IP and are not in other ways affiliated or from the "same neighborhood" (at the very least not affiliated in a structured manner). Quality means what it says.

Links from direct competitors will suddenly have great value, as odd as it sounds.

Note: Spam. Cloaking. Shadow domains. Need i say more? I'm sure they will fix that anti spam filter sometime, though.

Candidate for longest post ever... here goes, let's see if the mods like it.

/claus

[edited by: Brett_Tabke at 12:01 pm (utc) on July 8, 2003]

 

SinclairUser




msg:48606
 12:38 am on Jul 8, 2003 (gmt 0)

Nice post claus!

So it looks as though we may have one more hoop to jump through.

Does this mean that direct competitors will all be fighting to swap links?

Chndru




msg:48607
 12:44 am on Jul 8, 2003 (gmt 0)

Read the patent and it makes sense.
Question is, does any of the SERPS obey this? Wouldn't it be easier to check if there are SERPS (say in the top 20) from the same IP subnet for a typical keyword?
Thanks

needhelp




msg:48608
 12:48 am on Jul 8, 2003 (gmt 0)

Just wanted to say thanks for taking the time Claus!

claus




msg:48609
 1:00 am on Jul 8, 2003 (gmt 0)

Thx for the feedback, it was not easy translating to ordinary terms and keeping the logic :)

SinclairUser
>> direct competitors will all be fighting to swap links?

...seems unlikely doesn't it? although it's tempting, math doesn't always rule human behavior ;)

Chndru
>> if there are SERPS (say in the top 20) from the same IP subnet

...well, there can be. We don't get to see the LR, only the NR. If "same-IP-pages" each have good PR AND good LR, then they'll make it to the SERPS.

Remember, the calculation of LR is done for each of the 1000 pages in the primary set, even for those from the same IP. The LR is an "extra check", it's not necessarily the most important; the "a" and "b" and "MaxXX"s decide what wins.

/claus

<edit>clarified some. it's three am here 'night to all</edit>

[edited by: claus at 1:04 am (utc) on July 8, 2003]

pageoneresults




msg:48610
 1:04 am on Jul 8, 2003 (gmt 0)

claus, you are to be commended! To take the original patent and summarize it in a way that most of us will understand is truly a task. I too thank you for taking the time to present this type of information to the members of WebmasterWorld.

GrinninGordon




msg:48611
 1:19 am on Jul 8, 2003 (gmt 0)

Good work and reading claus - sincere thanks

It might explain a lot that has been going on;

1) Sites that exchange links with totally unrelated subject sites fare better.

2) Owners that have several sites for the same thing and cross-link them wisely do better.

This certainly mirrors what I am seeing in one highly competitive area. But is it a good thing for the punters..... I think not at the moment. Google need to do some serious tweaking.

Although, to be honest, if Google have converted from a six-weekly static index to a rolling one, I would be amazed if they incorporated other elements at this time. My money is still on missing data.

Mohamed_E




msg:48612
 1:31 am on Jul 8, 2003 (gmt 0)

How does this differ from [webmasterworld.com...]

jk3210




msg:48613
 2:25 am on Jul 8, 2003 (gmt 0)

Best post I've read in a l-o-n-g time.

Krapulator




msg:48614
 7:31 am on Jul 8, 2003 (gmt 0)

Very Interesting! This could possibly explain some of the craziness that has been going on in the last month or so.

The concept described seems like a sensible step forward in maintaining the integrity of the SERPS.

Boaz




msg:48615
 8:06 am on Jul 8, 2003 (gmt 0)

Interesting - yes. But think of the load it will put on Google to do such a check for every query - if implemented, serps would take quite a while to show up for every query - and this is not happening. So, though it could be an explanation for what we're seeing, I don't think it is actually what is happening.

Mineral




msg:48616
 8:12 am on Jul 8, 2003 (gmt 0)

I don't think this patent is currently applied to Google searches. It requires a great deal of on-the-fly calculation that cannot be done in the milliseconds it takes Google to complete a search.

In other words, each time a search is performed Google would need to do so many online calculations before it produces a SERP that it could take at least several seconds, making the process described in the patent not feasible with the current processing power of Google or anyone else. I think it is something for the future, not for now.

Nick_W




msg:48617
 8:18 am on Jul 8, 2003 (gmt 0)

Thanks for the fantastic summary. What do the pros make of this? (meaning the ones I know are pro SEOs anyway ;))

Nick

Clark




msg:48618
 8:47 am on Jul 8, 2003 (gmt 0)

Very nice work Claus.

I don't think this would be very tough for google to do at all on the fly. If I remember correctly the entire database is in RAM. The processing power to get the initial result set is probably a great deal more intensive than re-sorting on this basis for a smaller result set.

Think about it this way, instead of going through millions(billions) of records to get a result of 1000, G will only need to sort through 1000 to recalculate PR for those 1000 resulting pages. I could be wrong but I think it's a piece of cake.

Nick, what this means for the SEO is that you have to pay for new domains and new hosts and host at a different place for each domain or find a way to get lots of different IP ranges for the same company or maybe a trick for your own server...though G may have some way to detect if it comes from the same datacenter. Then do heavy crosslinks ON THE SAME TOPIC w/o getting caught of course. Those that try to SEO from one or two ded. server are ded. in the water.

This raises the stakes big time. SEO just got much much tougher. It will benefit the serious pros big time because they will probably do what it takes... but it takes certain aspects of SEO out of the hands of the average webmaster.

It would have been even more effective if Claus hadn't found the patent...but Thank God he did so we don't have to waste time SEOing the wrong way :)

The only thing left to do is develop Link Campaigns within your theme.

MOOSBerlin




msg:48619
 9:04 am on Jul 8, 2003 (gmt 0)

Great and interesting post Claus, and thanks!

amazed




msg:48620
 9:09 am on Jul 8, 2003 (gmt 0)

Just keep in mind, everybody, before you start working on it, that this is one way of doing things and there are many others...

mil2k




msg:48621
 9:18 am on Jul 8, 2003 (gmt 0)

Great post claus. I had read that patent months ago but your research is excellent. Some new things I learned.

If You get a PR of 2 and a LR of 0, then the NewRank will be 0

Now, I thought the value started at a minimum of 1?

what this means for the SEO is that you have to pay for new domains and new hosts and host at a different place for each domain or find a way to get lots of different IP ranges for the same company or maybe a trick for your own server...though G may have some way to detect if it comes from the same datacenter. Then do heavy crosslinks ON THE SAME TOPIC w/o getting caught of course. Those that try to SEO from one or two ded. server are ded. in the water.

I thought that was open knowledge ;) I know some Gurus around here who already do this :)

heini




msg:48622
 10:15 am on Jul 8, 2003 (gmt 0)

Good job, Claus - mange tak!
That's very interesting stuff.

I would have to agree with amazed though: we do not know if any of this is in use or will be in use in its entirety any time soon.
I'm sure lots of people are already looking, comparing, evaluating current results against the described mechanisms.

Small annotation:

Spreading out a site's references as far as possible over the web has been recommended practice for a long time - just look at NFFC's latest. That's where quality directories come in for example.

Mineral




msg:48623
 11:13 am on Jul 8, 2003 (gmt 0)

From the indications that I have, I still think that this patent is not currently implemented.

Brett_Tabke




msg:48624
 12:07 pm on Jul 8, 2003 (gmt 0)

The original story on the new Google patent was broken by MSGraph in the research forum here. The cnet story appeared a few days later.
[webmasterworld.com...]

vitaplease




msg:48625
 12:27 pm on Jul 8, 2003 (gmt 0)

Nice work Claus.

As discussed in the thread Mohamed referenced, it was more or less acknowledged as being a variation on Kleinberg.

The big question was if such an additional reranking could be done on the fly.

(OldScore ~ Page Rank).

I like your presentation, but I would try to avoid equating OldScore with Pagerank or PR. OldScore(x) refers to the relevance score value for the particular document. I know you mean the Pagerank algo, but it can get confusing.

Let's assume, as a very simple example only, that values 0,1,2 are the only values for both PR and LR: If You get a PR of 2 and a LR of 0, then the NewRank will be 0. If you get a PR of 0 you will not even make it to the top set of 1000 that will ever get a LR calculated. On the other hand, if you get a PR and a LR of 1 then you're better off than the guy having a top PR but no LR.

NewScore(x) = (a+LocalScore(x)/MaxLS)(b+OldScore(x)/MaxOS)

OK, in the formula they state that a and b are constants, so NewScore would most likely not be 0 even if either LS or OS is 0, but you are making a good point.

Let's say a guestbook spammer gets a high ranking on a query (OldScore - not necessarily Pagerank) only from totally unrelated sources (not in the initial set of e.g. 1000 results). NewScore would then let rankings for this page drop dramatically because of the multiplier effect from the non-existent LocalScore.

What this paper basically says is: for queries you want to rank high for, focus on getting motivated links from pages that rank at the top of the search engine results for that query.

djgreg




msg:48626
 12:36 pm on Jul 8, 2003 (gmt 0)

What this paper basically says is: for queries you want to rank high for, focus on getting motivated links from pages that rank at the top of the search engine results for that query.

But why count only links from sites which rank high in the search engine results for that query?
Links from sites on another topic should be counted with the same weight, shouldn't they?

I don't believe that links from sites with the same topic are more relevant or should be given more weight.

dragonlady7




msg:48627
 12:43 pm on Jul 8, 2003 (gmt 0)

Man... my head hurts.
Thanks for explaining that. I have a lot of digesting to do before I can really understand it all...
So does the thinking go that this new ranking system will cancel a lot of blog noise and so on?
But you're right, it does seem that now only the really big boys can compete in really-intensive SEO. I'm not sure I like it.
Well, my opinion is thoroughly unimportant. ^.^ I'll go formulate it quietly somewhere else.

vitaplease




msg:48628
 12:45 pm on Jul 8, 2003 (gmt 0)

I don't believe that links from sites with the same topic are more relevant or should be given more weight.

I guess Google thinks that, on the whole, it would increase relevance of search results by giving those links more weight. Would you not agree?

The indiscriminate guestbook signing I mentioned is just one example where it would improve ranking relevance of results, would it not?

Same goes for reciprocal linking for reciprocal link-pop reasons alone.

mipapage




msg:48629
 1:24 pm on Jul 8, 2003 (gmt 0)

Great work claus!

Thanks for deconstructing...

Seems to me that the more I learn, the more Brett's advice rings true (content, quality and consistency being very important).
- i.e. the more things change the more they stay the same?

Just hard to build lots of quality content for some sites.

I can see, though, how Google is making it so that 'large' resource-based sites with lots of information etc. will do better than your typical commercial sites, no? They have content, which attracts (relatively high) quality links - and links still seem to be king here, just 'diversified' links..

So off we go to Adwords...

swerve




msg:48630
 2:08 pm on Jul 8, 2003 (gmt 0)

Claus, I read the original thread on this patent 6 months ago, but your summary is great for those of us (like myself) who are not math experts.

Inbound links are still important. Very much so. But not just any inbound links, rather: It is important to have inbound links spread across a variety of unrelated sources.

The last part is not correct. (Actually, it is correct only if "unrelated" is very narrowly defined as "not from the same IP range".)

One of the most important aspects of this patent is that it is very important to have inbound links that are highly related, provided they are from external sources.

For example, suppose you have a page about Blue Widgets. Now suppose you get a PR9 link from another page. That PR9 link will only get counted (for LocalRank purposes) for search queries in which it is returned, along with yours, in the top X initial results. So if the PR9 site is completely unrelated to your Blue Widgets page - that is, it does not get returned in the same initial set for your key search queries - it won't get counted at all in calculating the LocalRank for that query. So it is very important that links are highly related to your targeted search terms.

Some might describe this as one approach to "theming", in which links from pages on the same "theme" are given more weight than links from unrelated pages.

manilla




msg:48631
 2:31 pm on Jul 8, 2003 (gmt 0)

The link to the patent was an excellent read Claus, and your summary superb. Thank you.

claus




msg:48632
 2:35 pm on Jul 8, 2003 (gmt 0)

...lots of response...didn't anticipate that, but thanks again :)

Here's a follow-up, and answers to the things i think i am able to answer. I have put in some effort on the matter, but of course i do not know all there is to know about it.

I do believe, though, that the original post is as close and true to the original patent text as possible, only using different words and elaborating on some points that are not normally part of a patent text (but relevant here). I have four comments and one point that i didn't quite state in the post. Number one is the only point where i have found that the original post is not precise enough.



Comments and corrections to original post

1:
At stage (b) in the LR calculation, all pages from the "same neighborhood" as the one the rank is being calculated for get thrown away. The filter does not keep one extra "internal" page.

In the original post, one could get the impression that one extra page was kept from the same neighborhood as the page that LR is being calculated for. This is not so. LR only deals with inbound links on pages from separate "neighborhoods" (one top ranking page per neighborhood). The process for selecting these "top notch" pages is exactly as the post says in (b).

2:
The stages b+c need not necessarily happen for each and every query. They might be done "in advance" so to speak, by assigning some identifier of "neighborhood" to pages in the main index at regular intervals. The patent text does leave room for some creativity like this, and it would speed up the process a bit and reduce server load. Anyway, the people at Google certainly know more about optimizing server load than me ;)
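
As a purely illustrative sketch of that kind of precomputation (the three-octet key is an assumption taken from the patent's IP example; real mirror/affiliation detection would need more than this), tagging each indexed page with a "neighborhood" key in advance could be as simple as:

def neighborhood_key(ip_address):
    # precomputed once per page, so the per-query step only compares keys
    return ".".join(ip_address.split(".")[:3])

# toy index: url -> host IP (made-up data)
pages = {
    "http://example-a.com/page": "111.111.111.5",
    "http://example-b.com/page": "111.111.111.200",
    "http://example-c.com/page": "222.10.10.1",
}
neighborhood_of = {url: neighborhood_key(ip) for url, ip in pages.items()}
# example-a and example-b end up in the same neighborhood ("111.111.111")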

3:
Implementation: As i said in my answer to Chndru, we would never know. As searchers we only get exposed to the NewRank. The PageRank and LocalRank are in Google's inner workings. For what it's worth, they could have used this procedure since January 2001 or they might never apply it at all, but i don't really think either of those options is correct.

Are they implementing it now, with all that fuss going on? I can only answer this: Definitely maybe. And then again, maybe not.

4:
I really need to repeat this: The patent text does leave room for some creativity. It's a method, a way to do things, just like making a cup of coffee: Sometimes it's black, sometimes you use (0-1-2-more) sugar, sometimes you pour in the (heated/cold) milk (first/last)... and so on. When you've found out what works best for you, you usually stick to some favorite way of doing it, though.

Does this mean that PR is no good?

No. It means that PR possibly can carry less weight, but it is certainly not a statement that PR has lost its value. On the contrary, it says: PR is the best method ever for finding a set of relevant documents. Once this set is found, however, we need an "extra something" to re-rank that set, so that the most relevant pages get even more relevant. The "extra something" is LR.


Mohamed_E
>> How does this differ from [webmasterworld.com...]

- it differs in one major way: The inner workings of the whole patent are explained in ordinary everyday words, and doubt about "which part means what" is (i hope) removed. This had not been done before; it is the "Google News" of the matter and what qualifies it for a thread of its own under this headline. I did not intend to make duplicate threads, i'm sorry about that.

..ah. I see it's linked now. Thanks Brett :) I have to add a little to this post now, though.

Boaz, Mineral
>> think of the load it will put on Google to do such a check for every query

- see 2 above. vitaplease also pointed this out in the related thread, although zeitgeist doesn't identify neighborhoods (these are key to sorting pages and reducing the set; relevance gets computed afterwards, once the neighborhoods have been cleaned). Clark also has a good point on this. And then there's the patent number "for the curious" that i dumped in the original post, of course.

Oh, and it's not just claim 12 in the patent that is the method, it's claims: (1-10),11,12,13,14. Plus the flexibility of proper choice of words inside the patent. There's always more than one way to do things.

Clark
>> what this means for the SEO is that you have to pay for new domains and new hosts

- your post is great, i didn't dare to write the pro/average part myself ;) Grumpus has a good observation on doorways in the related thread though. gopi as well on IPs, vitaplease on "artificial reciprocal linkage stuff", and Marcos on niche ISPs.

>> It would have been even more effective if Claus hadn't found the patent

- i doubt it. The method is primarily a ranking system - it's not a spam filter, although it seems able to eliminate the effect of some strategies. But, if we can see the holes so can Google, and .. well, a dedicated spam filter is outside the scope of this patent. It can easily be applied on top of it.

ciml also has a very good point here (from the related thread):

... people who create multiple sites for the purposes of linking are more likely to have ensured separate IP addresses (or class C ranges) than experts who happen to use the same provider (which is likely for some academic and geographical topics). The thing that worries me about these methods is the requirement for non-affiliated sources; smaller initial document sets would be easier to dent IMO

The thing countering this is that the "baseline" is always PR. PageRank rules. Within the result set that PR finds, results are re-ordered using not only Local Rank, but Local Rank multiplied by Page Rank.

And more: A key component in computing LR is PR - the top k pages that are chosen for the LR are simply ranked by PR. Of course, PageRank can be neutralized by setting the "b" weight to zero. The patent says that they "may be, for example, each equal to one".

I think the key is to remember that LR does not stand alone. PR is always there. It's the combination of PR and LR that gives the NewRank, not one of them. This fact was not as visible before.

Amazed
yes, there is always more than one way of doing things, both for SEs and SEOs ;)

Mil2K
>> Now I thought the value Started at minimum of 1?

- the figures were only chosen to make the point that a combined focus on PR and LR can pay off better than a narrow focus on one of these. For this purpose, zero is always a good number.

The patent text does not explicitly mention a minimum number such as 1 (at least, i can't find it). However, the algebra speaks for itself. Consider this formula again:

LocalRank = SUM(i=1 to k) PR(BackSet(i))^m

A local rank of 0 implies that the sum of, say, squared (m=2) pageranks for all k (say, 20) pages is equal to zero. Thus the square of PR for each page must yield zero. For that to happen, the PageRank for each and every page has to be zero.

heini, mineral
>> we do not know if any of this is in use or will be in use (...) any time soon.

-nope, and we probably never will unless it gets confirmed in some way, somewhere, some time, see 3 above. Grumpus has an opinion (in the related thread) that it has been implemented since Christmas. In the same thread, Markus suggests that this patent is just some spare time invention of Mr. Bharat. And then Namaste points out that it appears to have been implemented since November 02.

Here's a good Grumpus quote from the related thread:
Since there's no way of visually verifying exactly what the LR factor is for any page on any search they could very well do many things that people wouldn't even suspect or notice

Well, i hope i have stated by now, that only Google knows for sure.

xerxes (in related thread):
>> if a page has received a significant increase in visitors during a given period of time

This has nothing to do with traffic. I've been trying to identify what you refer to as paragraph 0037 but i can't. I can say this much, though: links count and traffic does not (for one thing, there is no way for Google to monitor traffic on your site).

takagi (in related thread)
On monitoring G toolbar use for ranking purposes

I can still find nothing about any use of any individual page (with or without toolbar use) - all the patent mentions is links. Still, i have often thought about how interesting it would be to add some measure of user behavior to the equation.


way too long post, but i had to sum up a bit, as this thread was (and wisely so) linked to the other. Meanwhile, i see that we are no longer at msg #19 as when i started writing the post. Well, i can't continue this one post forever, off it goes.

Oh.. i have to add this: Brett didn't change my post with the edit, he just removed some text at the top that was intended for the moderators only, and put a link there instead. Thanks again, Brett :)

/claus

<edit1>The tagaki quote was not a quote, added "On" and removed ">>"</edit1>
<edit2>Removed some weird "part: " that had sneaked into the text somehow</edit2>
<edit3>Only typo's. Sure i didn't catch all though.</edit3>
<edit4>Had to fix this, it was simply wrong: "values that Google provides as examples are 1,2,3." This is for "m" not for "a" or "b". Definitely no more edits to this post now.</edit4>

[edited by: claus at 4:22 pm (utc) on July 8, 2003]

manilla




msg:48633
 3:04 pm on Jul 8, 2003 (gmt 0)

Think I might set up a cartel of the top k sites for my desired search term.

I'll call it the "k club".

I hypothesise that each member of the k club would have a high percentage of the other (k-1) members as part of their Backset.

Members agree to link to each other, and provided members of the k club are selected correctly (not similar sites), then we'd have a great cartel - a cosy closed club, where members have neutralised the effect of LR, and only need to focus on PR as before.

Every month, the first (k-1) members of the k club, placed in order of NewScore, can vote the kth member out of the club should the kth member slip down the serps (possibly because its basic PR has fallen), and a new member could be invited into the club according to their overall serp position. The aim is to preserve the aggregate utility of the k members of the club, and to identify potential new members who have very good basic PR.

:-)

merlin30




msg:48634
 3:21 pm on Jul 8, 2003 (gmt 0)

And once Google has ascertained the existence of your k club you'll get the SearchKing treatment!
