Google's Published Patent APPLICATIONS

         

rubble88

8:57 pm on Feb 27, 2003 (gmt 0)

10+ Year Member



Since Google was awarded its first patent (the PageRank patent is held by Stanford University), here's a quick look at three published patent applications that the company has in the USPTO system.

Published 9/19/02
"Methods and apparatus for providing search results in response to an ambiguous search query" [appft1.uspto.gov]
"Methods and apparatus consistent with the invention allow a user to submit an ambiguous search query and to receive relevant search results. In one embodiment, a sequence of numbers received from a user of a standard telephone keypad is translated into a set of potentially corresponding alphanumeric sequences. These potentially corresponding alphanumeric sequences are provided as an input to a conventional search engine, using a boolean "OR" expression, and the search results are presented to the user. The search engine effectively limits search results to those in which the user was likely interested."

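To make the keypad idea in that first abstract concrete, here is a minimal sketch, not the method from the application itself. It assumes the standard telephone keypad letter layout and only expands digits 2-9 into letters (the abstract says "alphanumeric sequences", so the real thing may also keep the digits themselves). The function names `expand_digits` and `build_or_query` are made up for illustration.

```python
from itertools import product

# Standard telephone keypad letter groups (assumed layout; digits 0 and 1
# carry no letters and are not handled in this sketch).
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

def expand_digits(digits):
    """Return every letter sequence the digit string could stand for."""
    groups = [KEYPAD[d] for d in digits]
    return ["".join(chars) for chars in product(*groups)]

def build_or_query(digits):
    """Join all candidate sequences into one boolean OR expression."""
    return " OR ".join(expand_digits(digits))

candidates = expand_digits("228")
print(len(candidates))        # 3 * 3 * 3 = 27 candidates
print("cat" in candidates)    # True: c and a are on key 2, t is on key 8
print(build_or_query("22"))
```

The candidate list grows exponentially with the number of digits, which is presumably why the abstract leans on the search engine to throw away the combinations that match nothing.
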
Published 9/5/02
"Methods and apparatus for employing usage statistics in document retrieval" [appft1.uspto.gov]
"Methods and apparatus consistent with the invention provide improved organization of documents responsive to a search query. In one embodiment, a search query is received and a list of responsive documents is identified. The responsive documents are organized based in whole or in part on usage statistics."

Published 4/11/02
"Methods and apparatus for using a modified index to provide search results in response to an ambiguous search query" [appft1.uspto.gov]
"A system allows a user to submit an ambiguous search query and to receive potentially disambiguated search results. In one implementation, a search engine's conventional alphanumeric index is translated into a second index that is ambiguated in the same manner as which the user's input is ambiguated. The user's ambiguous search query is compared to this ambiguated index, and the corresponding documents are provided to the user as search results."

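The third application turns the problem around: instead of expanding the query, the index itself is translated into keypad digits. A rough sketch of that idea follows, assuming a toy inverted index that maps each term to a set of document ids. The names `ambiguate` and `build_ambiguated_index` and the sample data are hypothetical, not taken from the application.

```python
from collections import defaultdict

# Letter -> keypad digit, derived from the standard telephone keypad layout.
LETTER_TO_DIGIT = {
    letter: digit
    for digit, letters in {
        "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
        "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
    }.items()
    for letter in letters
}

def ambiguate(term):
    """Translate a term into the digit sequence a keypad user would type."""
    # Characters without a keypad letter mapping are passed through unchanged.
    return "".join(LETTER_TO_DIGIT.get(ch, ch) for ch in term.lower())

def build_ambiguated_index(index):
    """Build a second index keyed by digit sequence instead of by term."""
    digit_index = defaultdict(set)
    for term, doc_ids in index.items():
        digit_index[ambiguate(term)].update(doc_ids)
    return dict(digit_index)

# Toy inverted index (made-up data): term -> ids of documents containing it.
index = {"cat": {1, 4}, "bat": {2}, "dog": {3}}
digit_index = build_ambiguated_index(index)

print(sorted(digit_index["228"]))   # "cat" and "bat" both collapse to 228
print(sorted(digit_index["364"]))   # "dog"
```

With this layout a digit query is a single lookup, at the cost of storing and maintaining a second index.
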
hakre

9:30 am on Feb 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nice research, rubble88.

I think you should add the new patent, too. ;)

Powdork

9:54 am on Feb 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What is the new one? I thought I saw another thread earlier, "Google's new patent" or something like that, but I didn't read it, and now I can't find it. What's up?

EBear

10:30 am on Feb 28, 2003 (gmt 0)

10+ Year Member



It's at

[webmasterworld.com...]

egomaniac

3:29 pm on Feb 28, 2003 (gmt 0)

10+ Year Member



> "The responsive documents are organized based in whole or in part on usage statistics."

This sounds like the old DirectHit usage popularity mechanism. Anyone know of any evidence of Google using this?

I know that just because they get a patent doesn't mean they'll use it any time soon (if ever). But this is quite interesting.

gopi

3:43 pm on Feb 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> "The responsive documents are organized based in whole or in part on usage statistics."

I think this patent application is for their AdWords ranking method, as it takes CTR into account.

egomaniac

4:42 pm on Feb 28, 2003 (gmt 0)

10+ Year Member



That makes sense, gopi. I forgot, since I am not an AdWords user.

Thanks.

jeremy goodrich

5:02 pm on Feb 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Could be for the regular SERPs as well, since they do statistical sampling of click-throughs.

Also, consider the toolbar -> that is a form of 'usage' and those pages that are visited more often / by more unique toolbar users could be pushed higher...

Just some thoughts :) The usage data doesn't necessarily have to be AdWords, but it could be too.

gopi

2:47 am on Mar 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>> The usage data doesn't necessarily have to be adwords, but it could be too

Maybe, but personally I don't think Google will use click-through data to rank sites in the regular SERPs, because they have more sophisticated technologies than that...

jeremy goodrich

3:31 am on Mar 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, it wouldn't really make sense to use click through only...but, if they rolled that into the algo in some fashion, how would we know? :)

gopi

4:06 am on Mar 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If they used CTR as part of the algo, they would have to use tracking URLs in the SERPs all the time, so even in the unlikely case that they begin to use CTR, we could easily find out.

jeremy goodrich

4:13 am on Mar 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, but using CTR all the time would make the results slower.

Speed is very important to Google, and to surfers.

They do use statistical sampling as a way of measuring 'quality'...so, they already *are* using the CTR on the SERPs after a fashion to rerank results.

GoogleGuy said it, so it must be true, right? :)

But -> you are correct. This could be *just* a patent related to AdWords, and I'm reading too much into it...

<edited to clarify>

vitaplease

6:34 am on Mar 2, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



[0035] In one implementation, documents are organized based on a total score that represents the product of a usage score and a standard query-term-based score ("IR score"). In particular, the total score equals the square root of the IR score multiplied by the usage score. The usage score, in turn, equals a frequency of visit score multiplied by a unique user score multiplied by a path length score.

[0036] The frequency of visit score equals log2(1+log(VF)/log(MAXVF)). VF is the number of times that the document was visited (or accessed) in one month, and MAXVF is set to 2000. A small value is used when VF is unknown. If the unique user is less than 10, it equals 0.5*UU/10; otherwise, it equals 0.5*(1+UU/MAXUU). UU is the number of unique hosts/IPs that access the document in one month, and MAXUU is set to 400. A small value is used when UU is unknown. The path length score equals log(K-PL)/log(K). PL is the number of `/` characters in the document's path, and K is set to 20.

From: "Methods and apparatus for employing usage statistics in document retrieval"
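For anyone who wants to play with the numbers, here is a small sketch of the scoring in paragraphs [0035] and [0036] as quoted above. It is one reading of the text, not Google's implementation, and several things are assumptions: the frequency formula is read as log2(1 + log(VF)/log(MAXVF)); "the square root of the IR score multiplied by the usage score" is read as sqrt(IR * usage), although sqrt(IR) * usage is also a possible reading; PL is counted on whatever path string is passed in; and VF is assumed to be at least 1 and PL less than K, so the logarithms are defined. The log base in the ratios does not matter, since it cancels out.

```python
import math

MAXVF = 2000   # visit-count constant from [0036]
MAXUU = 400    # unique-user constant from [0036]
K = 20         # path-length constant from [0036]

def frequency_of_visit_score(vf):
    """log2(1 + log(VF)/log(MAXVF)); assumes vf >= 1."""
    return math.log2(1 + math.log(vf) / math.log(MAXVF))

def unique_user_score(uu):
    """0.5*UU/10 below 10 unique users, otherwise 0.5*(1 + UU/MAXUU)."""
    if uu < 10:
        return 0.5 * uu / 10
    return 0.5 * (1 + uu / MAXUU)

def path_length_score(path):
    """log(K - PL)/log(K), where PL is the number of '/' characters in path."""
    pl = path.count("/")
    return math.log(K - pl) / math.log(K)

def usage_score(vf, uu, path):
    """Product of the three component scores."""
    return (frequency_of_visit_score(vf)
            * unique_user_score(uu)
            * path_length_score(path))

def total_score(ir_score, vf, uu, path):
    """Assumed reading: square root of (IR score * usage score)."""
    return math.sqrt(ir_score * usage_score(vf, uu, path))

shallow = path_length_score("/index.html")
deep = path_length_score("/some/directory/here/index.html")
print(shallow > deep)   # True: every extra '/' lowers the path length score
```

Under this reading, a deeper path lowers the path length score regardless of how the page is linked.
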

I could be confused, but since when is the number of '/' characters relevant to any "score" of a document?

daroz

6:51 am on Mar 2, 2003 (gmt 0)

10+ Year Member



> I could be confused, but since when is the number of '/' characters relevant to any "score" of a document?

(Nice post, btw)

As you get deeper into a directory tree, where a page inherits PageRank from 'above', the PR decreases, i.e.:

www.example.com/index.html = PR5

www.example.com/some/directory/here/index.html would probably be ~PR3 (but almost certainly lower than 5)

j_anstice

10:36 pm on Mar 2, 2003 (gmt 0)

10+ Year Member



The big difference between the Google usage scheme and the Direct Hit click popularity scheme is that the Google version is based on site usage overall, whereas the Direct Hit method is based on linking search terms to search results.

This means that Google can have a single 'site usage score' for each page in their index, but Direct Hit requires a (more complicated and harder to maintain) term-page matrix. Google can build the score from various sources (like Google Bar usage, cached page stats, Blogger logs and buying web logs from large ISPs, as those tracking companies do), while Direct Hit really requires search result link tracking.

Jamie
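The difference between the two bookkeeping schemes described above can be sketched roughly like this. It is purely illustrative: neither company's actual storage is known, and `record_visit`, `record_click` and the sample URLs are made up.

```python
from collections import defaultdict

# Query-independent scheme: one usage counter per page, fed by any traffic source.
page_usage = defaultdict(int)

def record_visit(url):
    """Count a visit to a page, whatever the source of the visit data."""
    page_usage[url] += 1

# Query-dependent scheme: one counter per (search term, page) pair,
# which can only be fed by tracking clicks on search results.
term_page_clicks = defaultdict(lambda: defaultdict(int))

def record_click(query, url):
    """Count a search-result click against every term in the query."""
    for term in query.lower().split():
        term_page_clicks[term][url] += 1

record_visit("example.com/a")
record_visit("example.com/a")
record_visit("example.com/b")
record_click("blue widgets", "example.com/a")
record_click("red widgets", "example.com/a")

print(dict(page_usage))                 # one number per page
print(dict(term_page_clicks["widgets"]))  # one number per term and page
```

The first table grows with the number of pages; the second grows with the number of term and page combinations, which is the maintenance burden mentioned above.
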

msgraph

1:34 pm on Mar 4, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Looks like the 3rd one rubble88 listed above, Methods and apparatus for using a modified index to provide search results in response to an ambiguous search query [patft.uspto.gov] , was just issued today.