Google "behind the scenes" video/slideshow

engineer Jeff Dean lectures at Uni. of Washington

     

amznVibe

5:24 pm on Mar 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is a fascinating video with some details I have never heard elsewhere:
[uwtv.org...]
Produced by: University of Washington, October 21, 2004
Runtime: 00:55:36
Google: A Behind-the-Scenes Look
In this program, Jeff Dean of Google describes some of these challenges, discusses applications Google has developed, and highlights systems they've built, including GFS, a large-scale distributed file system, and MapReduce, a library for automatic parallelization and distribution of large-scale computation. He also shares some interesting observations derived from Google's web data.
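To make the MapReduce idea in that description concrete, here is a minimal single-process sketch of the programming model: a map step emits key/value pairs, the pairs are grouped by key, and a reduce step combines each group. It is not Google's library. The names `map_words`, `reduce_counts` and `run_mapreduce` and the sample documents are made up for illustration, and the real system distributes these steps across many machines, which this toy version does not attempt.

```python
from collections import defaultdict


def map_words(doc_id, text):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(documents, mapper, reducer):
    """Hypothetical single-process stand-in: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for key, value in mapper(doc_id, text):
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


# Made-up sample input, only for illustration.
docs = {"a": "the quick fox", "b": "the lazy dog the end"}
word_counts = run_mapreduce(docs, map_words, reduce_counts)
print(word_counts)
```

The point of the model is that the mapper and reducer only see one document or one key at a time, so a library is free to run them in parallel on different machines.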

Brett_Tabke

6:31 pm on Mar 23, 2005 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



most excellent - thanks.

amznVibe

6:39 pm on Mar 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Also check out their oldie but goodie from 2002:
[researchchannel.org...]
Google's Linux cluster currently processes over 150 million queries a day, searching a multi-terabyte web index for every query with an average response time of less than a quarter of a second, with near-100% uptime. In this discussion, Google Fellow Urs Hölzle will describe the software and hardware infrastructure that makes this performance possible, as well as provide an overview of the main problems facing a web search, software architecture, servers and compact rack hardware designs.
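For a sense of scale, the figures in that description can be turned into rough averages. This is back-of-the-envelope arithmetic only: it assumes load is spread evenly over the day (real traffic peaks well above the average) and uses the quarter-second figure as an upper bound on average latency. The in-flight estimate applies Little's law (average concurrency = arrival rate × average time in system).

```python
# Figures quoted in the 2002 talk description.
QUERIES_PER_DAY = 150_000_000
MAX_AVG_LATENCY_S = 0.25  # "less than a quarter of a second"

SECONDS_PER_DAY = 24 * 60 * 60

# Average queries per second, assuming perfectly even load (an assumption).
avg_qps = QUERIES_PER_DAY / SECONDS_PER_DAY

# Little's law: average number of queries in flight at any moment.
avg_in_flight = avg_qps * MAX_AVG_LATENCY_S

print(f"average queries/sec: {avg_qps:.0f}")
print(f"average queries in flight (upper bound): {avg_in_flight:.0f}")
```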

For those with massive bandwidth and low latency (warning: my 300k/sec cable isn't even fast enough) you can try using their ultra high quality MPEG2 stream via the IBM "VideoCharger" player, which can be found here:
[www-306.ibm.com...]

These videos can be saved permanently using HiDownload, WMRecorder or similar - some of the slides are worthy of much closer study ;)

(for example, this video is the first time I have heard of Google "shards" [google.com], but maybe I just haven't been paying attention?)

Brett_Tabke

7:28 pm on Mar 23, 2005 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Ya, shards are not new really.

This is fun too ;-)
[google.com...]

Sharper

8:04 pm on Mar 23, 2005 (gmt 0)

10+ Year Member



Just watching the video and one of his explanations caught my ear. The Google engineer confirms that PageRank is query independent, meaning that the PageRank contribution to sorting search results doesn't take into account what terms are being searched for.

This would seem to confirm that for ranking purposes related to PR (not anchor text or other criteria), the theme of the crosslinked sites is irrelevant.
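A toy sketch of what "query independent" means in practice: the PageRank number is computed once per page and stays the same whatever is searched for, while a separate score depends on the query. Everything here is hypothetical, including the linear blend, the `pr_weight` value, the term-matching score and the sample pages. The video does not say how Google actually combines the two.

```python
def query_score(text, query_terms):
    """Query-dependent part: fraction of query terms that appear in the page text."""
    words = set(text.lower().split())
    hits = sum(1 for term in query_terms if term.lower() in words)
    return hits / len(query_terms) if query_terms else 0.0


def rank_results(pages, query_terms, pr_weight=0.5):
    """Blend a fixed per-page PageRank with a per-query score (hypothetical formula)."""
    scored = []
    for page in pages:
        combined = (pr_weight * page["pagerank"]
                    + (1 - pr_weight) * query_score(page["text"], query_terms))
        scored.append((combined, page["url"]))
    # Highest combined score first.
    return [url for _, url in sorted(scored, reverse=True)]


# Made-up pages; note that "pagerank" is never recomputed per query.
pages = [
    {"url": "a.example", "pagerank": 0.9, "text": "cooking recipes and kitchen tips"},
    {"url": "b.example", "pagerank": 0.2, "text": "linux cluster hardware guide"},
]

print(rank_results(pages, ["linux", "cluster"]))
print(rank_results(pages, ["cooking"]))
```

In this sketch the high-PR page still loses for a query it does not match, because only the query-dependent part of the score changes between searches.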

That bit is about 12-13 minutes into the show.

.... back to the video ....

itisgene

8:20 pm on Mar 23, 2005 (gmt 0)

10+ Year Member



Very interesting video.
I liked the part where he talks about query clustering. You may say it is related to "thesaurus" or even "LSI". But giving a ranking boost based on high cluster points was the most interesting part.

donpps

8:45 pm on Mar 23, 2005 (gmt 0)

10+ Year Member



Help guys

>> I am unable to view video. Tried IBM Charger thingy ..

Looks like the server crashed. Can someone sticky me the video ... if they have it?

Thanks

Don

donpps

9:43 pm on Mar 23, 2005 (gmt 0)

10+ Year Member



I guess my frustration led me to find archived or downloadable copies of this presentation.

Here goes : [norfolk.cs.washington.edu...]

Hope the "powers" don't edit out the URL

Don

Brett_Tabke

10:36 pm on Mar 23, 2005 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



thanks.

pontifex

11:19 pm on Mar 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



what I consider interesting:

WHY are they saving the higher PR shards more often? That confuses me in terms of relevancy...

If I save the high PR shards more often and the lower PR shards less often, everything comes down to PR, which is simply not the case.

Keyword in title, incoming named links, etc. surely outweigh PR very often, so why not save "often searched keyword" shards more often? Or am I thinking too SEO for that?

Or is the keyword density of an "often searched keyword" in a document influencing its PR? Surely not in the original formula!

Cheers,
Puzzler

Teshka

11:28 pm on Mar 23, 2005 (gmt 0)

10+ Year Member



Hehe, since I live in Seattle, I've seen that on the UWTV public access channel a couple times now. Slightly more interesting to me than the calculus lectures. Nice of them to put it up on the web ;)

pontifex

11:33 pm on Mar 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hehehe, I should first watch to the end! now with the clusters, it makes much more sense :-)

GoogleGuy

12:11 am on Mar 24, 2005 (gmt 0)

WebmasterWorld Senior Member googleguy is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Yah, that's a pretty good talk. :)

iblaine

1:14 am on Mar 24, 2005 (gmt 0)

10+ Year Member



GG - think you guys/girls could release the tool used to view the model of clusters? :)

Sharper

2:21 am on Mar 24, 2005 (gmt 0)

10+ Year Member



iblaine,
Shhhh....... check out the adwords keyword suggestion tool and you'll find something remarkably similar to the tool you saw in the video, just without the numbers.

I know I shouldn't give ALL the secrets away, but sometimes I can't resist. :)

Hanu

10:28 am on Mar 24, 2005 (gmt 0)

10+ Year Member



> I know I shouldn't give ALL the secrets away, but sometimes I can't resist. :)

Yeah, you should be Sharper than that! ;-) But thanks anyway, I'll have to watch it at home. Can't do it at the office. Boss isn't deaf, unfortunately.

Oops, suddenly I'm preferred? Good!

Brett_Tabke

3:09 pm on Mar 24, 2005 (gmt 0)

WebmasterWorld Administrator brett_tabke is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



> That confuses me in terms of relevancy...

But does not in terms of spider frequency.

boredguru

10:48 pm on Mar 24, 2005 (gmt 0)

10+ Year Member



> I guess my frustration led me to find archived or downloadable copies of this presentation.

> Here goes : [norfolk.cs.washington.edu...]

Thanks Don.

I spent almost an hour searching for a good streaming media downloader for my Mac. And I've got a wireless dialup which maxes out at 144 kbps.

Was getting frustrated when I saw your URL. It's currently downloading. Can't wait.

Think it's about time I visit the Mac Webmaster forum.

Hanu

10:25 pm on Mar 25, 2005 (gmt 0)

10+ Year Member



I liked the joke about the workers ... Didn't seem to work with that kind of audience, though ...

Seriously, one thing makes me wonder. Google pride themselves on how they can provide reliable service on unreliable hardware using fault tolerant software. Kudos to them, but how do the other SEs do it? Don't they need a similar kind of infrastructure? Maybe not, considering that they only get a fraction of the traffic Google gets. What would happen if, say, MSN suddenly got a huge increase in traffic? Would they just die?

amznVibe

2:44 pm on Mar 26, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



So any other little discoveries from this video?

fischermx

6:50 pm on Mar 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Does anybody have the downloadable version of this video?
It is the one from November, 2002.

[uwtv.org...]

It is impossible for me to see it in streaming mode.

amznVibe

7:11 pm on Mar 29, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here is another Google presentation from the "19th ACM Symposium" (October 2003) [ramp.ucsd.edu] which can actually be downloaded and saved (right click). Here is their little paper from that event: [labs.google.com...]
 
