
Google SEO News and Discussion Forum

    
Google "behind the scenes" video/slideshow
engineer Jeff Dean lectures at Uni. of Washington
amznVibe




msg:725415
 5:24 pm on Mar 23, 2005 (gmt 0)

This is a fascinating video with some details I have never heard elsewhere:
[uwtv.org...]
Produced by: University of Washington, October 21, 2004
Runtime: 00:55:36
Google: A Behind-the-Scenes Look
In this program, Jeff Dean of Google describes some of these challenges, discusses applications Google has developed, and highlights systems they've built, including GFS, a large-scale distributed file system, and MapReduce, a library for automatic parallelization and distribution of large-scale computation. He also shares some interesting observations derived from Google's web data.
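For anyone who hasn't met the MapReduce idea mentioned in that description, here is a minimal single-process sketch of the programming model using the usual word-count example. It is only an illustration: the function names (map_phase, reduce_phase, run_mapreduce) and the sample documents are made up for this post, and nothing here reflects Google's actual library, which distributes the same two steps across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def reduce_phase(word, counts):
    """Reduce step: combine all values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(documents):
    """Toy driver: group mapped pairs by key, then reduce each group.

    A real system would run the map and reduce calls in parallel on
    many machines; here everything runs sequentially in one process.
    """
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_phase(doc):
            grouped[key].append(value)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input documents, chosen only for illustration.
docs = ["google file system", "google search"]
counts = run_mapreduce(docs)
```

The point of the model is that the person writing map_phase and reduce_phase never has to think about parallelization; the driver (here a plain loop) is the part a library can distribute.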

 

Brett_Tabke




msg:725416
 6:31 pm on Mar 23, 2005 (gmt 0)

most excellent - thanks.

amznVibe




msg:725417
 6:39 pm on Mar 23, 2005 (gmt 0)

Also check out their oldie but goodie from 2002:
[researchchannel.org...]
Google's Linux cluster currently processes over 150 million queries a day, searching a multi-terabyte web index for every query with an average response time of less than a quarter of a second, with near-100% uptime. In this discussion, Google Fellow Urs Hölzle will describe the software and hardware infrastructure that makes this performance possible, as well as provide an overview of the main problems facing a web search, software architecture, servers and compact rack hardware designs.

For those with massive bandwidth and low latency (warning: my 300k/sec cable isn't even fast enough) you can try using their ultra high quality MPEG2 stream via the IBM "VideoCharger" player which can be found here:
[www-306.ibm.com...]

These videos can be saved permanently using HiDownload, WMRecorder or similar - some of the slides are worthy of much closer study ;)

(for example in this video it is the first time I have heard of google "shards" [google.com] but maybe I just haven't been paying attention?)

Brett_Tabke




msg:725418
 7:28 pm on Mar 23, 2005 (gmt 0)

Ya, shards are not new really.

This is fun too ;-)
[google.com...]

Sharper




msg:725419
 8:04 pm on Mar 23, 2005 (gmt 0)

Just watching the video and one of his explanations caught my ear. The Google engineer confirms that PageRank is query independent, meaning that the PageRank contribution to sorting search results doesn't take into account what terms are being searched for.

This would seem to confirm that for ranking purposes related to PR (not anchor text or other criteria), the theme of the crosslinked sites is irrelevant.
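To make "query independent" concrete, here is a minimal sketch of the textbook power-iteration PageRank from the original paper. It is an assumption-laden toy, not whatever Google runs in production: the damping factor of 0.85, the iteration count, and the four-page link graph are all chosen for illustration. The thing to notice is that the function takes only the link graph as input; no search terms appear anywhere.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    Every link target must also appear as a key in `links`.
    The only input is the link graph: there is no query parameter.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's rank evenly among its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy link graph, for illustration only.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
scores = pagerank(links)
```

Whatever gets searched for, these scores stay the same; any query-dependent signal (anchor text, title keywords and so on) would have to be combined with them in a separate step.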

That bit is about 12-13 minutes into the show.

.... back to the video ....

itisgene




msg:725420
 8:20 pm on Mar 23, 2005 (gmt 0)

Very interesting video.
I liked the part where he talks about query clustering. You may say it is related to a "thesaurus" or even "LSI". But giving a ranking boost based on high cluster points was the most interesting part.

donpps




msg:725421
 8:45 pm on Mar 23, 2005 (gmt 0)

Help guys

>> I am unable to view video. Tried IBM Charger thingy ..

Looks like the server crashed. Can someone sticky me the video ... if they have it?

Thanks

Don

donpps




msg:725422
 9:43 pm on Mar 23, 2005 (gmt 0)

I guess my frustration led me to find archived or downloadable copies of this presentation.

Here goes : [norfolk.cs.washington.edu...]

Hope the "powers" don't edit out the URL

Don

Brett_Tabke




msg:725423
 10:36 pm on Mar 23, 2005 (gmt 0)

thanks.

pontifex




msg:725424
 11:19 pm on Mar 23, 2005 (gmt 0)

what I consider interesting:

WHY are they saving the higher PR shards more often? That confuses me in terms of relevancy...

If I save the high PR shards more often and the lower PR shards less often, everything comes down to PR, which is simply not the case.

Keyword in title, incoming named links, etc. surely outweigh PR very often, so why not save "often searched keyword" shards more often? Or am I thinking too SEO for that?

Or is the keyword density of an "often searched keyword" in a document influencing its PR? Surely not in the original formula!

Cheers,
Puzzler

Teshka




msg:725425
 11:28 pm on Mar 23, 2005 (gmt 0)

Hehe, since I live in Seattle, I've seen that on the UWTV public access channel a couple times now. Slightly more interesting to me than the calculus lectures. Nice of them to put it up on the web ;)

pontifex




msg:725426
 11:33 pm on Mar 23, 2005 (gmt 0)

hehehe, I should first watch to the end! now with the clusters, it makes much more sense :-)

GoogleGuy




msg:725427
 12:11 am on Mar 24, 2005 (gmt 0)

Yah, that's a pretty good talk. :)

iblaine




msg:725428
 1:14 am on Mar 24, 2005 (gmt 0)

GG - think you guys/girls could release the tool used to view the model of clusters? :)

Sharper




msg:725429
 2:21 am on Mar 24, 2005 (gmt 0)

iblaine,
Shhhh....... check out the adwords keyword suggestion tool and you'll find something remarkably similar to the tool you saw in the video, just without the numbers.

I know I shouldn't give ALL the secrets away, but sometimes I can't resist. :)

Hanu




msg:725430
 10:28 am on Mar 24, 2005 (gmt 0)

> I know I shouldn't give ALL the secrets away, but sometimes I can't resist. :)

Yeah, you should be Sharper than that! ;-) But thanks anyway, I'll have to watch it at home. Can't do it at the office. Boss isn't deaf, unfortunately.

Oops, suddenly I'm preferred? Good!

Brett_Tabke




msg:725431
 3:09 pm on Mar 24, 2005 (gmt 0)

> That confuses me in terms of relevancy...

But does not in terms of spider frequency.

boredguru




msg:725432
 10:48 pm on Mar 24, 2005 (gmt 0)

I guess my frustration led me to find archived or downloadable copies of this presentation.

Here goes : [norfolk.cs.washington.edu...]

Thanks Don.

I spent almost an hour searching for a good streaming media downloader for my Mac. And I've got a wireless dialup which maxes out at 144kbs.

I was getting frustrated when I saw your URL. It's currently downloading. Can't wait.

Think it's about time I visit the Mac Webmaster forum.

Hanu




msg:725433
 10:25 pm on Mar 25, 2005 (gmt 0)

I liked the joke about the workers ... Didn't seem to work with that kind of audience, though ...

Seriously, one thing makes me wonder. Google pride themselves on how they can provide reliable service on unreliable hardware using fault tolerant software. Kudos to them, but how do the other SEs do it? Don't they need a similar kind of infrastructure? Maybe not, considering that they only get a fraction of the traffic Google gets. What would happen if, say, MSN suddenly got a huge increase in traffic? Would they just die?

amznVibe




msg:725434
 2:44 pm on Mar 26, 2005 (gmt 0)

So any other little discoveries from this video?

fischermx




msg:725435
 6:50 pm on Mar 29, 2005 (gmt 0)

Does anybody have the downloadable version of this video?
It is the one from November, 2002.

[uwtv.org...]

It is impossible for me to see it in streaming mode.

amznVibe




msg:725436
 7:11 pm on Mar 29, 2005 (gmt 0)

Here is another Google presentation from the "19th ACM Symposium" (October 2003) [ramp.ucsd.edu] which can actually be downloaded and saved (right click). Here is their little paper from that event: [labs.google.com...]

