Good find Markus...I hadn't seen that.
Great find Markus. As a system admin, it's very interesting!
If you like that paper, you might also enjoy
It's a partial list of different papers that Googlers have published. There are enough technical papers there to overload most SEOs, but that page is like honey to pull in great engineers. :)
It's like what I was saying in another post [webmasterworld.com]: Google is more than a search engine; it's also a web site.
That is a really interesting article that helps explain the traditional path of a query request to Google, which might still be the current one.
The question I have always had, if someone could shed some light: Up until now, were the various data center index servers all querying the same document servers?
Hey thanks Googleguy, have I been sleeping or is this a new honey pot?
Wait a second, we're being had!
This forum has high PageRank, right? And we all *know* that if you get a link from a related site that is high in PageRank, it will benefit you in the SERPs, right?
GoogleGuy, I never thought I would see you doing SEO - and in public!
Do you feel bad now or what? If I see that page riding high in the SERPs for 'money keywords', what will happen if I spam report you? lol.
|webmasterworld nick: jeremy_goodrich |
subject: GoogleGuy link dropping
message: I saw him do it - right out in the open!
trying to inflate the PR of Google.com, as if it's
not high enough!
Expect the action on the SERPs to be found shortly, lol.
Thanks for the link seriously. :) One more thing on the evening 'to do' list.
WOW GoogleGuy ... That list is awesome!
Now I am dreaming about finding a paper breaking down the list of those 100 variables :)
On a more serious note, this massively parallel computing technique using commodity Intel boxes can very well be extended beyond the web environment to other computing-intensive but stateless fields like genome mapping, crash test simulation (auto companies use expensive Cray supercomputers for this), SETI-like projects, etc.
But it's very unsuitable for big database applications like finance/payroll (this is a killer money-making area where Sun/HP/IBM servers rule!)
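To make the "stateless" point concrete, here's a minimal sketch of the fan-out-and-merge pattern, assuming a toy keyword search over a handful of documents. The names (`make_shards`, `search_shard`, `search`) and the sample data are made up for illustration, and threads on one machine merely stand in for separate commodity boxes; this is not a description of how Google actually does it.

```python
from concurrent.futures import ThreadPoolExecutor


def make_shards(docs, n_shards):
    """Split (doc_id, text) pairs round-robin into n_shards independent lists."""
    shards = [[] for _ in range(n_shards)]
    for i, doc in enumerate(docs):
        shards[i % n_shards].append(doc)
    return shards


def search_shard(shard, term):
    """Stateless worker: it needs only its own shard and the query."""
    term = term.lower()
    return [doc_id for doc_id, text in shard if term in text.lower().split()]


def search(shards, term):
    """Fan the query out to every shard, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = pool.map(search_shard, shards, [term] * len(shards))
        return sorted(doc_id for part in partials for doc_id in part)


# Hypothetical sample data, purely for illustration.
DOCS = [
    (1, "cheap commodity PCs in a cluster"),
    (2, "one expensive supercomputer"),
    (3, "a cluster of cheap boxes"),
    (4, "payroll database on a big server"),
]
SHARDS = make_shards(DOCS, 2)

print(search(SHARDS, "cluster"))  # [1, 3]
```

No worker ever needs to talk to another worker, which is why adding boxes scales so well here. A payroll run, where every update has to see one consistent set of balances, doesn't split up this cleanly.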
Thanks for the paper Markus. That'll make for some good reading.
And great new feature Googleguy, thanks!
Looks like there are good bits of info in there that were missed.
Just be sure to pass on to the guys who maintain that page to keep it updated frequently. :)
Too funny, jeremy goodrich. :) vitaplease, it's part of a relatively new honeypot, but for engineers. I'm surprised that WebmasterWorld folks didn't find it already--makes me think people are paying too much attention to SJ. ;)
The part that I personally like the most is this: this page has been up for a little while. At the same time, some article quoted another SE rep saying "Google never publishes any papers now; they squirrel away their knowledge" or something like that. The juxtaposition was a little humorous to me, at least, esp. given this page and the details in the IEEE paper. I'm not aware of any other search engines publishing papers like that lately. :)
Oh well. People knock on Google sometimes. If you just keep doing what you know is right, things seem to work out just fine. :)
Thanks GG. I have some engineer friends from your world that will definitely love this info. I've told one of them who would be a great match for Google about you hiring engineers in NY, but he already made some serious dough off an IPO so I doubt you guys would offer him anything he'd find interesting (unless it's consulting work). He's too busy enjoying his boat ;)
As for knocking Google, rest assured that if you achieve success, people will knock you down or try. Take it as a compliment. No sense letting it bother you. If it *doesn't* happen then it means you're nobody or you're doing something wrong :)
P.S. Have you ever heard of a database called "R"? I always wondered if Google ever used tools like that which are great for processing batch data like PR should be, or if it was 100% custom made.
I was once looking to try out R for a project and searched Google (a couple of years back) but couldn't find it! I had a team of people search for it and finally found it. But while speaking I just did a search on "r" (not even adding the word database) and it was the first SERP! So I guess you folks have made some progress in the intervening time! I mean, how much harder can it be to find a page than using one letter search?
That's a pretty hard search. :) I would prod your engineer friend to apply. We've gotten some top-notch engineers lately. I just found out today that we hired a really good person that I was rooting for. :)
Finally read it Markus, very nice read.
Related past article on not going for the fastest chip: Forget Moore's Law [redherring.com]
I am an engineer by profession, with a Bachelor's rather than a Master's.
Would love to one day work in Google. :)
Sorry moderator if I crossed TOS.
After all, Google is the best-known corporation in the whole world.
I doubt anyone can match your popularity worldwide.
What interested me about the papers section is the Genetic Algorithm and Artificial Intelligence approach. It's truly remarkable that Google has somebody who did some research in this area. This approach coupled with Evolutionary Algorithms becomes the next century's science called Complexity... my favourite area. To learn more about complexity visit www.santafe.edu, the institute opened by Nobel laureates.
My point is that Google has an outstanding resource pool of engineers, going by the papers alone.
I am really happy that Google has such a wide variety of people among its resources.
according to Google's SEO rules each webmaster should keep the number of links under 100 on each page. This page has over 200...
It seems like they not only are looking for great engineers but also can use a new webmaster/SEO specialist as well ;)
Anyway, great resource. This should definitely keep everybody out of the current whining-and-exiting threads about the movement on SJ and other datacenters...
>>according to Google's SEO rules each webmaster should keep the number of links under 100 on each page. This page has over 200...
Matt Cutts mentioned at Pubcon that it should probably have been 101 KB instead.
Hmmm yummy, here goes my sleeping and eating time I'm afraid.
I've seen quite a few of these papers already from the Stanford repositories though. They also publish a lot of Google-related things.
Besides John Koza and more recent papers based on his work, Google related publishings are my favourite fodder :)
Now I can compare my own search engine and database engines to see how close I got to the Google system. I'm a big fan of clustering and am still dreaming of the day I can buy a few dozen old PCs to try out my clustered GP algos :)
What do you do with the previous generation PCs? give em all to unis? Where can I apply for a "hardware grant" hehe...
Seriously, Working at google is a bit like dying and going to heaven... Nothing left after that, after all it seems the place to develop and make real ideas.
Keep up the good work, I'll get back to you when I'm done reading the papers ;)
John Koza and GPs Algo :).
If John Holland invented the Genetic Algorithm, then John Koza brought it to life at Stanford.
That's the match I've been looking for for a long time, mate.
>How close to Google System
Same here, I am trying at a smaller scale though.
Do report here about your computation time and results.
>>>Seriously, Working at google is a bit like dying and going to heaven... Nothing left after that, after all it seems the place to develop and make real ideas.
Very Well Said KillRoy. :)
I tried running some of these articles through Google's translation service, and still couldn't understand them. It sure would be handy if it could translate engineerian into plain English. ;)
Hmm yuuuummy :) I already got a gazillion ideas. I bet you can build a kick-ass automatic text categorizer using genetic programming, much more accurate than the Bayesian networks... hmm hmm hmm... now if I JUST had that mini cluster of a dozen or so PCs...
damn you Google, another sleepless night...
Hmm added word frequency analysis to my search index... thinking of adding statistical phrases, as in mitra97, but the payoff seems to be relatively small.
Thanks again for this wellspring of information.
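For anyone wondering what "adding word frequency analysis to a search index" might look like, here's a minimal sketch, assuming a plain in-memory inverted index that stores a per-document count for each word and ranks by the summed counts of the query words. The function names and sample documents are made up for illustration; this is not the poster's actual code, and it doesn't attempt the statistical phrases mentioned above.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to {doc_id: number of times the word occurs in that doc}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word, count in Counter(tokenize(text)).items():
            index[word][doc_id] = count
    return index


def rank(index, query):
    """Score each document by the summed frequencies of the query words."""
    scores = Counter()
    for word in tokenize(query):
        for doc_id, count in index.get(word, {}).items():
            scores[doc_id] += count
    # Highest score first; ties are broken by doc_id for a stable order.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, _ in ordered]


# Hypothetical sample documents, purely for illustration.
DOCS = {
    "a": "google papers papers",
    "b": "google cluster",
    "c": "papers on a cluster cluster cluster",
}
INDEX = build_index(DOCS)

print(rank(INDEX, "cluster papers"))  # ['c', 'a', 'b']
```

Raw counts favour long documents, so a real index would normally normalise them somehow; that part is left out to keep the sketch short.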
I was trying to get Latent Semantic Indexing working using a Genetic Algorithm!
I couldn't get beyond a certain time-consuming process. The only hitch with GPs seems to be TIME!
>>I bet you can build a kick-ass automatic text categorizer using genetic programming, much more accurate than the Bayesian networks.
Perhaps the best search engine, one that evolves on its own with the web, can be built using GA. That's like an Adaptive Search Engine for me.
And what better way to build it than with distributed systems. I've run my own tests with multithreaded colonies of populations and immigration caches. This can be very coarse grained, and loosely synchronised (or not at all), leading to near 100% efficiency and near zero network lag.
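A minimal sketch of the "colonies plus immigration" idea, assuming a toy OneMax problem (maximise the number of 1 bits) and a ring where each island's best individual periodically replaces the next island's worst. The islands run one after another here to keep the example deterministic, whereas the setup described above is multithreaded; all names and parameters are made up for illustration.

```python
import random

GENES = 32  # length of each bit-string individual (arbitrary choice)


def fitness(bits):
    """OneMax: the more 1 bits, the fitter the individual."""
    return sum(bits)


def evolve(pop, rng):
    """One generation: keep the best individual, fill up with mutated crossovers."""
    pop = sorted(pop, key=fitness, reverse=True)
    parents = pop[: len(pop) // 2]
    children = [pop[0]]  # elitism, so an island's best never gets worse
    while len(children) < len(pop):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, GENES)
        child = a[:cut] + b[cut:]  # one-point crossover
        i = rng.randrange(GENES)
        child = child[:i] + [1 - child[i]] + child[i + 1:]  # flip one bit
        children.append(child)
    return children


def migrate(islands):
    """Ring migration: each island's best replaces the next island's worst."""
    bests = [max(pop, key=fitness) for pop in islands]
    for k, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        pop[worst] = bests[k - 1]  # k - 1 wraps around to the last island


def run(n_islands=4, pop_size=20, generations=60, migrate_every=10, seed=1):
    """Return (best fitness at the start, best fitness at the end)."""
    rng = random.Random(seed)
    islands = [
        [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(pop_size)]
        for _ in range(n_islands)
    ]
    start = max(fitness(ind) for pop in islands for ind in pop)
    for gen in range(1, generations + 1):
        # Islands only interact at migration time, so this loop is the part
        # that could be handed to separate threads or machines.
        islands = [evolve(pop, rng) for pop in islands]
        if gen % migrate_every == 0:
            migrate(islands)
    best = max(fitness(ind) for pop in islands for ind in pop)
    return start, best


print(run())
```

Because islands exchange only a single individual every `migrate_every` generations, the traffic between them is tiny compared to the work done locally, which fits the "coarse grained, loosely synchronised" description.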