Forum Moderators: open
I'm guessing that poor search times are non-repeatable (since common results must get cached), but it'd be interesting to hear what really gets those pigeons pecking.
I would, however, be curious about what types of searches take longest for the servers to process, but I don't think the timing number on the SERPs reflects that very accurately.
If you go into the image search, you can hit 3-4 seconds regularly during peak hours. I just searched for "widgets.gif [images.google.com]" in the image search with filtering off and it took 12 seconds to tell me it didn't find anything (repeated several times with nearly the same results, and a minute later it was instantaneous).
Complicated Usenet searches can also take 4-5 seconds during the day, too.
They use clusters of index servers to distribute each query across groups of smaller machines that hold the actual data.
With over 2.8 billion web pages in its index, most queries are returned in under a second, which, even with unlimited resources, is an extremely impressive achievement, and it has played a major role in Google's success.
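To make the distribution idea concrete, here is a toy scatter/gather sketch in Python. This is my own illustration, not Google's actual code: the shard contents, scores, and doc IDs are all made up, but the pattern (fan a query out to every index shard in parallel, then merge the partial results) is the one described above.

```python
# Toy sketch of scatter/gather querying across index shards.
# Shard contents and scores are invented for illustration only.
from concurrent.futures import ThreadPoolExecutor

# Each "shard" is just a dict mapping a term to (doc_id, score) hits.
SHARDS = [
    {"widgets": [("doc1", 0.9), ("doc4", 0.3)]},
    {"widgets": [("doc7", 0.7)], "gears": [("doc2", 0.8)]},
    {"gears": [("doc5", 0.6)]},
]

def search_shard(shard, term):
    """Return this shard's hits for the term (empty list if none)."""
    return shard.get(term, [])

def search(term):
    """Scatter the query to every shard in parallel, then merge by score."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(lambda s: search_shard(s, term), SHARDS)
    merged = [hit for part in partials for hit in part]
    return sorted(merged, key=lambda hit: hit[1], reverse=True)

print(search("widgets"))  # [('doc1', 0.9), ('doc7', 0.7), ('doc4', 0.3)]
```

The point is that no single machine ever scans the whole 2.8-billion-page index; each shard answers for its own slice, so query latency is set by the slowest shard, not by the total data size.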
Clusters are not all that impressive any more - fairly routine tech these days. The glue that keeps the whole thing together is interesting, though. Just once, I'd like to hear a tech give us a "tell all" about the whole system. We've heard the macro picture from the user's perspective at the site, and we've heard some minutiae about data centers and actual hardware, but we've never heard a good overview of the whole system and how it glues together.
If we get down to the box, the part I'd like to hear more about is the mechanics of the data storage and retrieval end. Google has talked in one interview about their proprietary full disk spanning file system (random/relative access). I believe it was with Larry or might have been Craig. That interview is no longer on the web (or is it?). That would be fascinating to hear more about.
sorry to hijack your thread slud
About clusters: there are a few good talks floating around the web about architectures for distributed computing. Of course, it never hurts to have 10,000 computers to run your code. :)
I don’t work for Google; I am just a big fan of technology.
Also, having gone to the rival university of Google's founders makes me even more curious :)
I actually just left the grind of corporate America, and if I make any $$, I will be extremely happy. I worked for a multi-billion $$ financial institution and was sick of my dense, fancy-degreed bosses :). I am now using the internet as a medium for a new business.
Back to clustering: it has been around since the olden days of computing, but IMHO in the last couple of years the performance of these computing clusters has gone through the roof.
Having also worked for a major online corp with over 20 million unique records containing text fields, I know how challenging delivering sub-second responses to full-text queries is, but I also know it is an absolute necessity. Sites like Google, along with WW and all the major players on the net, have set the bar.
And IMO, any site wishing to be a major player, must return sub-second responses.
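The standard data structure behind those sub-second full-text responses is an inverted index: you pay the cost of tokenizing every record once at index time, and then a query is just a lookup and intersection of posting lists instead of a scan over all 20 million records. A minimal Python sketch (the records and field names are invented for illustration):

```python
# Toy inverted index: term -> set of record IDs. Records are made up.
from collections import defaultdict

records = {
    1: "cheap red widgets",
    2: "blue widgets on sale",
    3: "red gears and sprockets",
}

# Build the index once, up front.
index = defaultdict(set)
for rec_id, text in records.items():
    for term in text.lower().split():
        index[term].add(rec_id)

def query(*terms):
    """AND-query: intersect the posting list of each term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return sorted(set.intersection(*postings)) if postings else []

print(query("red", "widgets"))  # [1]
```

Real engines add ranking, stemming, and compressed posting lists on top, but the lookup-instead-of-scan idea is why response time stays flat as the record count grows.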
Thanks for the kind responses, guys,
Paul