Forum Moderators: bakedjake
SAN FRANCISCO — In her two years at Google, Anna Patterson helped design and build some of the pillars of the company’s search engine, including its large index of Web pages and some of the formulas it uses for ranking search results.
The makers of the Cuil search engine say it should provide better results and show them in a more attractive manner.
Now, along with her husband, Tom Costello, and a few other Google alumni, she is trying to upstage her former employer.
On Monday, their company, Cuil, is unveiling a search engine that they promise will be more comprehensive than Google’s and that they hope will give its users more relevant results.
[nytimes.com...]
A search engine being the ultimate high load web application, to launch without the ability to handle load is the ultimate face plant.
My guess is that it's not a lack of available hardware power either, instead... inefficient code.
Talk about wasting a golden opportunity, I hope they bounce back.
We’ll be back soon... Due to overwhelming interest, our Cuil servers are running a bit hot right now. The search engine is momentarily unavailable as we add more capacity.
Thanks for your patience.
Is running a bit hot code for "crashed"?
[edited by: incrediBILL at 7:16 pm (utc) on July 28, 2008]
Interesting how much attention this is getting here. I can hardly get a response out before someone else posts ahead of me!
This is one of the most vigorously reported web service launches in recent memory, so now it's up to Cuil to very quickly live up to the hype, or die the death of a thousand other startups.
.....................
As much respect as we have for Google, most people also feel that it's not healthy for one SE to so thoroughly dominate
Maybe, maybe not. But is Google's dominant share of certain markets (such as U.S. Web search) so much different than the dominance of Books in Print, Westlaw, or the Red Books of advertising agencies and advertisers in their fields? From a user's perspective, is there a need--or even an advantage--to having yet another choice in search engines? Users had all kinds of choices back in the 1990s (remember Excite? Hotbot? Infoseek? Webcrawler?) but ultimately most of them drifted away to Google.
Then it veers off course and is detonated in a fireball, with the smoke spelling out F.A.I.L.
That's sort of what we have here. I was really hoping that we would have something that would compete with Google, but this puppy ain't it.
Confirmation came in a rather friendly ComputerWorld article that featured a link to Anna Patterson's page in the Cuil Management section of cuil.com.
Page Load Error.
Game over, man.
Then..
No results because of high load... Due to excessive load, our servers didn't return results. Please try your search again.
Search? You want search? You can't handle the SEARCH!
Sorry, bad movie reference.
Anyway, I don't see any Google killer here unless by kill you mean "die laughing".
How much was spent on the hardware?
Google has spent some USD 3 Billion in the last 18 months to 2 years on new datacentres... and for that sort of money has put some 30 to 40 server clusters online (a cluster being some one to two thousand servers), almost doubling their capacity.
Obviously Google has myriad other services (mail, maps, ads, etc.) running on those servers, so only a small part of Google's infrastructure is involved in search. However, any startup investing way less than whatever Google is investing in servers dedicated to search is going to FAIL.
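For a rough sense of scale, the figures quoted above can be turned into a back-of-envelope range. This is only a sketch using the approximate numbers from the post (about USD 3 billion, 30 to 40 clusters, one to two thousand servers each). The spend covers whole datacentres (buildings, power, network), so the per-server figure is an all-in number, not a hardware price.

```python
# Approximate figures quoted in the post above; none of these are official numbers.
spend_usd = 3_000_000_000
clusters = (30, 40)                    # low and high estimate of new clusters
servers_per_cluster = (1_000, 2_000)   # low and high estimate per cluster

fewest_servers = clusters[0] * servers_per_cluster[0]
most_servers = clusters[1] * servers_per_cluster[1]

# All-in datacentre spend divided by server count (not a hardware price).
low_per_server = spend_usd / most_servers
high_per_server = spend_usd / fewest_servers

print(f"servers added: {fewest_servers:,} to {most_servers:,}")
print(f"spend per server: ${low_per_server:,.0f} to ${high_per_server:,.0f}")
```

On those assumptions the new capacity works out to somewhere between 30,000 and 80,000 servers, at roughly $37,500 to $100,000 of total spend per server.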
[edited by: g1smd at 7:51 pm (utc) on July 28, 2008]
From a user's perspective, is there a need--or even an advantage--to having yet another choice in search engines?
Four (or even 5) strong search services spread the risk out, and thus to my way of thinking, is all around better. I'm pulling for Cuil to make a dent but as I've posted previously, it's not enough to just talk the talk -- more importantly, ya' gotta' walk the walk.
..............................
I suspect that, if Cuil has a future, it will be with a major brand like Microsoft that's willing to pay big bucks for some ex-Googlers.
I'm also seeing other people's graphics next to my site, and I'm finding the search awfully slow.
I had no idea Twiceler was related to Cuil. I could have sworn I banned it on some sites. I'm not convinced it's time to dig through my .htaccess files and unban just yet.
-functional but overloaded at launch. But the overload should be expected and will likely pass quickly as the novelty wears off. In the meantime it's being nicely handled by load shedding without bringing down the entire site.
-results are about on par with say Gigablast, usable but not up to today's relevancy standards.
-multiple results from the same site are a big problem.
-the spammers have had a field day feeding scraped cloaked content to Twiceler. Many obviously scraped snippets link to pages with entirely different bait-and-switch content. A good percentage of these pages are attempting to install viruses.
-misses the trees in favor of the groves. My take on Cuil's algorithm is that result sets are being ranked by how often pages contain terms that occur in the set at a significantly higher rate than on the general web. For example, search for a city name, and a high percentage of pages including the city name will also include the word "hotel", so Cuil decides that pages relevant for the city search should also include hotel.
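To make that last guess concrete, here is a minimal sketch of the kind of ranking I am describing. It is purely my speculation about the behaviour, not Cuil's actual code. The function names (`term_lift`, `rerank`), the 50% threshold, the smoothing, and the toy documents are all made up for illustration.

```python
from collections import Counter

def doc_rates(docs):
    """Share of documents that contain each term at least once."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc.lower().split()))
    return {term: n / len(docs) for term, n in counts.items()}

def term_lift(result_docs, background_docs, min_result_rate=0.5):
    """Terms common in the result set, with how over-represented they are
    compared with a background sample of the general web (hypothetical)."""
    result = doc_rates(result_docs)
    background = doc_rates(background_docs)
    floor = 1 / (len(background_docs) + 1)  # avoids dividing by zero
    return {
        term: rate / max(background.get(term, 0.0), floor)
        for term, rate in result.items()
        if rate >= min_result_rate
    }

def rerank(result_docs, lifts, query_terms):
    """Order results by the summed lift of over-represented terms they contain,
    ignoring the query terms themselves."""
    def score(doc):
        words = set(doc.lower().split())
        return sum(l for t, l in lifts.items() if t in words and t not in query_terms)
    return sorted(result_docs, key=score, reverse=True)

# Toy data: a city-name search where most matching pages are hotel pages.
results = [
    "springfield hotel deals downtown",
    "cheap hotel rooms springfield",
    "springfield hotel booking reviews",
    "springfield city council official site",
]
general_web = [
    "hotel chains report earnings",
    "city council votes on budget",
    "recipe for apple pie",
    "football scores and reviews",
    "weather forecast for the weekend",
    "official site of the museum",
    "cheap flights and car rental",
    "downtown parking rules",
]

lifts = term_lift(results, general_web)
ranked = rerank(results, lifts, query_terms={"springfield"})
for page in ranked:
    print(page)
```

In this toy set "hotel" appears in three of the four results but only one in eight background pages, so every hotel page outranks the one page that is actually about the city itself. That is the same pattern I describe with the client sites below.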
One of my clients has a business site branded with his name. His business is unique in that it is a small field and at least 90% of his competition in the same activity are larger umbrella businesses. The client is also active in a couple of community activities which generate a large amount of web chatter. Over the past year Twiceler has been the second most active spider on his site.
Searching Cuil for the client's name yields all pages concerning the community activities he is involved in or other community activities which he isn't involved with but for whatever reason his name was mentioned. Searching for his business activity yields all larger umbrella businesses which include his activity. Searching for his name and business combined yields all third party mentions of his business mostly in the umbrella context, and scraped cloaked bait-and-switch pages, many of which tried to install viruses. (I guess the scraped sites had more in common with each other than with his site.) Any of these searches on Google, Yahoo, or Ask returns the client's site on the first page. (Completely invisible in MSN though.)
Another client is a performer who is known by a one-word name and has been rock solid number one for that name on Google, Yahoo, MSN, and Ask for at least the past four years. But search for her name on Cuil and the results are a hodgepodge of sites with her site eventually turning up on page five.
In her art form there are four main sub-disciplines: the mainstream, which is what most people would think of at first mention, and three which are always named with a leading adjective. My client is in the mainstream, which only ever gets an adjective when it is being compared to the other three. One of the sub-disciplines is a communal form which generates a lot of participatory web content in the form of forums, etc.
Search Cuil for this art form and the results are all relevant to the communal discipline with Explore by Category links to the other two adjective disciplines. The mainstream discipline is invisible except for umbrella sites covering all the sub-disciplines.
Considering these observations and others reported in this thread, I think it's obvious that Cuil is determining relevance within result sets by using commonality within the sets to attempt to filter out the irrelevant clutter. The problem is that while filtering out irrelevant results they are also losing the most relevant unique results.
It is disconcerting to see one's registered trademark seemingly randomly associated with unrelated sites.
Relevance in the niche I operate in is poor to moderate at best - across any number of terms.
Call me a person of habit, but I find the results page more than a bit disconcerting. If the results are going to be laid out in two or three columns, I would prefer that the rows be aligned. Too busy the way it is right now.