Forum Library, Charter, Moderators: martinibuster

Yahoo Search Engine and Directory Forum

This 43 message thread spans 2 pages.
Sergey Brin Says Yahoo's Index Size Claim is Inflated
Yahoo Stands by their Statement of Being 19.2 Billion Strong
walkman




msg:833055
 1:43 am on Aug 15, 2005 (gmt 0)

No surprise here: "Sergey Brin, Google's co-founder, suggested that the Yahoo index was inflated with duplicate entries in such a way as to cut its effectiveness despite its large size."

but: "On Sunday, researchers at the National Center for Supercomputer Applications attempted to shed light on the debate by performing a large number of random searches on both indices. They ran a random sample of 10,012 queries and concluded that Google, on average, returned 166.9 percent more results than Yahoo. In only three percent of the cases did the Yahoo searches return more queries than Google. The group said the Yahoo index claim was suspicious."

[nytimes.com...]

 

Import Export




msg:833056
 6:21 am on Aug 15, 2005 (gmt 0)


Ouch...

zeus




msg:833057
 11:30 am on Aug 15, 2005 (gmt 0)

Hmm, I think Sergey Brin should take a look at their own SERPs before saying that. Google indexes old cache from mid 2004, domain20%.com and a lot of 404 pages are listed.

Try site:f in Google or any other letter and you will find a lot of 404s listed (a little tip for Yahoo's response)

twebdonny




msg:833058
 1:00 pm on Aug 15, 2005 (gmt 0)

Inflated is correct. Just look at Yahoo's own travel pages and see how many dupes are there.

trillianjedi




msg:833059
 1:13 pm on Aug 15, 2005 (gmt 0)

Handbags at the ready, gentlemen.

econman




msg:833060
 1:14 pm on Aug 15, 2005 (gmt 0)

I'm not sure a count of the "results" provided in response to a query is a valid indicator of index size or scope, since there are other factors involved in deciding whether or not a particular document is considered a "match" for a specific query.

The NY Times article includes a quote from someone who has seen indications that, at least with respect to French language queries, Y!'s index has grown substantially, as well as some skepticism that the size claims can be independently verified:

...Other search engine specialists remained skeptical about the ability to estimate Web or index size as long as the search engines were being secretive about their methods. "I don't have any good way of checking,"

Philosopher




msg:833061
 2:15 pm on Aug 15, 2005 (gmt 0)

Yeah, going by the number of results the engines say they find is not the best measure. G has long stated this is an approximation. In addition, G's initial # is almost always heavily inflated: most times I run a search, the initial "results returned" number changes significantly (downward) as I click through into deeper results.

I don't run enough searches on a regular basis to know if this is true at Yahoo as well though.

woop01




msg:833062
 2:21 pm on Aug 15, 2005 (gmt 0)

Some woman needs to tell these guys the size isn't as important as what you do with it.

walkman




msg:833063
 4:05 pm on Aug 15, 2005 (gmt 0)

>> Some woman needs to tell these guys the size isn't as important as what you do with it.

that's what they say not to hurt our feelings ;), but in search engines it's actually true, especially considering how my 1200 page site shows 25,000+ pages on Google.

How come? Well, the print, save, send, and each outbound link (redirect script) is a "page". I'm afraid to look at Yahoo, I might have a 50,000 page website.
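
To make the point above concrete, here is a minimal sketch of how print/save/send views and redirect links can multiply a URL count without adding real pages. The parameter names (`print`, `save`, `send`) and the `/out` redirect path are made up for illustration; they are not taken from any actual site or from how either engine counts.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical query parameters for print/save/send views (assumption).
ACTION_PARAMS = {"print", "save", "send"}
# Hypothetical path of an outbound redirect script (assumption).
REDIRECT_PATH = "/out"

def canonical(url):
    """Return the content page a URL belongs to, or None for a redirect link."""
    parts = urlsplit(url)
    if parts.path == REDIRECT_PATH:
        return None
    # Drop the action parameters, keep everything else in its original order.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in ACTION_PARAMS]
    query = "&".join(f"{k}={v}" for k, v in kept)
    return parts.path + ("?" + query if query else "")

crawled = [
    "http://example.com/article?id=7",
    "http://example.com/article?id=7&print=1",
    "http://example.com/article?id=7&save=1",
    "http://example.com/article?id=7&send=1",
    "http://example.com/out?to=http://other.example/",
]

# Collapse variants onto their content page and ignore redirect links.
pages = {c for c in map(canonical, crawled) if c is not None}
print(len(crawled), "crawled URLs ->", len(pages), "real page(s)")
```

In this toy sample five crawled URLs collapse to a single content page, which is the same kind of multiplier described above, just on a smaller scale.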

engine




msg:833064
 4:37 pm on Aug 15, 2005 (gmt 0)

The numbers game is exactly that, a game. The most important questions are not about the quantity, but are about the quality of data stored, and the ability to deliver quality serps.

Webwork




msg:833065
 4:47 pm on Aug 15, 2005 (gmt 0)

With all due respect to the intelligence of the NYTimes and its editorial policy, the fact that the kids are big - even corporate titans - still doesn't transform this "news event", IMHO, into a thing worthy of greater punditry or analysis, anywhere - including here.

Forgive my foo-like commentary but, for me, the analysis of this non-event is that it is foo, as in foo-lishness. I think engine's comment is spot on.

mack




msg:833066
 5:09 pm on Aug 15, 2005 (gmt 0)

I can see this as round 1..

1. Your index is overinflated. (G)
2. Yea but pagerank is corrupt. (Y!)
3. Your logo's terrible (G)
4. Googlebot takes performance enhancing substances (Y!)
5. Your directory is a Dmoz clone (G)
6. Actually that's you (Y!)
7. Umm oh yea.... but you're fat (G)
8. Yea and you dropped out (Y!)

OK, maybe not. We can but wish.

Mack.

martinibuster




msg:833067
 5:11 pm on Aug 15, 2005 (gmt 0)

While there is a foo quality to Sergey getting involved in saying something like this, I think that in the interest of getting to the truth of the matter, it's important to not stoop to the same kind of foo-ness.

It's easy to meow and hiss like a cat and miss the important things going on.

1: Does anyone else think it's a PR move for Sergey to step into this fight?

2: Has anyone else noticed thousands of extra pages indexed lately without any noticeable blip in ranking?

3: If these pages exist, where are they? Forums? Phantoms? non-English pages?

Rollo




msg:833068
 5:20 pm on Aug 15, 2005 (gmt 0)

It's all about good press. If Yahoo is abuzz with good press and can entrench a reputation for being the biggest, it will lead to more Yahoo searches almost certainly. I guess Google decided it needed to kill it before it grows.

I suspect both Y and G are exaggerating as both obviously contain loads of junk and innumerable sites that you'd never be able to find in a search if your life depended on it.

encyclo




msg:833069
 5:24 pm on Aug 15, 2005 (gmt 0)

Whilst it isn't surprising that Sergey said this (all PR spin, of course), index size does matter to a non-technical audience who equate sheer volume with being more likely to find what they are searching for. When the news stories started a few days back, my wife (who knows nothing about search) mentioned that she'd "read that Yahoo is better than Google now".

It's not for nothing that Google, even with their home page being so devoid of anything other than the bare essentials, still find it important to have the tag "Searching 8,168,684,336 web pages".

This has very little to do with facts, and a lot to do with simple perception.

vordmeister




msg:833070
 6:40 pm on Aug 15, 2005 (gmt 0)

I've just done a site search on one of my single page sites. Yahoo comes up with 2 pages!

One of them is the CSS file. Way to go. I'm going to try adding a robots.txt and maybe a JavaScript file to see if I can't help Yahoo past the 20 billion barrier.

EDIT> To be fair, I've checked some other sites and Yahoo seem more accurate on page count than the big G.

texasville




msg:833071
 6:56 pm on Aug 15, 2005 (gmt 0)

I think the fact that Sergey made a comment means Google just blinked...

encyclo




msg:833072
 7:31 pm on Aug 15, 2005 (gmt 0)

From Slashdot [slashdot.org] - A Comparison of the Size of the Yahoo! and Google Indices by the National Center for Supercomputing Applications (NCSA):

[vburton.ncsa.uiuc.edu...]

(...) we found that on average Yahoo! only returns 37.4% of the results that Google does and, in many cases, returns significantly less. As our search results indicate, there are a number of cases in which Google returns dozens of results while Yahoo! only returns one or two results, or none at all.

The one caveat is that the research is based on searches with <1000 results due to the fact that neither SE shows more than 1000 ranked pages.
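
As a rough cross-check, the 37.4% figure here and the "166.9 percent more" figure quoted from the NY Times earlier in the thread look like the same finding stated two ways. A quick sketch of the arithmetic, assuming "166.9 percent more" means Google's count equals Yahoo's count times (1 + 1.669); how the study actually averaged its per-query results may differ, so treat this as a sanity check only:

```python
# Figure quoted from the NY Times article earlier in the thread.
google_excess_pct = 166.9

# Assumption: "X percent more" means google = yahoo * (1 + X / 100).
google_multiple = 1 + google_excess_pct / 100   # Google results per Yahoo result
yahoo_share_pct = 100 / google_multiple         # Yahoo results as a % of Google's

print(f"Google returns {google_multiple:.3f}x Yahoo's results")
print(f"Yahoo returns {yahoo_share_pct:.1f}% of Google's results")
```

Under that assumption Yahoo's share works out to roughly 37.5%, within a tenth of a point of the 37.4% the study reports.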

maherphil




msg:833073
 8:02 pm on Aug 15, 2005 (gmt 0)

Is Google out of its mind? First they blow up over a CNET article and now they claim that theirs is bigger than Yahoo's... it's grade-school playground antics!

We Google shareholders are not happy about these emotional outbursts. Google should get back to business and fix all the problems the users see with the engine.

diamondgrl




msg:833074
 8:24 pm on Aug 15, 2005 (gmt 0)

The CNET article flap was grade-school, but this strikes to the core of their business. And as far as I can tell from my own work, Google is right. And we have an independent study now to suggest that Google was right on target.

walkman




msg:833075
 8:36 pm on Aug 15, 2005 (gmt 0)

>> Is Google out of its mind? First they blow up over a CNET article and now they claim that theirs is bigger than Yahoo's

Not too long ago they had the greatest PR team. Did they all get fired, or can they just not control their bosses anymore? Now the most popular techie site keeps mentioning in every article how Google doesn't talk to them because of a bad article (example [news.com.com]), privacy worries get more play, and now Sergey gets caught making childish comments.

Google could've just used one of its VPs or engineers to say the same thing; no need to get the top guy involved. You don't see the POTUS denying or commenting on every story. That's why you have aides, employees and anon comments.

Has anyone ever read a comment where GoogleGuy got into these little wars, despite many here saying Yahoo this and MSN that? Nope!

[edited by: martinibuster at 12:15 am (utc) on Aug. 16, 2005]
[edit reason] Fixed url [/edit]

kevinpate




msg:833076
 8:58 pm on Aug 15, 2005 (gmt 0)

arrrrgh,
tis truly pointless landlubber speak for the captains to fire shots across the bow, fret over the depth of the ocean, the size of yer ship or the height of their respective masts. True Sea Dawgs would spend more time and be harder at work to be shed of the constant attack of them barnacles clinging at the hull.
arrrrrgh

Eva_Geddes




msg:833077
 9:08 pm on Aug 15, 2005 (gmt 0)

Why would Brin say something that is so obviously aimed at provoking them? What's his deal?

Further, do you think Eric Schmidt consulted anyone before opening his big mouth about CNET? That entire situation... makes everyone look bad. Guess he doesn't care, provided they don't take his airplane away.

Has Google jumped the shark, so to speak? Pfft.

TomWaits




msg:833078
 9:40 pm on Aug 15, 2005 (gmt 0)

Has anyone ever read a comment where GoogleGuy got into these little wars, despite many here saying Yahoo this and MSN that? Nope!

You have to be kidding me.

alexweb




msg:833079
 9:54 pm on Aug 15, 2005 (gmt 0)

They each only show the first thousand results, so what's the big deal? As for Google, what's the point of indexing sites if they don't rank them (sandbox)? Bragging rights I guess.

stuntdubl




msg:833080
 10:24 pm on Aug 15, 2005 (gmt 0)

I suspect both Y and G are exaggerating as both obviously contain loads of junk

I heard a joke once at a conference that about 10 webmasters account for 3/4 of the index.

Anyone see the Pam Anderson Roast?

I think Tommy Lee is a liar too.

Put 'em back in your pants guys.

twist




msg:833081
 1:23 am on Aug 16, 2005 (gmt 0)

I think Tommy Lee is a liar too.

(I don't know what roast you're talking about, but after watching a certain video, Tommy Lee would shame both Google and Yahoo combined.)

As for the index size, Yahoo has ~1,100 of my non-duplicate pages indexed, MSN ~800, and Google 17. So as far as I can tell Yahoo has a 64x larger index than Google in my little corner of the world. No css either.

b0rdslide




msg:833082
 9:11 am on Aug 16, 2005 (gmt 0)

Using the site: searches it definitely seems as though Yahoo's index is the larger when I compare one of my own sites...

Yahoo: 319,000 pages
Google: 7,480 pages
MSN: 336 pages

Saying that I have seen 3x more spider activity from google this month compared to the inktomi spider. If it continues at the same pace then google would have spidered almost 20x the average monthly amount. Maybe they're gearing up for their own increased index?

subway




msg:833083
 10:21 am on Aug 16, 2005 (gmt 0)

Some woman needs to tell these guys the size isn't as important as what you do with it

That's true, it seems that everyone at the "plex" is suffering from "index envy". When did the battle switch to quantity and not quality? Who's interested in how many bull**** nonsense pages are indexed anyway?

AhmedF




msg:833084
 11:23 am on Aug 16, 2005 (gmt 0)

especially considering how my 1200 page site shows 25,000+ pages on Google.

It is amazing how inflated it is. I had a 300,000 page website that was up to 2.6 million pages in Google. It's down to roughly 1 million now. Yahoo shows a more accurate 250,000.

