| This 43 message thread spans 2 pages: < < 43 ( 1  ) || |
|Sergey Brin Says Yahoo's Index Size Claim is Inflated|
Yahoo Stands by their Statement of Being 19.2 Billion Strong
| 1:43 am on Aug 15, 2005 (gmt 0)|
No surprise here: "Sergey Brin, Google's co-founder, suggested that the Yahoo index was inflated with duplicate entries in such a way as to cut its effectiveness despite its large size."
but: "On Sunday, researchers at the National Center for Supercomputer Applications attempted to shed light on the debate by performing a large number of random searches on both indices. They ran a random sample of 10,012 queries and concluded that Google, on average, returned 166.9 percent more results than Yahoo. In only three percent of the cases did the Yahoo searches return more queries than Google. The group said the Yahoo index claim was suspicious."
| 6:45 pm on Aug 16, 2005 (gmt 0)|
Part of the problem with Yahoo is that it lists sites with multiple domains multiple times.
If the site has 1,000 pages, and Yahoo lists all 3 domain names pointed to the site, that is 3,000 pages instead of 1,000. I have seen that. In that particular situation, there was only one domain listed in Google, rightfully so. It is a glitch that Yahoo has not worked through yet.
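To make that arithmetic concrete, here is a minimal sketch of how counting by raw URL triples the total while counting by canonical host does not. The domain names and the `ALIASES` map are hypothetical, and this is only an illustration of the effect, not a description of how either engine actually deduplicates.

```python
from urllib.parse import urlsplit

# Hypothetical alias map: every domain here points at the same site.
ALIASES = {
    "example.com": "example.com",
    "example.net": "example.com",
    "example.org": "example.com",
}

def canonical(url: str) -> str:
    """Collapse alias domains onto one canonical host, keeping the path."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    return f"{ALIASES.get(host, host)}{parts.path}"

# 1,000 pages, each reachable under all 3 domains.
urls = [f"http://{d}/page{i}.html" for d in ALIASES for i in range(1000)]

naive_count = len(set(urls))                       # every domain counted separately
deduped_count = len({canonical(u) for u in urls})  # each page counted once

print(naive_count, deduped_count)
```

With three aliases the naive count is exactly three times the deduplicated one, which is the 3,000 versus 1,000 gap described above.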
Sergey is correct.
| 8:39 pm on Aug 16, 2005 (gmt 0)|
Did Google downgrade Yahoo's PR to 9 in response?
| 8:44 pm on Aug 16, 2005 (gmt 0)|
That's kind of funny, Walkman. I have seen the same thing on a personal site of mine and on my employer's.
My site has some 25 - 30k pages at most; Google has it at 32ish and Yahoo has it at 66000, making me think the real number is a little over 30k pages.
The other site is about the same size or a bit bigger; Google has it at 41000 and Yahoo is actually down from 136000 to 113000, or almost 3 times as many.
If more people are seeing this, it would explain why Yahoo thinks it is so big: it can't seem to count correctly.
| 9:04 pm on Aug 16, 2005 (gmt 0)|
IMO, G was trying to "clean" the web and keep it for sites they approve of.
They ended up decreasing their index by kicking many innocent webmasters out.
At the same time Y somehow got a sense of their plan. Instead of them cleaning their index and feeling the wrath of webmasters, they decided to index more.
Right when G tried to kick small publishers out, Y announced its index and hit G right where it hurts most. And now, Gbot is hitting crazily all over the web.
G can say all they want that Y's index claims are inflated. But then the same can be said for G, whenever they reported. What good is the index size when the web is limited to someone's useless bookmarks?
Today many people are searching for a G substitute.
| 10:42 pm on Aug 16, 2005 (gmt 0)|
So what happens after the furious gbot activity if G's index comes in at, say, 13 B? And if G antes up and over, can Y! be far behind?
All this will equate to is more spam, dupes and thin docs in the respective indexes. I blame G as they painted the big target on their homepage and continue to make a big deal over the 'inflated 8.1B'
You reap what you sow...
| 2:18 am on Aug 17, 2005 (gmt 0)|
I think Sergey might be right. One of my sites for a while had an add-on domain and it appeared twice in the Yahoo SERPs: first as a directory within my main domain, and second as its own domain. I love Yahoo though. They have voice chat, and Google doesn't.
| 1:03 am on Aug 22, 2005 (gmt 0)|
I decided to test some queries after reading <Jeremy Zawodny's Blog Article, "Of Course Size Matters!">
Not only is Yahoo's index bigger; IMO, it will take Google quite some time to catch up.
Here are some random searches and number of results found.
[Shoping] Y 1,540,000,000 G 318,000,000
[hosting] Y 631,000,000 G 123,000,000
[network] Y 1,130,000,000 G 589,000,000
[social] Y 673,000,000 G 307,000,000
[Wedding] Y 251,000,000 G 42,700,000
[antidisestablishmentarianism] Y 80,000 G 36,100
["do no evil"] Y 275,000 G 82,400
[yahoo] Y 869,000,000 G 177,000,000
[google] Y 480,000,000 G 228,000,000
[microsoft] Y 562,000,000 G 258,000,000
[Spam] Y 210,000,000 G 94,500,000
[USPTO] Y 4,300,000 G 1,380,000
["larry page"] Y 735,000 G 175,000
["will smith"] Y 11,500,000 G 1,870,000
[NGCSU] Y 345,000 G 42,200
[UCLA] Y 30,100,000 G 21,900,000
[Stanford] Y 61,300,000 G 73,800,000
["google nightmare"] Y 1,330 G 250
["yahoo nightmare"] Y 132,000 G 6,710
["jeremy zawodny"] Y 1,760,000 G 699,000
[eugooglizer] Y 74 G 40
["george w. bush"] Y 50,200,000 G 18,000,000
["video games"] Y 248,000,000 G
[to be or not to be] Y 4,840,000,000 G 1,370,000,000
[movies] Y 957,000,000 G 152,000,000
[linkshare] Y 1,640,000 G 989,000
[eminem] Y 46,900,000 G 4,880,000
[Toyota] Y 101,000,000 G 12,900,000
[Honda] Y 135,000,000 G 14,400,000
[nissan] Y 80,200,000 G 9,340,000
[BMW] Y 120,000,000 G 13,400,000
[sims] Y 64,500,000 G 9,190,000
["average joe"] Y 3,150,000 G 722,000
[NASDAQ] Y 103,000,000 G 21,800,000
["dow jones"] Y 36,200,000 G 7,770,000
[zaxbys] Y 15,700 G 3,940
[Clinton] Y 113,000,000 G 29,400,000
G had more results for only one of the above terms, [Stanford].
Some will obviously continue to argue in favor of G but it's actually quite clear who is bigger.
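For anyone who wants to compare the counts more systematically, here is a rough sketch that computes the Yahoo-to-Google ratio per query. It uses only a handful of rows copied from the list above (the choice of rows is arbitrary), and the figures are the engines' own reported estimates, so the ratios say nothing about how many distinct, useful pages either index really holds.

```python
from statistics import median

# (query, Yahoo estimate, Google estimate), copied from a few rows above.
counts = [
    ("hosting", 631_000_000, 123_000_000),
    ("network", 1_130_000_000, 589_000_000),
    ("UCLA", 30_100_000, 21_900_000),
    ("Stanford", 61_300_000, 73_800_000),
    ("eminem", 46_900_000, 4_880_000),
]

# Ratio above 1 means Yahoo reports more results; below 1 means Google does.
ratios = {query: y / g for query, y, g in counts}
google_ahead = [query for query, r in ratios.items() if r < 1]
median_ratio = median(ratios.values())

for query, r in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{query:10s} Y/G = {r:.2f}")
print(f"median Y/G = {median_ratio:.2f}")
print("Google ahead on:", google_ahead)
```

The median is used rather than the mean so that one very lopsided query does not dominate the summary.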
[edited by: martinibuster at 9:54 pm (utc) on Aug. 23, 2005]
[edit reason] Removed url. [/edit]
| 10:24 pm on Aug 23, 2005 (gmt 0)|
If you dig into what those numbers mean (since you can't see past 1,000 results) you'll hear from search engineers that it is an estimated number - related to how many "objects" are in their database related to the search query. Now exactly what does that mean?
I asked Matt Cutts about this in New Orleans and he said almost exactly that -- because of the way search engines chop up the data they retrieve, it becomes very difficult to end up with a hard number, and the number you see as a total is a bit nebulous. As long as the number is obtained through essentially the same process, it has a comparative value, one query to another on the same SE. But no real hard value can be assumed. A lot like your server logs, you know?
| 2:30 am on Aug 24, 2005 (gmt 0)|
My personal experience is that the number of pages I have in Y! has grown in the last 3 months, and the number in G has shrunk to almost zero.
That accounts for only 6 Million pages, but I see no evidence to the contrary of Y!'s claim. The question of quality remains, however.
| 5:44 pm on Aug 24, 2005 (gmt 0)|
On our personal sites the number of pages we have in Yahoo has grown, while the number in Google has stayed the same or moved up or down. At one time 3 months ago Google was close to Yahoo, but that lasted only a few days!
Yahoo results for [widgets] 1234: 1 - 10 of about 2,430,000
Google results for [widgets] 1234: 1 - 10 of about 266,000
Yahoo results for [widgets] 5678: 1 - 10 of about 833,000
Google results for [widgets] 5678: 1 - 10 of about 199,000
[edited by: jcoronella at 7:18 pm (utc) on Aug. 24, 2005]
| 6:23 pm on Aug 24, 2005 (gmt 0)|
I think Yahoo is cheating. They have a lot of entries for which they store only the title and URL of the page; these are not "stored" in the main index, similar to Google's supplemental index. This separate index is searched only if the main search returns "few" results.
I did the following:
Search [site:mydomain] returns 138K+ results; only 5 have any description and a cache, the rest have only the title (all lower case) and no cache.
Search [site:mydomain unique_keyword] returns 5 results, exactly the 5 entries in the above query. All pages contain the "unique_keyword".
So it seems Y is claiming credit for all the 138K+ pages, whereas it only includes the 5 pages that are "fully" indexed in its main searches.
search [intitle:"page_title"] - no page found although the first query showed the page in the serps!
Google, looks like Yahoo has learned your tricks!
| 6:23 pm on Aug 24, 2005 (gmt 0)|
Please try the above searches and see if you get the same results.
| 9:22 pm on Aug 25, 2005 (gmt 0)|
I noticed that for some rarer searches on Yahoo, they will estimate say 100 results, but once you start clicking through there may only be about 30. I wouldn't trust the "out of 10,000,000" numbers for Google or Yahoo.