but: "On Sunday, researchers at the National Center for Supercomputer Applications attempted to shed light on the debate by performing a large number of random searches on both indices. They ran a random sample of 10,012 queries and concluded that Google, on average, returned 166.9 percent more results than Yahoo. In only three percent of the cases did the Yahoo searches return more queries than Google. The group said the Yahoo index claim was suspicious."
[nytimes.com...]
If the site has 1,000 pages, and Yahoo lists all 3 domain names pointed to the site, that is 3,000 pages instead of 1,000. I have seen that. In that particular situation, there was only one domain listed in Google, rightfully so. It is a glitch that Yahoo has not worked through yet.
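To make the arithmetic concrete, here is a rough sketch of how counting every alias domain separately turns 1,000 pages into 3,000, and how folding the aliases into one canonical host brings the count back down. The domains (example.com/.net/.org) and the `ALIASES` map are made up for illustration; this is not a description of how Yahoo or Google actually count.

```python
from urllib.parse import urlsplit

# Hypothetical alias map: each extra domain pointed at the site maps to one canonical host.
ALIASES = {
    "example.net": "example.com",
    "example.org": "example.com",
}

def canonical_key(url):
    """Reduce a URL to (canonical host, path, query) so aliased copies compare equal."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    host = ALIASES.get(host, host)
    return (host, parts.path or "/", parts.query)

def count_pages(urls):
    """Return (raw count, count after collapsing alias domains)."""
    raw = len(urls)
    unique = len({canonical_key(u) for u in urls})
    return raw, unique

# 1,000 pages served under three domain names that all point at the same site.
urls = [
    f"http://{domain}/page{i}.html"
    for domain in ("example.com", "example.net", "example.org")
    for i in range(1000)
]

raw, unique = count_pages(urls)
print(raw, unique)  # 3000 1000
```

An engine that skips the collapsing step would report the raw figure, which matches the 3,000 versus 1,000 situation described above.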
Sergey is correct.
My site has some 25-30k pages at most. Google has it at about 32k and Yahoo has it at 66,000, which makes me think the real number is a little over 30k pages.
The other site is about the same size or a bit bigger. Google has it at 41,000, and Yahoo is actually down from 136,000 to 113,000, which is still almost 3 times as many.
If more people are seeing this, it would explain why Yahoo thinks it is so big: it can't seem to count correctly.
They ended up decreasing their index by kicking many innocent webmasters out.
At the same time, Y somehow got a sense of their plan. Instead of cleaning its own index and feeling the wrath of webmasters, Y decided to index more.
Right when G tried to kick small publishers out, Y announced its index size and hit G right where it hurts most. And now Gbot is crawling like crazy all over the web.
G can say all they want that Y's index claims are inflated. But the same can be said for G's numbers whenever they were reported. What good is index size when the web is limited to someone's useless bookmarks?
Today many people are searching for a G substitute.
All this will amount to is more spam, dupes and thin docs in the respective indexes. I blame G, as they painted the big target on their homepage and continue to make a big deal over the 'inflated 8.1B'.
You reap what you sow...
Not only is Yahoo's index bigger; IMO, it will take Google quite some time to catch up.
Here are some random searches and number of results found.
[Shoping] Y 1,540,000,000 G 318,000,000
[hosting] Y 631,000,000 G 123,000,000
[network] Y 1,130,000,000 G 589,000,000
[social] Y 673,000,000 G 307,000,000
[Wedding] Y 251,000,000 G 42,700,000
[antidisestablishmentarianism] Y 80,000 G 36,100
["do no evil"] Y 275,000 G 82,400
[yahoo] Y 869,000,000 G 177,000,000
[google] Y 480,000,000 G 228,000,000
[microsoft] Y 562,000,000 G 258,000,000
[Spam] Y 210,000,000 G 94,500,000
[USPTO] Y 4,300,000 G 1,380,000
["larry page"] Y 735,000 G 175,000
["will smith"] Y 11,500,000 G 1,870,000
[NGCSU] Y 345,000 G 42,200
[UCLA] Y 30,100,000 G 21,900,000
[Stanford] Y 61,300,000 G 73,800,000
["google nightmare"] Y 1,330 G 250
["yahoo nightmare"] Y 132,000 G 6,710
["jeremy zawodny"] Y 1,760,000 G 699,000
[eugooglizer] Y 74 G 40
["george w. bush"] Y 50,200,000 G 18,000,000
["video games"] Y 248,000,000 G
[to be or not to be] Y 4,840,000,000 G 1,370,000,000
[movies] Y 957,000,000 G 152,000,000
[linkshare] Y 1,640,000 G 989,000
[eminem] Y 46,900,000 G 4,880,000
[Toyota] Y 101,000,000 G 12,900,000
[Honda] Y 135,000,000 G 14,400,000
[nissan] Y 80,200,000 G 9,340,000
[BMW] Y 120,000,000 G 13,400,000
[sims] Y 64,500,000 G 9,190,000
["average joe"] Y 3,150,000 G 722,000
[NASDAQ] Y 103,000,000 G 21,800,000
["dow jones"] Y 36,200,000 G 7,770,000
[zaxbys] Y 15,700 G 3,940
[Clinton] Y 113,000,000 G 29,400,000
Of the terms above, G had more results for only one: [Stanford].
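For anyone who wants to see how lopsided the reported numbers are, here is a small sketch that takes a handful of rows copied from the list above, computes the Yahoo-to-Google ratio for each, and lists the queries where Google reported more. Only five rows are included to keep it short, and the function names are my own. It says nothing about which index is really bigger, only how the reported estimates compare.

```python
# A few rows copied from the list above: (query, Yahoo count, Google count).
counts = [
    ("hosting", 631_000_000, 123_000_000),
    ("network", 1_130_000_000, 589_000_000),
    ("UCLA", 30_100_000, 21_900_000),
    ("Stanford", 61_300_000, 73_800_000),
    ("eugooglizer", 74, 40),
]

def yahoo_to_google_ratios(rows):
    """Return {query: yahoo_count / google_count} for each row."""
    return {query: y / g for query, y, g in rows}

def google_ahead(rows):
    """Return the queries where Google reported more results than Yahoo."""
    return [query for query, y, g in rows if g > y]

ratios = yahoo_to_google_ratios(counts)

# Print the largest gaps first.
for query, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{query:12s} Y/G = {ratio:.2f}")

print("Google ahead on:", google_ahead(counts))
```

Adding the remaining rows to `counts` works the same way; the [video games] row would have to be left out because its G figure is missing.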
Some will obviously continue to argue in favor of G but it's actually quite clear who is bigger.
[edited by: martinibuster at 9:54 pm (utc) on Aug. 23, 2005]
[edit reason] Removed url. [/edit]
I asked Matt Cutts about this in New Orleans and he said almost exactly that: because of the way search engines chop up the data they retrieve, it becomes very difficult to end up with a hard number, and the number you see as a total is a bit nebulous. As long as the number is obtained through essentially the same process, it has comparative value, one query to another on the same SE. But no real hard value can be assumed. A lot like your server logs, you know?
Yahoo results for [widgets] 1234: 1 - 10 of about 2,430,000
Google results for [widgets] 1234: 1 - 10 of about 266,000
Yahoo results for [widgets] 5678: 1 - 10 of about 833,000
Google results for [widgets] 5678: 1 - 10 of about 199,000
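Going by the point made earlier that these totals only have comparative value within one engine, here is a quick sketch that compares the two queries inside each engine instead of across them. The figures are the four estimates quoted above; the dictionary layout and the function name are just my own choices for illustration.

```python
# Estimated totals quoted above, keyed by engine and then by query label.
estimates = {
    "Yahoo": {"1234": 2_430_000, "5678": 833_000},
    "Google": {"1234": 266_000, "5678": 199_000},
}

def within_engine_ratio(engine, query_a, query_b):
    """Ratio of two estimated totals reported by the same engine."""
    per_engine = estimates[engine]
    return per_engine[query_a] / per_engine[query_b]

for engine in estimates:
    ratio = within_engine_ratio(engine, "1234", "5678")
    print(engine, round(ratio, 2))
```

The two engines disagree not just on the totals but on how much bigger the first query is than the second, which fits the idea that the numbers are nebulous.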
[edited by: jcoronella at 7:18 pm (utc) on Aug. 24, 2005]
I did the following:
Search [site:mydomain] returns 138K+ results. Only 5 have a description and a cached copy; the rest have only the title (all lower case) and no cache.
Search [site:mydomain unique_keyword] returns 5 results, exactly the 5 entries from the query above. All of those pages contain the "unique_keyword".
So it seems Y is claiming credit for all 138k+ pages, whereas it only includes the 5 "fully" indexed pages in its main searches.
Search [intitle:"page_title"] finds no page, although the first query showed that page in the SERPs!
Google, looks like Yahoo has learned your tricks!