|8 Billion items, what about relevance?|
Does the size of Google's index increase or reduce relevancy?
I've been concerned about the quality of results delivered by Google for some months; now it seems they are happy to add 4 billion pages to an index that was already stuffed with irrelevant pages.
Do you agree that the number of pages is irrelevant? And if so, would you welcome a site which trumps the 8 billion by several orders of magnitude, simply to put an end to the "we've got the biggest index" boasting?
It's just a thought from yesterday that has been bugging me. If enough people think that trumping G's 8 billion is worthwhile, I will spend the weekend building a couple of sites that can legitimately claim to have more pages listed than Google. It will all be irrelevant pages from one site, though, listed on a second site that does a full-text search on those pages. (Please don't question the technical side; it will work.)
Relevancy has not changed for the big queries, in my opinion. However, for those specific 6-10 word queries, the bigger index is certainly a welcome relief and a good increase in relevancy.
In the past, those queries would return zero results - now many of them are returning a few results. Those hard-to-find gems are now coming up!
|Searching 8,058,044,651 web pages |
I don't understand why anyone believes that number?
When I do a site:www.mydomain.com search on Google, it tells me my site has 194 pages. In fact, there are only 144 pages. Where did the extra 50 pages come from?
If we do the math (and my math skills are pretty poor) ... that's about 35% more pages than actually exist!
If we follow the logic and assume that all sites have been attributed with 35% more pages than really exist, then the figure shown above has been falsely inflated by roughly 2 billion pages ... has it not?
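A back-of-the-envelope sketch of that arithmetic, assuming (a big assumption, purely for illustration) that the 194-reported vs. 144-actual ratio from the one example site holds across the entire index. Note that the surplus share of the reported total is 50/194, not 35% of the total, which overstates it:

```python
# Check of the "inflated index" claim, using the numbers from the post.
reported_site = 194
actual_site = 144
extra = reported_site - actual_site            # 50 surplus URLs

inflation_ratio = reported_site / actual_site  # ~1.347, i.e. ~35% more
print(f"{inflation_ratio - 1:.0%} more URLs than real pages")

index_total = 8_058_044_651

# Consistent estimate: the surplus share of the *reported* total.
surplus = index_total * extra / reported_site
print(f"~{surplus / 1e9:.1f} billion surplus URLs")  # ~2.1 billion

# Taking a flat 35% of the reported total instead overstates it:
naive = index_total * 0.35
print(f"~{naive / 1e9:.1f} billion (naive figure)")  # ~2.8 billion
```

Either way, the claimed inflation runs into the billions of URLs under this assumption; the two-billion figure is just the internally consistent one.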
I'm with Liane (well, not really;)),
I 301'd a subdirectory of a mature site back in June, and since the change to 8 billion pages, many of the 301'd pages are showing up on one site and as a supplemental result on the other.
Besides, Google doesn't care whether the larger index increases relevancy, only whether it increases revenue.
I don't think the issue is cut and dried. Consider two URLs:
www.domain.com/foo
domain.com/foo
On some servers, those are exactly the same page; on others, they might be entirely different content, or one might not exist at all. It is tricky to know what counts as a unique page.
Another example would be database driven URLs vs human readable URLs. Some sites offer both, at least for key pages. For example:
They could be exactly the same page. How is Google or any search engine supposed to know the difference? Can it be assumed that because they were the same at one point, that the data will be the same later on? Should SEs even be expected to compare every page in a domain with each other to identify equalities?
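To make the ambiguity concrete, here is a minimal sketch (with hypothetical URLs, not ones from this thread) of why a crawler that keys pages by raw URL has to treat www/non-www hosts and database-driven vs. human-readable paths as distinct documents:

```python
from urllib.parse import urlsplit

def url_key(url):
    """Naive page identity: the full (scheme, host, path, query) tuple."""
    parts = urlsplit(url)
    return (parts.scheme, parts.netloc.lower(), parts.path or "/", parts.query)

# Hypothetical pairs that may (or may not) serve identical content:
pairs = [
    ("http://www.example.com/foo", "http://example.com/foo"),
    ("http://example.com/article.php?id=42",
     "http://example.com/articles/widget-review"),
]

for a, b in pairs:
    # Without fetching and comparing the bodies, the engine has no way
    # to know these are the same page, so each URL counts separately.
    print(url_key(a) == url_key(b))  # False both times
```

Collapsing such pairs would require either fetching and comparing content or site-supplied hints, which is exactly the cost the question above is getting at.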
Good point. Why should a search engine concern itself with the quality of its index when its revenue is inversely proportional to that quality?
|Should SEs even be expected to compare every page in a domain with each other to identify equalities? |
|Why should a search engine concern itself with the quality of its index when its revenue is inversely proportional to that quality? |
Simple: Because declining quality would lead to a drop in both traffic and revenue.
But time and time again we hear that Joe Surfer does not know when he is getting inferior results, and that he never notices the sites that are missing from them.
>Where did the extra 50 pages come from?
First, it isn't necessarily "pages" but rather "URLs".
I think RFranzen is right on in suggesting that www.domain.com/foo, domain.com/foo, and even raw aaa.bbb.ccc.ddd IP addresses are all unique URLs. It is quite easy to see how Google could index 4x the number of URLs.
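A quick sketch of how that multiplication happens (hypothetical host names and a reserved documentation IP): the same physical document reachable under four hosts yields four distinct URLs to a naive counter:

```python
# One physical page, four reachable host names (hypothetical values).
hosts = ["www.example.com", "example.com",
         "www2.example.com", "192.0.2.1"]
path = "/foo"

urls = {f"http://{host}{path}" for host in hosts}
print(len(urls))  # 4 distinct URLs for one page
```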
Personally, I don't think G's quality has ever been as good across the board as it is right now. Many of the really spammy sectors are slowly getting cleaned up (e.g., things like travel, etc.).
I really think a bigger index pays off for everyone.
So let's start targeting more 6-10 keyword phrases ;)
|I really think a bigger index pays off for everyone. |
Sure it does and I agree that the 4, 5 & 6 keyword phrases are working very well these days. I also agree that Google is less spammy than ever before, though there are spammy sites still plaguing the index.
However, I still don't understand the number of URLs reported for my site! I searched for "www.mysite.com", not ".mysite.com". Isn't a search specific to "www.mysite.com" supposed to return only results for "www.mysite.com", and not include ".mysite.com"?
No matter how you cut it, there are only 144 unique URLs for my site, yet it is reporting 194. The number has been falsely inflated, and the grand total shown above is questionable at best.
|Searching 8,058,044,651 web pages |
|First, it isn't necc "pages" but rather "urls" |
Perhaps we should tell Google that their semantics are incorrect. They should change that statement to read: Searching 8,058,044,651 URLs ;)
Could you add your homepage to your WebmasterWorld profile? Then some of us could take a look and maybe figure out what Google is seeing that you don't.
No ... but I sent you a sticky. :)