Forum Moderators: Robert Charlton & goodroi

Indexation capping in Google

kidder

10:40 pm on Dec 6, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I just finished reading an interesting article on Google's indexation cap. To make a long story short, it suggested Google places a limit on the number of URLs per domain that it will keep in its main index and potentially return in the search results. I guess we are now looking at another ranking factor that larger sites may need to consider: quantity of pages per domain. You would have to think blogs would be right in the line of fire, with all of the content scraping that goes on.

martinibuster

5:44 am on Dec 7, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I'm certain this has been going on for a very long time. There used to be a limit on how deep and how often Google would index a site, and that depended on how many inbound links the site had. Beyond that, for as long as I can remember Google has discarded small bits of information, like an email address posted on a guestbook, information that showed up in Yahoo. That example was from several years ago, and Google took a couple of years to catch up to Yahoo. I had a similar search recently that also failed in Google, likely because of a lack of inbounds, but that returned results in Yahoo.

Here's another example. I put up a website last week and threw a single link to it from another related site. A search for the two keywords in its domain, plus the TLD (keyword1keyword2 TLD), shows nothing in Google except whois-type sites. Bing shows similar results. Yahoo, however, ranks it number one, displaying a snippet from my meta tag.

I think the Google algo's dependence on the number of links has traditionally prevented Google from displaying deep and obscure content, capping the amount of content it displays in the SERPs. Even if a web page is the only matching content, Google seems to ignore it and chooses not to display it. Maybe this is related to the article you read? It's definitely something that's probably been going on since day one, if the "random surfer" scenario from the original Page/Brin Stanford paper is taken into account.
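The random-surfer idea mentioned above can be sketched as a short power-iteration PageRank. This is only an illustration of why a page with a single inbound link (like the new site in the example) scores far below a well-linked page; the toy graph, the "newsite" label, and the damping value are assumptions for the demo, not anything from the thread:

```python
# Minimal power-iteration PageRank, per the random-surfer model:
# a surfer teleports to a random page with probability (1 - damping),
# otherwise follows a random outlink from the current page.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with a uniform distribution
    for _ in range(iterations):
        new = {}
        for p in pages:
            # Rank flowing into p from every page q that links to it,
            # split evenly among q's outlinks.
            incoming = sum(rank[q] / len(links[q])
                           for q in pages if p in links[q])
            new[p] = (1 - damping) / n + damping * incoming
        rank = new
    return rank

# Toy web (hypothetical): a well-linked hub vs. a new site
# that has exactly one inbound link, as in the example above.
web = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub", "newsite"],
    "newsite": ["hub"],
}
scores = pagerank(web)
```

Running this, "newsite" ends up with a much smaller score than "hub", which is consistent with the idea that content with almost no inbound links sits below whatever display threshold the engine uses.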