Forum Moderators: Robert Charlton & goodroi
Month after month we see new threads about the 'latest serp changes', penalties, and the general woes of webmasters trying to figure out: "what happened?".
One thing I have yet to read (maybe it is posted somewhere - but I've yet to find it) is an evaluation of just 'how Google works' since the introduction of their Big Daddy system.
Here are my musings...
History:
Years ago, Google performed what was referred to as the "Google Dance". The Google Dance happened approximately every 30 to 40 days, and it was then that you could see which new pages had been added to Google's database and the resulting serp changes. At that time they had about 10 datacenters.
The next 'big' systematic change came around the time of the infamous "Florida Update". It was after that that Google started using what came to be referred to as 'Freshbot' and 'Deepbot', and Google moved away from its roughly monthly updates to what came to be known as "everflux". At this point, new pages added to your site would (sometimes) show up within hours of upload and linking. Google grew to 56 active datacenters during that time.
BUT - due to the unprecedented growth of the internet AND Google's inefficient system for purging its databases of dead, long-gone pages, those databases soon grew to an unmanageable size.
Current:
Google's solution to its bulging, out-of-control databases - the Big Daddy system of data handling!
Since the introduction of Big Daddy, month after month, websites (good websites) have been bouncing in and out of Google's serps AND sometimes even in and out of Google's index entirely.
I think it is about time that we consider just how this 'new system' works. Maybe then, we would have a better chance at understanding what is really happening to our sites.
Observations - since Big Daddy:
Supplemental database - usage of a Supplemental database on a level never seen before.
Google spiders - Google spidering has changed (been reduced). Google has said it is relying more on caching of pages.
Data pushes - "data refreshes", in the lingo used by Google employees.
Here are my speculations as to How the "New Google Works":
THE Google Index has now become 3 different Indexes, OR a Three-Tiered Index. There now is the Supplemental Index, the Secondary Index (where site:, link:, etc. searches come from), and the PRIMARY INDEX (where the serps results come from).
I think that it is THE WAY these Indices are populated that is problematic for the Google Search Engine results.
I have come to believe that internal Google spiders crawl Google's Secondary Index, accumulating data for the 'next' data refresh. The Google spiders that you and I see in our server logs return their findings to either the Supplemental Index or the Secondary Index.
I speculate that these 'internal Google spiders' - Primary Index Bots (I'll refer to them as PIBots) either are dropping data (pages) OR are not fast enough to accumulate ALL of the data from the Secondary Index in time for the next data refresh.
I speculate that each data refresh totally replaces the previous PRIMARY INDEX. Therefore, the resulting 'new' Primary Index is incomplete, and many websites/pages have inadvertently been OMITTED. This might explain why some webmasters report bobbing in and out and up and down in the Google serps.
Spidering issues would also explain why some webmasters are seeing a correlation with sitemap SPIDERING. Perhaps it is the SPIDERING that is the issue - NOT the sitemaps - per se.
The major problem with this hypothesis is that IF this is what indeed is happening (an internal Google issue) it leaves us POWERLESS to change things on our end and people can't deal with being powerless.
I am hoping that this will spark some further discussion on this topic. Perhaps the 'needle in the haystack' we have been seeking (an answer to the unstable Google) can be found in THIS haystack!
Caryl
PS - Yesterday I created a little tool to monitor my site for Google spidering and the IP addresses each spider originates from - perhaps I will find a 'clue' there - who knows...
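For anyone who wants to build a similar monitoring tool, here is a minimal sketch in Python. It assumes an Apache "combined" format access log and a log file named access.log - both are assumptions, so adjust for your own setup. (It only checks the user-agent string, so a spoofed UA would slip through.)

```python
import os
import re
from collections import Counter

# Matches the start of an Apache "combined" log line:
# IP - - [timestamp] "GET /path HTTP/1.x" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (\S+)[^"]*" \d+ \S+ "[^"]*" "([^"]*)"'
)

def googlebot_hits(lines):
    """Return a Counter of (ip, path) pairs for requests whose UA claims to be Googlebot."""
    hits = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m and "Googlebot" in m.group(3):
            hits[(m.group(1), m.group(2))] += 1
    return hits

if __name__ == "__main__" and os.path.exists("access.log"):
    with open("access.log") as f:  # path is an assumption; point this at your own log
        for (ip, path), n in googlebot_hits(f).most_common(20):
            print(f"{n:5d}  {ip:15s}  {path}")
```

Cross-checking the IPs this surfaces against reverse DNS is one way to gather the kind of 'clues' mentioned above.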
98% of webmasters are not sitting around worrying about Google's use of their bandwidth, and almost exactly zero of those with sites under 1000 pages care about this. Google, however, has responded to its OWN needs by attempting to crawl established URLs far less than previously, with disastrous effect on its index.
Five PR6 links get you crawled rarely; 5000 PR0 blog comment links get you crawled every day. Google spares itself bandwidth while deliberately shifting its crawl priorities from quality domains to spam domains, especially smaller (under 10,000 pages) niche domains.
On top of that, Google discards unique pages from its index for foolish reasons, strictly to save its own space. It's mind-boggling that it will only index 28 of 33 pages (all PR4) of a photo section on a 500-page domain, while gobbling up page after page of cms-generated, blog-comment-linked random text garbage. Those five photo pages apparently will make Google crash if it indexes them, so it takes the deliberately anti-user position of not indexing something it *knows* is unique content on a domain it generally respects, with moderate pagerank.
The "new google" sadly is based on a stupid decision that can't be justified except in Google's own inability to first index the web, and then to discern poor quality content, since obviously if they are incapable of storing everything, they should through out the bottom 20% and not the botom 15% and randomly 5% from the middle.
The "new google" is all about crawl paths, volume of links regardless of how crappy, even less valuation of niche authority, and most importantly for webmasters, a brave new world where Google's index is far weaker than previously -- which means a lot in terms of competing for rankings.
For instance, if a once active page is 301'd to a new page, after 90 days, or 6 months or so, it's probably safe to stop checking the old URL. This includes ALL URLs that have been changed from www to non-www, or vice versa. To just keep checking them forever seems to be a total waste, unless new links pop up from within the same site that show it's a good URL. (In case the webmaster decides to use that URL again in the future.)
Google should also add a feature in the Webmaster Tools that allows webmasters to tell Google when a URL is bad. For instance, URLs that can't be accessed, get a 404, or whatever, let the webmaster click a box to say "This is a bad URL, do not spider it again." And then don't.
This would free up a lot of bandwidth and save a lot of time. Time that could be spent spidering active pages that don't get spidered as often as they should. I have someone linking to me with a typo, but I can't find the site. I've checked my own site, and it doesn't show up. Yahoo doesn't have it either. So, about once a week Google hits the "http://example.com/widdgets.html" page, which doesn't exist. "http://example.com/widgets.html" does exist. It would be nice to be able to tell Google to forget about the misspelled page.
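For what it's worth, a typo URL like that can at least be neutralized on your own end. Here is a minimal sketch, assuming an Apache server with mod_alias enabled and a .htaccess file; the paths are taken from the example above:

```apache
# .htaccess - permanently redirect the misspelled URL to the real page.
# Google should eventually replace the bad URL with the good one.
Redirect 301 /widdgets.html http://example.com/widgets.html
```

That doesn't stop the spidering, but at least the crawl lands on a real page instead of a 404.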
Google .. has responded to their OWN needs
That was actually my point in pointing it out, and the new "Tools" reinforce that.
I do feel that the new "tools" and methods are really saying "tell us, as we can't be bothered to look, as we need to speed all this up - and hey, if we miss a few sites or pages, well, never mind".
Going back to the topic these are two new elements in what makes Google tick and maybe the final outcome of these are yet to be seen.
Whether the world of webmasters likes it or not, this is what we have: "never mind the people outside doing the graft - look at the share price".
Reminds me of "never mind the quality, feel the width", if anyone remembers that catch phrase...
I agree completely with your thinking, and I would dearly love to be able to get my own pages out of their index in a more efficient manner. But it's clear Google wants to keep even outdated pages in their supplemental index for searchers.
Google got as big as it did - popularity and index-wise - because it was the best place to find even highly rare information. And in some cases, those outdated pages contain information that can't be found anywhere else.
There's another Google fact we live with, and it causes needless upset for some webmasters. That's the secondary level of information Google tries to provide that is, speaking kindly now, often less than dependable. This includes things like:
1. Toolbar PR
2. Number of inbound links
3. Number of total results for a regular search
4. Number of total results for any "special operator" search - especially when it's over 1,000
5. Presence or absence of a "supplemental result" tag
6. Webmaster Tools feedback
Every one of these items has some usefulness. But Google's main purpose is providing primary search results to the end user, and not total precision in the above areas for webmasters or for those researching the competition. In fact, getting anywhere near precision in some of these areas is a monster of a technical challenge, given the way that Google "shards" their data.
So these kind of data are there to help as a general guide, but they are not worth major obsession over every little bump and dip. Traffic matters the most, and then the rankings that you can see (which may not be what everyone else sees.)
I agree that sometimes it feels like the new "crawl caching proxy" that came in with Big Daddy gets a bit hinky. This is data, after all, and every database I've ever worked with ends up with bad data from time to time. I'm sure Google's does too.
Thanks for that link!
I think that this "new spidering system" (crawl caching proxy) IS the main culprit in many a webmaster's woes.
from MCs Blog (link above)
This crawl caching proxy was deployed with Bigdaddy, but it was working so smoothly that I didn’t know it was live. :) That should tell you that this isn’t some sort of webspam cloak-check; the goal here is to reduce crawl bandwidth.
Working Smoothly?
You need to look no further than your nearest webmaster forum to read of all the sites bouncing in and out of the serps to see that "Houston - We have a problem..."
Caryl
There's another Google fact we live with, and it causes needless upset for some webmasters. That's the secondary level of information Google tries to provide that is, speaking kindly now, often less than dependable. This includes things like:
1. Toolbar PR
2. Number of inbound links
3. Number of total results for a regular search
4. Number of total results for any "special operator" search - especially when it's over 1,000
5. Presence or absence of a "supplemental result" tag
6. Webmaster Tools feedback...
Tedster,
These are precisely the types of information I was thinking of as coming from - what I referred to as - the Secondary Index.
Caryl
Google got as big as it did - popularity and index-wise - because it was the best place to find even highly rare information. And in some cases, those outdated pages contain information that can't be found anywhere else.
I agree. What I'm more concerned with are the pages that have never existed. You know the ones - they have strange URLs and you wonder how in the heck Google came up with them. In addition to the misspelled "widdget" page, there's always the fun "http://example.com/page1.html/maincontents.shtml" combo that combines parts of two different URLs. I get those a lot, and I know they didn't come from my site.
I think if a webmaster submits a sitemap to Google, they've verified their site, and Google knows who they are, then Google should index the URLs listed in that sitemap, and no others. If it finds some that it thinks belong, it can list them as "Is this yours?" URLs. Then you can say yea or nay. That would be an instant stop to the 302 redirects, the scrapers, and all the others.
Google has some great ideas to help webmasters, they just aren't implemented properly, or they aren't thorough enough.
You know, if Google really wanted to save bandwidth and preserve space, they would quit spidering bad/dead/old/incorrect URLs.
I agree with this. I 301'd pages in September that Google is still searching for. I recently decided to update my htaccess, thinking that it was safe to drop some of the 301s, but then I was surprised to see that those pages were again reported as missing in Webmaster Tools.
Also, the 'New Google' makes it exceedingly hard to get new content indexed and get any kind of appreciable traffic, since anything that has a low pagerank seems to get put into the supplemental index automatically. The only answer is to constantly be on the lookout for good quality links to every new section of content... that's a full-time job in itself.
I 301'd pages in September that Google is still searching for. I recently decided to update my htaccess, thinking that it was safe to drop some of the 301s, but then I was surprised to see that those pages were again reported as missing in Webmaster Tools.
You should rarely remove 301s anytime soon. There are most likely links out there pointing to the old URIs, and that is why Google is still searching for them.
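If you're unsure whether an old 301 is still worth keeping, it's easy to spot-check what a bot actually sees when it requests the old URL. A minimal sketch in Python, using only the standard library (the URL in the docstring is illustrative):

```python
import http.client
from urllib.parse import urlsplit

def check_redirect(url):
    """Request `url` WITHOUT following redirects.

    Returns (status, location), e.g. (301, 'http://example.com/new.html').
    As long as old links out there still point at the URL, you want this
    to keep answering 301 with the correct Location.
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

If this stops returning a 301 (say, a 404 after trimming htaccess), the old pages start showing up as missing again, exactly as described above.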
I find that Yahoo picks up many more of these than Google, and yes, I find it interesting to work out why these things are picked up in the first place. It often exposes deep problems within a website, not just incoming links with simple typos.
Since the introduction of Big Daddy, month after month, websites (good websites) have been bouncing in and out of Google's serps AND sometimes even in and out of Google's index entirely. I think it is about time that we consider just how this 'new system' works. Maybe then, we would have a better chance at understanding what is really happening to our sites.
Another less popular (and possibly feared) angle - it could simply be that the pages from (good websites) didn't score as well as others presented in a set of results, e.g., less activity for those sites; fewer clickthrus, bookmarking, return visits.
Google users will largely decide where your pages rank. And Google needs all that power to track their behavior and store personalized searches.
Another less popular (and possibly feared) angle - it could simply be that the pages from (good websites) didn't score as well as others presented in a set of results, e.g., less activity for those sites; fewer clickthrus, bookmarking, return visits.
Possibly, but this wouldn't explain why sites bounce in and out at the rate they do, as the stats you are referring to are not changing drastically from day to day.
So far, those have also been sites that drop the root index page from searches for "pages from the UK" at www.google.co.uk whereas the root page still shows up in "the web" searches.
Google has upped the filtering on "similar pages", especially on site:domain.com searches.
If you deliver a normal page (not a 404), you're most likely running a fully automated setup in which pages are constructed - generated - on the fly in response to the URL.
As for Google saving space: that rumor is highly overrated... remember that we on the outside only see the tip of the iceberg.
You probably think about this like forwarding your phone. But it's not. You're creating a "worm hole":
Think about it: You visit your good friend, and you know the house quite well. You go to the kitchen to get some item, and as you open the kitchen door ... *slam* you're in a flower shop in Tokyo with no way to get back.
That's a 404 redirect.
I would use a Document as the ErrorDocument - that is: YES to 404.html. You can still put links on your "404.html" document to the other domain.
a 404 *document* would be like leaving a note on the kitchen door saying "please don't use this door" and locking the door; that's the right way to do it.
A "404 error" where you specify the domain name in the ErrorDocument URL returns a 302 response! That will cause a LOT of problems.
Make a custom Error Document with helpful links to other parts of your site, and put it somewhere like /errors/error.404.html on the same site as where it is used. Make sure that this page also contains a <meta name="robots" content="noindex"> tag.
NEVER use your real index page as an ErrorDocument. That confuses the bot as to what your error page really is, and might affect the indexing of your root index page.
NEVER put a domain name in an ErrorDocument directive. When the error occurs, the handler will not deliver the correct HTTP status code if you do that.
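Put concretely, assuming an Apache server (paths illustrative), the difference looks like this:

```apache
# Right: a local path on the same site. Apache serves the page
# itself and the true 404 status code stays intact.
ErrorDocument 404 /errors/error.404.html

# Wrong: a full URL (even your own domain). Apache answers with a
# 302 redirect to it instead, so the bot never sees the 404 at all.
# ErrorDocument 404 http://example.com/errors/error.404.html
```

And the /errors/error.404.html page itself would carry the noindex meta tag mentioned above.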
Just explaining that stuff in a different way.
Now back to your regularly scheduled programming...
Rarely do you remove 301s anytime soon. There are most likely links out there pointing to the old URIs
I second this. I have a ten-year-old site that STILL has backlinks to pages that have not existed for years. Sometimes it's just not Google's fault - it's simply following links, business as usual.