In an attempt to track the number of pages on my website versus the pages that have been indexed, I've been recording the stats found
on the Sitemaps page (Webmaster Tools, under Optimization). When Index Status was added, I started plotting those numbers as well.
I've been in the process of refreshing my website and adding new content. With the new pages uploaded, I wanted to update just a
couple of sitemaps to see the effect on my graph. I got quite a surprise.
On March 25th, I recorded the following:
From the WMT >> Optimization >> Sitemaps
Sitemap Entries: 1,927,042
Indexed: 2,435,647
Note: I don't have sitemaps for all my pages
From the WMT >> Health >> Index Status
Crawled: 3,894,297
Indexed: 1,385,951
On March 31st, I updated 2 existing sitemaps (sm_ut.xml, sm_ut01.xml) and added 2 new sitemaps (sm_ut02.xml & sm_ut03.xml). I used
an online validator to verify the sitemaps - no errors. The size of the sitemaps and their location:
Sitemap location: number of entries
http://www.example.com/sm_ut.xml: 7,072 entries
http://www.example.com/sm_ut01.xml: 6,936 entries
http://www.example.com/sm_ut02.xml: 6,959 entries
http://www.example.com/sm_ut03.xml: 6,760 entries
Total entries: 27,727
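For anyone who wants to double-check entry counts locally instead of relying on an online validator, here is a minimal Python sketch. It assumes the files follow the standard sitemaps.org 0.9 schema; the function name count_sitemap_entries and the tiny sample document are only illustrations, not my actual sitemaps.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org 0.9 protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def count_sitemap_entries(xml_text):
    """Return the number of <url> elements in a sitemap document."""
    root = ET.fromstring(xml_text)
    return sum(1 for _ in root.iter(SITEMAP_NS + "url"))


# Hypothetical two-entry sitemap, used only to show the call.
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/a.html</loc></url>"
    "<url><loc>http://www.example.com/b.html</loc></url>"
    "</urlset>"
)
sample_count = count_sitemap_entries(sample)

# Entry counts as reported in the post for the four sitemaps.
reported = {
    "sm_ut.xml": 7072,
    "sm_ut01.xml": 6936,
    "sm_ut02.xml": 6959,
    "sm_ut03.xml": 6760,
}
total_reported = sum(reported.values())
print(sample_count, total_reported)
```

To use it on a real file, read the file's contents and pass them to count_sitemap_entries, then compare the result with the number the Sitemaps page reports for that file.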
On April 1st (and there's nothing funny about it), I found the following:
From the WMT >> Optimization >> Sitemaps
Sitemap Entries: 1,026,611
Indexed: 1,326,025
From the WMT >> Health >> Index Status
Crawled: 3,892,171
Indexed: 1,385,951
Updating/adding the 27,727 entries in the four sitemaps appears to have caused Google to lose track of 900,431 sitemap entries, and
the number of indexed pages shown on the Sitemaps page dropped by 1,109,622. Also notice that on the Index Status page, the number
of pages ever crawled dropped by 2,126, while the indexed count there did not change at all.
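The differences quoted above can be rechecked with a few lines of Python. This is just a sketch of the arithmetic; the dictionary keys and the deltas helper are names made up for illustration, and the numbers are the ones recorded on March 25th and April 1st.

```python
# Figures recorded on the Sitemaps and Index Status pages.
march_25 = {
    "sitemap_entries": 1_927_042,
    "sitemap_indexed": 2_435_647,
    "status_crawled": 3_894_297,
    "status_indexed": 1_385_951,
}
april_1 = {
    "sitemap_entries": 1_026_611,
    "sitemap_indexed": 1_326_025,
    "status_crawled": 3_892_171,
    "status_indexed": 1_385_951,
}


def deltas(before, after):
    """Return after minus before for every metric present in 'before'."""
    return {name: after[name] - before[name] for name in before}


changes = deltas(march_25, april_1)
for name, change in changes.items():
    print(f"{name}: {change:+,}")
```

Keeping each snapshot as its own dictionary makes it easy to add later dates (such as April 11th) and compare any two of them with the same helper.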
Today (April 11th), it hasn't improved much:
From the WMT >> Optimization >> Sitemaps
Sitemap Entries: 1,026,611
Indexed: 1,330,904
From the WMT >> Health >> Index Status
Crawled: 3,894,297
Indexed: 1,452,801
I've been trying to understand what happened, but without any success. At this point, I assume that it's some kind of bug in
Google's code.
Although I'm not certain of it, I believe that I've seen this behavior before. If you look at the graph between March & May 2012,
you'll see that the green line (Sitemap Indexed) has only a slight incline. During this time, I was trying to keep my sitemaps
up-to-date. In mid-June, I had an idea and stopped updating my sitemaps - you'll notice that the rate of indexing seemed to improve
significantly.
For the first time in a year, I updated some of my sitemaps (the four mentioned above), and I definitely feel like I'm being
punished for it. Does anybody have an idea why? I have more sitemaps to upload, but I think it would be a real mistake to do so
without understanding what's happened. If I update a few more sitemaps, my site will probably disappear altogether.
While I'm on the subject: I haven't been able to understand the difference between the numbers shown on the Sitemaps page and the
Advanced Index Status page. Looking at my chart, it seems certain that Pages Indexed from the Sitemaps page and the Total Pages
Indexed from the Index Status page must come from different points in Google's process (not only are the counts different, but
their behavior over time is also different).
Has anybody found documentation or other posts on the difference between the two? I haven't had any luck finding anything.
Any ideas? Is there any way to fix or work-around this situation?
[edited by: engine at 8:43 am (utc) on Apr 17, 2013]
[edit reason] no specifics, thanks, examplified [/edit]