I have been away for a week (with just a couple of quick looks in, even though I was supposed to be on "holiday"). While I was away, Google got rid of the results that I previously referred to as BigDaddy A and B, leaving only the experimental results and the "cleaned up" version (with the oldest supplementals deleted). The "cleaned up" version is the one that is on the vast majority of the datacentres now.
The "same snippet for every page" problem has also been fixed.
Sites that installed a 301 redirect before 2005 June no longer show the redirected URL in the SERPs.
Pages that went 404, and sites that went domain expired, before 2005 June no longer show up in the search results.
New Supplemental Results have appeared for any pages that have changed their status or their content at any time since 2005 June. For pages that are gone, the Supplemental Result has a cache of the final version that was online. For pages that have been updated, the Supplemental Result shows the previous content in the snippet, and the normal result shows current content in the snippet. In both cases the cache is usually only a few days or weeks old.
[edited by: Brett_Tabke at 1:36 pm (utc) on May 19, 2006]
Google HAS updated their Supplemental Results, but Matt Cutts has said that the process is not finished and will continue for several months.
Pages that went 404, and sites that went domain expired, before 2005 June no longer show up in the search results. Those Supplemental Results have been completely cleaned out.
New Supplemental Results have appeared for any pages that have changed their status or their content at any time since 2005 June.
For pages that are gone (404), or domain expired, the Supplemental Result has a cache of the final version that was online. For pages that have been updated, the Supplemental Result shows the previous content in the snippet, and the normal result shows current content in the snippet. In both cases the cache is usually only a few days or weeks old.
Google used to hold on to old data for almost three years. It looks like they may only hold on to it for 9 to 12 months now.
They do this so that you can still find sites that recently went offline. They may also use the data to defeat domain-hopping spammers who put the same content up on a new domain when the old one is found and penalised.
For surfers trying to find fast-vanishing information the Supplemental index can hold very useful data. For webmasters trying to get old data off the screen, it is often a pain in the neck. At least Google seems to have seen some sense in cutting back the timescale that they hold on to that data for.
I have a site where I have recently removed a word from both the title tag and from the on-page content. That special word was placed there some time ago, ready for this experiment.
The last cache of the page was on May 11th, just a few days before the words were removed.
After just a few days, things started to happen. When searching for the old words, they still appeared in the title and snippet. When doing a site: search, the title and snippet reflected only the new content.
Some datacentres were still showing the old May 11th cache with the old words still in there. Other datacentres were showing a cache from May 18th with them gone.
When doing a site: search you always get the new title and description, irrespective of whether the cache is from May 11th (with the words in) or from May 18th (with the words removed).
It is very obvious that the indexing data, the cache copy, and the snippet are all separate pieces of data that are updated at different times. It is also obvious that different datacentres have a different cycle and different priorities.
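To make that concrete, here is a toy Python sketch of how I picture it. It is only a mental model of the behaviour described above, not anything Google has published. The class, the function, the placeholder words and the edit date are all invented for illustration; only the two cache dates come from the observations.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatacentreView:
    """One datacentre's stored data for a single URL (hypothetical model)."""
    indexed_words: set[str]   # words the page is returned as a match for
    snippet_words: set[str]   # words shown in the title/snippet on a site: search
    cache_date: date          # date of the linked cache copy


def stale_parts(view: DatacentreView, live_words: set[str], edited_on: date) -> list[str]:
    """List which of the three pieces still reflect the page before the edit."""
    stale = []
    if view.indexed_words - live_words:
        stale.append("index")
    if view.snippet_words - live_words:
        stale.append("snippet")
    if view.cache_date < edited_on:
        stale.append("cache")
    return stale


# Placeholder words and an assumed edit date (not real data).
old_words = {"widgets", "specialword"}
new_words = {"widgets"}
edited_on = date(2006, 5, 15)  # assumption: "a few days" after the May 11th cache

# Two datacentres: same index and snippet state, different cache copies.
dc_one = DatacentreView(old_words, new_words, date(2006, 5, 11))
dc_two = DatacentreView(old_words, new_words, date(2006, 5, 18))

print(stale_parts(dc_one, new_words, edited_on))  # ['index', 'cache']
print(stale_parts(dc_two, new_words, edited_on))  # ['index']
```

The point of keeping three separate fields is that each can lag on its own, which is the pattern seen here: the index still matches the removed word everywhere, the site: snippet is already fresh, and the cache date varies by datacentre.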
The May 18th cache was later replaced by a May 20th cache in most (but not all) datacentres, and the cache is being updated to a May 23rd cache right now in most of those.
There are still some datacentres showing a May 11th cache for the page, while others show either May 20th or May 23rd.
It is still true that when searching for the words that are no longer on the page, the page is still returned as a match, and the words still show in the title and in the snippet for that occasion. The site: search shows only the new title and snippet, irrespective of what is in the linked cache.
Google has adopted some sort of short-term "sticky results", where the page is still returned for stuff that was recently on the page. This is much the same way that Supplemental Results return a page for what was on it many weeks or months ago (if it has been edited), or many months to years ago (if the page or the domain no longer exists at all).
I am wondering how long it will take Google to forget that those words were ever on that page, and whether it will slip that page into the Supplemental Index for that search term (while leaving it as a normal result for the current indexed content).
I have several of these experiments running, so will compare them all to see what happens. So far they all give the same sort of result.
If the stickiness is very short, then very few people will be misled by out-of-date information that shows in the snippet.
If the stickiness is long, then you can rank for keywords that are not on your page long after you have removed them and changed your content to something else.
I guess you would think that Google was broken if a page that you could easily find yesterday is nowhere to be found today - so they continue to rank it for a while after the content is changed, even after they have reindexed and ranked the new content too. When you visit the page itself, it is only then that you find out that the content has changed.
If this is indeed what they want to achieve by this, then it is surely being ruined anyway by the erratic nature of the SERPs these days. It would also annoy me to find that what I was promised via the snippet when clicking a result wasn't actually on the page. To me, this sounds like a search engine that is not doing its job. I want consistent, reliable and fresh results. I'm not getting this from the big G.
It is still true that when searching for the words that are no longer on the page, the page is still returned as a match ...
It is likely too soon to determine if the page will continue to be served up for searches containing the word you removed .... but are you sure that your page has no IBLs using that word in text links? If there are any such links, the page may continue to show up in search results.
I have a page on one site which contained the name of a person both in the title and on the page. That person has since died, but his business (which has been sold) still exists. Despite the fact that his name has been removed, that page continues to show up when searching for the name of the deceased, because there were several IBLs containing text links for the title of the page.
Just food for thought.
I now see the good results spreading slowly to:
"So Big - It deserves its own category."
Don't let us down...
[edited by: petehall at 11:33 pm (utc) on May 24, 2006]
I'm becoming more convinced that quoting datacenters has gotten close to useless. They have so many machines in each datacenter that the results served up are often not the same. The most obvious differences regularly show up when comparing the search?/regular way versus the ie?/mcdar way.
"I also note that the promised increase in communication seems to have fizzled out."
Maybe because the folks at the plex have no good news to announce. Or because they continue to have unresolved problems.
Personally, I now prefer the communications and feedback of my kind WebmasterWorld fellows to the latest generic, no-value communications from Google's folks.
The search results showed only the new content in both the title and snippet in a site: search, but always showed the old (deleted) content in both the title and the snippet if you made a keyword search for that old content.
Yesterday, all cache copies reverted back to the May 11th date again, and show the old content.
The title and snippet continue to operate as above; what you see depends on what you searched for. In a site: search the title and snippet do NOT match the old cache content.
Previously it was only Supplemental Results that showed this "stickiness" of old content - and that content was kept for several years.
Now, normal results show this too; and it will be interesting to see if the stickiness is a reasonable 2 or 3 weeks or an unreasonable 2 or 3 months or more...
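For anyone running a similar experiment, a rough way to score the outcome is sketched below in Python. The function name and the day thresholds are my own assumptions (21 days for "2 or 3 weeks", 60 days for "2 or 3 months"), and the dates in the usage lines are hypothetical, not measured results.

```python
from datetime import date

# Assumed cut-offs, chosen only to mirror "2 or 3 weeks" and "2 or 3 months".
REASONABLE_MAX_DAYS = 21
UNREASONABLE_MIN_DAYS = 60


def classify_stickiness(removed_on: date, last_matched_on: date) -> str:
    """Classify how long a page kept matching words after they were removed."""
    days = (last_matched_on - removed_on).days
    if days < 0:
        raise ValueError("last_matched_on is earlier than removed_on")
    if days <= REASONABLE_MAX_DAYS:
        return "reasonable"
    if days >= UNREASONABLE_MIN_DAYS:
        return "unreasonable"
    return "in between"


# Hypothetical dates, for illustration only.
print(classify_stickiness(date(2006, 5, 15), date(2006, 6, 1)))   # reasonable
print(classify_stickiness(date(2006, 5, 15), date(2006, 8, 20)))  # unreasonable
```

The "in between" bucket exists only because the gap between three weeks and two months is not covered by either description.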
Nothing may actually happen but this is something for the conspiracy theorists to chew on.
I'm seeing worse :-P
The one bright spot: Yahoo and MSN are sending more my way (not just percentage-wise, but actual visitors). It's weird how over the past week Google has been steadily dropping pages while Yahoo and MSN are both adding them, but who cares, traffic is traffic.