Forum Moderators: Robert Charlton & goodroi
Major Change in Supplemental Result Handling today:
Over the last 18 to 24 months, I have written many times about how a page can appear as a normal result for search terms that are located on the current version of the page, and as a Supplemental Result when you search for words that were on the previous version of the page (but are no longer on the current version of the page).
In the latter case those "old" words also appear in the snippet. In both cases (old search and new search) the cache is usually just a few weeks old, so it never shows any of the words associated with the "old search".
As of today, the new search is still linking to the new cache, but the "old search" now brings up a cache that is dated just one or two days before the date of the last change of content on the page, and therefore the cache DOES now show the old words from the old content.
This is a new thing today, and Google has NOT worked like that at any time in the last two years or more. So, rather than get rid of old supplemental results, Google now gives them more space on their server, now actually keeping the old cache copy for them alive too.
I was hoping that old indexed data with no matching cached page was going to get deleted from Google's index in their current tidy-up.
However, what they have chosen to do is not to delete it, but to keep an older copy of the cache to go with it. This is in addition to keeping a new copy of the cache in the normal index.
I have seen this effect on a large number of pages today. It doesn't happen for all sites; maybe not all of the data is complete yet?
Google begins to look more and more like archive.org every day.
So, if you alter a page, Google will return that page for the current content, but it will also return that page if you search for the previous version of the content. Before today, you could only see a modern copy of the cache. Now, you get to see either a new copy or the old copy depending on exactly what you searched for.
I'm seeing 2 sets of DCs each with different sets of results. Interestingly one is using the DMOZ title for the page and the other showing the actual page title.
My study is pretty limited. The DC watch experts would have a better picture of this.
Could this be the beginnings of a real update or is there such a thing anymore?
I am starting to see the problem I think.
I told ya yesterday that I had a feeling that something good is happening and Brett & Matt might have something to announce soon.
And it happened: Brett is here with his announcement, and we are just waiting for a few words from Matt too.
Keep the good news coming ;-)
Long live WebmasterWorld community!
What is depressing though is sitting down to do a few test searches for terms no longer on a site. It brings up a lot of supplementals for which there is a current (non-supplemental) listing also. It's just sad to see a search engine deliberately serve up false results. Google knows, via crawling within the past 24 hours even, these words are no longer on the page (and nothing links with those words) but still they show results they know are a lie.
Of course the engineers might be panicking right now and dumping the archive back into the goop right this minute...
Then I did a search on "place widgets[popular misspelling]"
This dug up around 90% supplementals and buried the good results, where previously the results were good.
If the ongoing volatility is driving webmasters to the "funny farm" - do you really think users will persist with this?
I think it warrants some serious Google dialogue again. Confidence is urgently required :)
There must be something in this, as in Google's words, a supplemental result is one that doesn't meet certain criteria to be included in the main index.
Now, if that is the case, does this mean that the product is not up to standard, the page is not up to standard, or that the URL, header, content, or type of script (HTML, CSS, etc.) does not conform?
Wow, this is mind boggling, when a single result doesn't show as non-supplemental!
Did a site:www.oursite.com in the "normal" Google.com. First result is the home page, all looks fine. It is listed as www.oursite.com/. The title is correctly our title and text snippet comes from Google directory = dmoz.
I scroll down and the tenth result is also the home page, listed as www.oursite.com/home.aspx. But listen to this: the title of the search result consists of a row of six links that sit at the very top of our page! Like this:
Home ¦ Whatever Page ¦ Another Page ¦ Etc
The text snippet consists of the first nine links in the left column of our page. That is of course less weird, but I haven't seen that before either.
I mouseover the cached link and get [72.14.203.104...] but if I go directly to that DC I won't get the same result when performing a site:www.oursite.com
Google will pull information from the on-page text, or from the ODP, if those elements are missing or sub-standard.
Check whether the page has the title and meta description properly filled in (check that the tags are correctly opened and closed too, it might be a tag typo) and then check what Google has in the cached version of the page (and how old it is).
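For anyone who wants to automate that first check, here is a minimal sketch using only Python's standard library. It is just one way to do it, not anything Google publishes: the names `HeadCheck` and `check_head` are made up for this example, and it only looks at the title and meta description, not at the cache.

```python
from html.parser import HTMLParser


class HeadCheck(HTMLParser):
    """Collects the <title> text and the meta description of one page."""

    def __init__(self):
        super().__init__()
        self.in_title = False      # True while inside an open <title>
        self.title = ""
        self.description = None    # stays None if no meta description is seen

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def check_head(html):
    """Return a list of human-readable problems found in the page head."""
    parser = HeadCheck()
    parser.feed(html)
    parser.close()
    problems = []
    if not parser.title.strip():
        problems.append("title is missing or empty")
    if parser.in_title:
        # A <title> was opened but no </title> was ever seen (tag typo).
        problems.append("title tag is never closed")
    if not parser.description:
        problems.append("meta description is missing or empty")
    return problems
```

Feed it the raw HTML of the page; an empty list means the title and meta description are both present and the title tag is closed.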
[edited by: g1smd at 2:10 pm (utc) on May 2, 2006]
Yes everything is correct. Read my post again. The home page appears as #1 in the search result looking 100% normal but then comes AGAIN on #10 looking weird.
The first home page is cached 27 April and the second one on the 28th, they are identical and nothing was changed on the page.
[edited by: Susanne at 2:16 pm (utc) on May 2, 2006]
My homepage seems to have lost its recent cache of 21st (I think) of April 2006 and reverted back to March 2005 in some DCs while in others there is no entry.
This is not a new phenomenon as it used to happen all the time - also some of us will remember it happened to MC blog.
The only thing that kept me going recently was the new calculation of PR which looked like PR had returned to my site - however, as pointed out in other threads even banned/delisted sites are showing PR now so bang goes that hope.
Why Google has started to show PR for these sites and cache again (it appears to be old cache for the sites I know about) I cannot even attempt to guess.
It is very important that when you link to an index page that you never mention the index file filename in the link. End the link with the trailing / on the end of the URL. Omit the actual filename.
It has become apparent that search engines don't punish for duplicate content, they simply ignore the duplicate page. And I have not noticed that our page rank has suffered due to duplicate home pages. Actually I think we have a handful of home pages listed, and touch wood, we are doing fine!
To get back to the weird search result, any other ideas?
Google should/will learn the canonical version and ignore the other versions for indexing purposes.
Links to any version of the page should then be treated as one and the same.
If this were not the case, every site would have duplicate homepages listed on a site: search, as external links tend to use various versions of the full URL.
The best advice, as in the post above, is to always link internally to www.mysite.com/ or www.mysite.com/folder/ using absolute links and no index.htm or home.htm. Ask external sites linking to you to do the same, even in AdWords etc.
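The linking advice above can be sketched as a small helper that rewrites a link before it goes into a template. This is only an illustration: `canonical_link` and the `INDEX_FILENAMES` list are assumptions for the example, and the list has to match whatever your own server treats as the default document for a folder.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default-document names; adjust to match your own server setup.
INDEX_FILENAMES = {"index.htm", "index.html", "index.php", "home.htm", "home.aspx"}


def canonical_link(url):
    """Strip a default-document filename so the link ends in a trailing slash."""
    parts = urlsplit(url)
    path = parts.path or "/"                    # bare domain gets the root slash
    folder, _, filename = path.rpartition("/")  # split off the last path segment
    if filename.lower() in INDEX_FILENAMES:
        path = folder + "/"                     # keep the folder, drop the filename
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(canonical_link("http://www.mysite.com/folder/index.htm"))
```

Links to ordinary pages pass through unchanged, so it should be safe to run over every internal link. Only filenames that the server really does serve for the bare folder URL belong in the list.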
"I mouseover the cached link and get [72.14.203.104...] but if I go directly to that DC I won't get the same result when performing a site:www.oursite.com "
The same thing happened to several of us. When I asked, a few kind members explained that behind each datacenter's IP there might be several boxes serving different SERPs.
Matt Cutts mentioned something to the effect of:
- different datacenters get different data at different times.
- sites might rank differently on each datacenter.
So it all depends upon which datacenter (and box) you are hitting and at what time of day.
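If you note down the results you see on two datacenters by hand, a few lines of Python will show where they disagree. This is only a rough sketch for comparing two lists you already have; `rank_differences` is a made-up name and it does not query Google in any way.

```python
def rank_differences(results_a, results_b):
    """Map each URL to its (position_a, position_b) where the two lists disagree.

    Positions are 1-based; None means the URL is absent from that list.
    """
    pos_a = {url: i for i, url in enumerate(results_a, start=1)}
    pos_b = {url: i for i, url in enumerate(results_b, start=1)}
    diffs = {}
    # Walk every URL seen in either list, keeping first-seen order.
    for url in list(pos_a) + [u for u in pos_b if u not in pos_a]:
        a, b = pos_a.get(url), pos_b.get(url)
        if a != b:
            diffs[url] = (a, b)
    return diffs
```

An empty result means both lists rank the same URLs in the same order.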
I called it at that time "Random Serps" ;-)
I hope this helps.
"I think that reseller, being the jovial person he is, has 1st May in Denmark as April Fool's Day, and is teasing everyone ;-) "
C'mon give me a break :-)
Ok. The difference between some of you friends and me is: I read Brett's lips, you don't ;-)
And for the first time, and only on WebmasterWorld, this thread is on the homepage!
Honestly, wasn't that an announcement ;-)
The difference between some of you friends and me is: I read Brett's lips, you don't ;-)
Vagueness & statements that could be taken as insulting aside, you imply that Brett has already made some sort of announcement, reseller. My & others' attention has been caught, and you have been asked to clarify. Once again I ask, what are you talking about?
And it happened, Brett is here with his announcement [...]
But, where?
<added>
Doesn't seem to be here: User profile: Brett Tabke [webmasterworld.com]
</added>
Such fun and to think I would have been on vacation if not for our sick cat. This is an addiction. ;)
"Official" and informational sites have suddenly outranked commercial sites that were placed highly in the SERP
This is exactly what I am seeing. Great for me with a top informational site. It puts my biggest site homepage at #3 for my most valuable key. Sure I know I make most of my $$ and get most of my visitors through the long tail. It's really just an ego thing. But it still feels great!
[edited by: annej at 7:43 pm (utc) on May 2, 2006]
The latest changes to Google Everflux are very encouraging for the sector I watch, where the sites which used hidden keywords and cloaking have disappeared from the results (from having been in the top 10 during BD) for major keyword1-keyword2 searches.
But what interests me most, is that the canonical problem for my site appears to be finally resolved, with a site:www.mysite search showing the home page at position 1, which is the logical place for an index page in a site search. Previously, directories were appearing before the index page, and serp positions very much reflected where my home page appeared in a site search. There appeared to be a definite connection between the two, as all the sites previously appearing ahead of me in the serps had their home pages correctly identified by google as the entry point to the site.
Now that Google has finally recognised the index page of my site as the entry point, it has also had a positive effect on allintext, allinanchor and allinurl searches, where the site has moved up a number of notches and is now back in the top 5.
The combination of correctly identifying my index page and the knock-on effect that had on the allin commands has brought my site back into its rightful positions in the SERPs.
The results in my sector (travel) now also reflect a far better balance between commercial and information sites, with information sites garnering 10% to 20% of the top 20 results, which is an acceptable mix for a search where the user is ultimately wanting to buy/book a product or service.
These swing the opposite way when adding a word such as 'info' or 'guide' to the search string, with the educational and information sites rightly taking the lion's share of results.
Very encouraging changes all round.