Google Cache: year old dates for some domains
Whitey
msg:733216 - 10:57 pm on Mar 5, 2006 (gmt 0)

Thousands of pages on our first set of websites in the English language are showing page caches more than a year old.

Added to this, many pages are simply not cached at all.

With these sites, we have done everything according to the Google webmaster guidelines.

On the other hand we have a network of the same sites translated into foreign languages and they are happily caching away every few days.

With these sites we are not fully compliant with the G guidelines - but will update them shortly.

Any ideas on how to unblock the first batch?

 

jeremy goodrich
msg:733217 - 10:35 pm on Mar 6, 2006 (gmt 0)

For issues with the Google cache showing an old date, the best-case scenario is to wait until Google sorts it out. As long as the headers on the site(s) are showing the correct Last-Modified dates for the files, you're doing all you can.
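
If you want to double-check what your server is actually sending, request a page and print the header back. A minimal sketch in Python; the URL is a placeholder for one of your own pages:

# Print the Last-Modified header a page is serving.
import urllib.request

req = urllib.request.Request("http://www.example.com/page.html", method="HEAD")
with urllib.request.urlopen(req) as resp:
    print("Status:", resp.status)
    print("Last-Modified:", resp.headers.get("Last-Modified", "(not sent)"))

If the header is missing, or frozen at an old date, that's the first thing to fix at your end.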

Also, if you can, get more links pointing to your site. That way the refresh rate will go up (as it will if your content starts changing more frequently), which should result in fewer "stale" cache dates.

Best,

Jeremy

g1smd
msg:733218 - 10:48 pm on Mar 6, 2006 (gmt 0)

>> For issues with the Google cache showing an old date, the best-case scenario is to wait until Google sorts it out. <<

If those pages are Supplemental Results then you will wait forever. Those pages will continue to appear in searches for that old content forevermore; they will not be updated.

The same URL may start to appear in searches for the current content, and show a modern cache for those searches, but for searches involving the old content (stuff that is no longer on the real page) they will continue to show a snippet (and often a cache too) that matches that old content.

Google no longer has a completely fresh database. They want to show you content that is current, as well as showing you what previous versions of a page looked like; but archive.org does it better.

Google has kept some old cached data for more than 2 years now.

Whitey
msg:733219 - 12:04 am on Mar 7, 2006 (gmt 0)

Thanks fellas

I guess as you say we just have to wait until "Big Daddy" settles down.

We submitted, via Google Sitemaps, URLs with completely new content and adjusted the URL structures. Each of the approximately 40,000 pages refreshes with new dynamic content each time it is browsed.

So the hope was to break the problem with supps and a few other things.

Previously we had a lot of dupe content [ say 40% ], common URL and template structures, interlinking between .co.uk, .ca and .com sites, plus a few other things. One of the sites is clear; the other 2 are about 2 weeks away from being revised.

When do we present ourselves to the reinclusion folks at Google, do you think, for the 1st one?

Whitey
msg:733220 - 1:30 am on Mar 7, 2006 (gmt 0)

Again, thanks - it seems we neglected to do this, so we'll put a script in place to handle the date and time headers.

The content changes on every browse.

Would you suggest any further parameters to be applied to the script?
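
For context, the sort of script we're planning would key the Last-Modified date to when the underlying data actually changed (rather than to the time of each browse) and would answer conditional requests with a 304. A rough sketch as a Python CGI; the data-file path is a made-up placeholder:

#!/usr/bin/env python3
# CGI sketch: send Last-Modified based on when the underlying data
# last changed, and honour If-Modified-Since with a 304.
import os
from email.utils import formatdate, parsedate_to_datetime

DATA_FILE = "/var/data/pages.db"  # placeholder for the real data source
mtime = os.path.getmtime(DATA_FILE)

ims = os.environ.get("HTTP_IF_MODIFIED_SINCE")
if ims:
    try:
        if parsedate_to_datetime(ims).timestamp() >= int(mtime):
            print("Status: 304 Not Modified")
            print()
            raise SystemExit
    except (TypeError, ValueError):
        pass  # unparseable date: fall through and send the full page

print("Content-Type: text/html")
print("Last-Modified: " + formatdate(mtime, usegmt=True))
print()
print("<html><body>page built from the data file goes here</body></html>")

The thinking, as I read the replies above, is that a date which changes on every browse gives Google nothing to work with, so it should be tied to real content changes.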

Whitey
msg:733221 - 11:59 am on Mar 8, 2006 (gmt 0)

While we're waiting for the mod-date script to go in, I bumped into this re supplemental results:
Things that you can control, and which may or may not help (but do them anyway): [webmasterworld.com...]

You were optimistic [ I think ] at the time that there was a chance supplemental results might be removed, and now you're not.

What about if we follow your procedure, plus a few other things: tidy the lot up per our previous post, apply new URL structures, submit to Google Sitemaps and put a reinclusion request in - will that save us?

trinorthlighting
msg:733222 - 3:40 pm on Mar 8, 2006 (gmt 0)

How are the caches for these pages on MSN? I started a thread the other day on how MSN is updating quicker. It seems to be a very large issue that Google is having now. Old information...

g1smd
msg:733223 - 8:15 pm on Mar 8, 2006 (gmt 0)

Yes, do the things mentioned in the other thread (jeez, was that really two years ago?), so that you know your site is technically working correctly, then wait for Google to fix their end.

If something on your site is broken, then do fix it.

Whitey
msg:733224 - 11:09 pm on Mar 8, 2006 (gmt 0)


g1smd -
You were optimistic [ I think ] at the time that there was a chance supplemental results "could" be removed, and now you're not.

Was my above comment correct?

What's bothering me is that there are many informed comments on the supplemental-related issues in various forums, but I can't find compact guidelines on established resolutions and, separately, on what the unresolved problem issues are.

I guess I'd better move to an on-topic forum for this one.

g1smd
msg:733225 - 11:35 pm on Mar 8, 2006 (gmt 0)

Back then, Google were saying that Supplemental Results would be re-indexed the next time the page was crawled. I had that in writing from the Google Helpdesk on multiple occasions.

Two years later, virtually NONE of those listings have been cleaned up (even though they spider and recache the pages every week). Google was wrong, and has made more and more of a mess with those listings.

Whitey
msg:733226 - 6:16 am on Mar 9, 2006 (gmt 0)

Did you follow it through with them?

On another issue it took almost a year of persistence to get a rectification - that's daunting.

I'm wondering how the help areas [ reinclusion etc ] of Google will cope with the fallout from this error, and whether it's going to make support a drawn-out affair [ more than usual ] for legit webmasters with legit problems.

Whitey
msg:733227 - 10:19 am on Mar 9, 2006 (gmt 0)

[webmasterworld.com...]

Google's 301s have not been working correctly for 8 months!

Tough news, if it's true, for those who have reorganised their sites recently!

nedguy
msg:733228 - 12:41 pm on Mar 9, 2006 (gmt 0)

Hi

First posting - I've wandered in here on MC's invitation (never realised you could register without paying). [mattcutts.com...]

I'm rather alarmed to see the scale and persistence of the old cached data problem, and the issue of live pages going supplemental, presented in these threads.

Those of us who use Dmoz know that it's rather like staring at the night sky; it's mostly accurate but it is essentially a historical snapshot of the internet as it was anything up to 2 light years ago.

That's fair enough. It's human-edited.

Search engines don't have that excuse. I think it is completely unacceptable for SEs to have any data in their systems more than a month old. But I'd settle for a 'real world' maximum age limit of up to 6 months.

When I have discussed on other forums the appearance of cached pages on Google dating from June 2005 and earlier, I've been pacified by the suggestion that Google just keeps old data, like spare building materials out in the backyard, to use for temporarily filling in holes in the database when it takes sections offline, and that it's nothing to worry about.

These threads make me think that perhaps the problem is a lot more serious than that. I had assumed that one of the key purposes of the Big Daddy rollout was to sort out the problem of old data. It is, after all, something they HAVE to do... before the public notices.

Right now search engines can, and do, get away with 10% relevancy - most ordinary folk are satisfied if they find one relevant result on the first page. The research just carried out in France (http://blog.searchenginewatch.com/blog/060307-100456) indicates they are actually averaging less than 50% relevant search results.

So poor results, in part influenced by old irrelevant data tripping duplicate content filters, don't matter too much... for now.

But what if one of them (e.g. MS - though 'Windows Live' doesn't look any different to me so far!) broke away from the pack and showed the public what 75-80% relevancy looks like?

That would take the 'can-do-no-wrong' gloss off Google's rickety serps propped up with ancient data.

NG

g1smd
msg:733229 - 12:51 pm on Mar 9, 2006 (gmt 0)

If the URL that now issues a 301 redirect is shown as a supplemental result then that result shows forever.

However, as long as the new URL for the content is now indexed and ranking, there is no real problem (as far as Google is concerned - though sometimes you really do not want old content to be showing anywhere). If, however, the new URL is not indexed and ranking, then that is a real problem...

CygnusX1
msg:733230 - 1:06 pm on Mar 9, 2006 (gmt 0)

We are also showing old content, which we don't want. I can only hope that Google will switch to our new pages soon.

Whitey
msg:733231 - 12:12 am on Mar 10, 2006 (gmt 0)

Here's the definitive test:

[#*$!.net...]

OK, I have had a bit of time to look into this problem on one site, and I find that in this case all the pages which are in the supplemental index are old pages that no longer actually exist (due to a site reorganisation eight months ago) and which were 301'd to a new location, with new content in most cases.

This confirms WilliamC's comments about Big Daddy putting 301'd pages into the supplemental index.

I have verified that clicking on the supplemental-listed link does indeed take you to the new page and that it does return a proper 301 response, so it looks like this is a case of Google not handling 301s correctly.

Quote:
#1 Server Response: [#*$!xxx.xxx...]
HTTP Status Code: HTTP/1.1 301 Moved Permanently
Date: Thu, 09 Mar 2006 06:14:10 GMT
Server: Apache/1.3.33 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.3.10 FrontPage/5.0.2.2635 mod_ssl/2.8.22 OpenSSL/0.9.6b
X-Powered-By: PHP/4.3.10
Location: [xxxxx.xxx...]
Connection: close
Content-Type: text/html
Redirect Target: [xxxxx.xxx...]

#2 Server Response: [xxxxx.xxx...]
HTTP Status Code: HTTP/1.1 200 OK
Date: Thu, 09 Mar 2006 06:14:11 GMT
Server: Apache/1.3.33 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 PHP/4.3.10 FrontPage/5.0.2.2635 mod_ssl/2.8.22 OpenSSL/0.9.6b
X-Powered-By: PHP/4.3.10
Connection: close
Content-Type: text/html

Thanks GoogleGuy. To quote: "but I guess the big question is why the hell did it happen in the first place. Don't they test out these changes before they go live with them?"
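
For anyone who wants to rerun this test, the check is easy to script: request the old URL without following redirects, confirm the 301 and its Location target, then fetch the target and confirm the 200. A rough sketch in Python; the URLs are placeholders:

# Show each hop of a redirect chain with its status code.
# Assumes plain-http URLs and absolute Location headers.
import http.client
from urllib.parse import urlsplit

def show_chain(url, max_hops=5):
    for _ in range(max_hops):
        parts = urlsplit(url)
        conn = http.client.HTTPConnection(parts.netloc)
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        print(resp.status, resp.reason, "<-", url)
        location = resp.getheader("Location")
        conn.close()
        if resp.status not in (301, 302) or not location:
            break
        url = location  # follow the redirect manually

show_chain("http://www.example.com/old-page.html")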

g1smd
msg:733232 - 1:06 am on Mar 10, 2006 (gmt 0)

Yeah, before they go fully live they test them out on SEOs simply by dropping hints about some IP address that shows a preview of the next version of the index...

g1smd
msg:733233 - 1:09 am on Mar 10, 2006 (gmt 0)

The supplemental 301 issue is well-known. Google often does not update the status of pages after you implement the 301 redirect. The old URLs hang around for years.

However, if the new URLs are indexed and ranking, there isn't much of a problem (except that you may not really want the old content to be showing anywhere at all).

Whitey
msg:733234 - 2:20 am on Mar 10, 2006 (gmt 0)

A few things:

1.
Two years later, virtually NONE of those listings have been cleaned up (even though they spider and recache the pages every week). Google was wrong, and has made more and more of a mess with those listings.

That's enough to send me north for the winter [ we're southern hemisphere ]!

Any chance next week could break the drought, given what GoogleGuy says?

2. On getting out of the supplemental index, my SEO says there are 2 schools of thought on this one [ so it doesn't sound conclusive, and certainly not reliable ].

- has anyone else had success at being pulled out of the supplemental index?

So onto the last line in our strategy - if all else fails - seek help :) - Re inclusion requests :

3. We've got some cgi jump scripts, blocked with robots.txt, on links from our pages to our partner's shopping cart, which has about 20% common content.

If we go for reinclusion with everything tidied up, is Google going to eye this and reject the application over this issue? What do you think?
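
For reference, the blocking on those jump scripts is just the standard robots.txt pattern; the directory name here is a placeholder for our actual one:

User-agent: *
Disallow: /cgi-bin/jump/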
