| This 205 message thread spans 7 pages: < < 205 ( 1  3 4 5 6 7 ) > > || |
|Supplemental Club: Big Daddy - Part 2|
< continued from [webmasterworld.com...] >
One thing to watch for: HOURLY fluctuations.
After years of slow advances, my main page reached #13 in Google for the main single KW.
Suddenly it dropped to #16, then 4 hours later it was right back at #13.
Same thing with a 2-word key-phrase. From #2 or #3 I fell back to #6.
4 hours later, like the above, it was right back.
Some of this is data center switching I'm sure.
Then again maybe they use old data while they polish up the new.
All in all, Big Daddy has not hurt my site yet (knock on wood). -Larry
[edited by: tedster at 6:48 pm (utc) on Mar. 6, 2006]
I go one step further and suggest a new forum for people that want to paste datacentres in here all day. It detracts from what is going on if you always are watching slight variations in the vast number of data centers google has. I for one find it incredibly annoying.
>>> This supplemental-problem is not specific for non-english sites, is it?
Are the <RK> values for supplemental pages that were previously normal? If so, then that makes sense. Probably best to continue in the relevant thread, though.
I know Jim is still looking into these.
The pages I have that have gone supplemental could easily be considered dupe content. Several others are url only and could also be reasonably considered to be dupe content.
On one site where dupe content went supplemental, several pages came back once unique content was added and they were recrawled. I am finding this a fun sandbox to test dupe content and at what point it becomes dupe content.
Larry - I think this observation of the "moving feast" simply depends on which data centre you're locking onto through search or otherwise.
On one we have 42,000 pages indexed [ all 1 year plus pages & supps ] , another 23,200 [ same old pages - over 1 year old & supps ] , another 974 [ 50-60% are recent pages - no supps ].
Too early to call I'd say - wait 'til the feast has settled over the next few weeks and then do the analysis
[Supplemental pages are what Google's algorithm considers unwanted on a site, for various reasons. One primary reason is duplicate content.]
Maybe - but I can see plenty of No. 1 positions held by supplemental results over non-supplemental ones for the same keyword. So in my view there are other factors that determine the importance of those supp. pages.
I have been thinking about this.
It seems to me there are a few common factors.
1. The sites affected all seem to be authority sites and white hat.
2. The sites all seem to have 10,000+ pages in Google's index.
3. No one seems to be complaining that these supplemental pages came from nowhere. They were all at one stage a part of our sites. Some pages still are part of the site. Some pages were duplicated by mistake. Some pages were published by mistake, and some pages have been orphaned.
OK, if everyone's in agreement with this, then I think we need to take responsibility for our own content. After reading the following article again, I am not sure that Google is not completely correct in keeping this content in its index to achieve its goal of mapping the internet.
I think the question is how best to manage the content that is now supplemental.
Despite what has been posted here earlier, this has been going on for years. It's not "new", and it is a kiss of death. Once a page is labelled supplemental, it stays that way forever... even if content on that same URL is crawled tomorrow and ALSO placed in the index new/fresh/non-supplemental. The supplemental listings will continue to sit in the background, as duplicates, festering and (usually) pulling down your rankings.
Something with Big Daddy may have inadvertently caused a mass of supplementals, but this has happened many times before and it has never been "fixed" for any of the other sites it has happened to. If it just happened to you, you aren't special, you have lots of company.
Proposed Solution for Supplemental issues.
Initially I just thought: 404 all these pages in supplemental. But again, this is ducking the issue.
To 404 is not claiming responsibility for your content and could damage the architecture of the internet. This leaves, in my opinion, four choices.
1. Where the page has gone supplemental because of a poor crawl, you should leave it alone and it should come back, no problem.
2. Where your supplemental result is seen because of an orphaned page, you should try to reinclude the content back into your site, unless you have duplicated it elsewhere.
3. Where you have duplicated the content, or orphaned and then duplicated the page, 301 to the replacement content or, alternatively, post a page saying "this page has been moved to <a href="newurl">new url</a>".
4. Where you can do none of the above, 404.
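For anyone who wants the four choices above in one place, here is a rough sketch of them as a lookup in Python. The names (`Cause`, `suggested_action`) and the action strings are made up for illustration; this is just one reading of the list, not anything Google has published.

```python
from enum import Enum


class Cause(Enum):
    """Why a page ended up supplemental (hypothetical labels for the four cases)."""
    POOR_CRAWL = "poor crawl"
    ORPHANED = "orphaned page"
    DUPLICATED = "duplicated content"
    OTHER = "none of the above"


def suggested_action(cause: Cause, duplicated_elsewhere: bool = False) -> str:
    """Map a cause to the action proposed in the list above."""
    if cause is Cause.POOR_CRAWL:
        # Choice 1: wait for a better crawl.
        return "leave alone"
    if cause is Cause.ORPHANED and not duplicated_elsewhere:
        # Choice 2: link the content back into the site.
        return "relink into site"
    if cause is Cause.DUPLICATED or cause is Cause.ORPHANED:
        # Choice 3: duplicated, or orphaned and then duplicated.
        return "301 or moved-page notice"
    # Choice 4: nothing else applies.
    return "404"


if __name__ == "__main__":
    print(suggested_action(Cause.ORPHANED, duplicated_elsewhere=True))
```

Whether a 301 or a "page moved" notice is the better option for choice 3 is exactly the open question here, so the sketch leaves both in the same string.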
The huge dilemma for me is the 301 issue, because it's been so abused. That's why I posted an alternative method, which again drew inspiration from the W3C. This page [w3.org...] If you go with this method, I would think it would then be necessary to have a linked list on the site of any pages that have been moved, to prevent them being orphaned.
Or I could just go down the pub. Maybe it's "just a glitch".
|OK, if everyone's in agreement with this, then I think we need to take responsibility for our own content. After reading the following article again, I am not sure that Google is not completely correct in keeping this content in its index to achieve its goal of mapping the internet.|
I don't think anyone is upset that their orphaned indexed links, duplicate URLs, etc. are in the supplemental index now. What people are upset about is that perfectly fine, indexable, recently crawled content has all been thrown out the window and put into the supp index.
It just seems strange for all these people to suddenly be in supp except for their home page. It's probably sites, including mine, that had problems with canonicalization. To throw an entire site into the supp index for this is a little scary, though.
Looking at all the links in the supp, most of them do look like they belong there. Where, though, is all my valid content? Links that were valid, highly ranked and crawled numerous times in February and March are not showing at all in the new index.
To me that's the main concern.
@Pirates: NONE of your solutions will work.
Once a page has gone supplemental Google takes NO notice whatsoever as to what happens on the site itself regarding that page, ever again (as far as updating the supplemental database goes).
They will keep their copy forever. It will not be updated, nor will it be deleted. There are still cached pages for supplemental pages out there for sites that were taken down and their hosting cancelled more than a year ago.
This other thread [webmasterworld.com] is also useful.
[edited by: jatar_k at 11:47 pm (utc) on Mar. 6, 2006]
One of our sites is "banned" too. Just today it stopped showing up. The site has not been put entirely into supplemental; rather, ONLY the supplemental pages are still in the index. Our main pages are simply not in the index, not even when searching for the URL.
Somebody wrote that GG said this is a glitch, but I couldn't find his original statement. Could somebody make me feel better and show me the way to those words? ;)
Thanks to all, it was sort of comforting to come here and find others with the same problem. ;)
"Once a page has gone supplemental Google takes NO notice whatsoever as to what happens on the site itself regarding that page, ever again (as far as updating the supplemental database goes).
They will keep their copy forever. It will not be updated, nor will it be deleted. There are still cached pages for supplemental pages out there for sites that were taken down and their hosting cancelled more than a year ago. "
So what do you do? Make all new pages and URLs?
If you put new content on the page, that content may show up as a normal result, but the page will still show up as a Supplemental Result if you search for any words that were in the old content that are not in the new content.
If you put up a new page at a new URL, Google will still show the old content, for the old URL as a Supplemental Result for ever more.
There is NOTHING you can do to influence things. The Supplemental database is a garbage dump of old stuff that shows forever, and is never updated.
The fact that "current content" is getting tossed in there right now, and for the last week or so, is a "glitch" as acknowledged by GoogleGuy only yesterday; but there have been many other problems with Supplemental Results for more than two years now, and so far there is ZERO action on fixing any of those.
I have an idea: Has anybody used the Google URL Removal tool to "delete" Supplementals?
I used this tool months ago.....
>kiss of death
I have had pages go supplemental within the last 12 months, then come back and rank. With these pages it was solely about dupe content.
The removal tool merely "hides" things for 3 or 6 months and then Google sticks them right back in public view... even if the page has gone 404, or the whole domain ceases to exist.
As the pages are added back in, there is no mechanism in the Google indexer to do even a most basic sanity check as to whether the URL should be added back in, or merely thrown away forever (because it no longer exists). All data is just added back in, as if nothing had happened (actually it isn't "added" back in, it is just "unhidden").
Err, additionally, I don't think the Google Removal Tool touches Supplemental Results at all, so the answer is still no.
It's nice to see all these interesting questions, but believe me, some of us went through this 12 to 18 months ago, and Google still have not fixed a thing.
[edited by: g1smd at 11:36 pm (utc) on Mar. 6, 2006]
" I have an idea: Has anybody used the Google URL Removal tool to "delete" Supplementals?
I used this tool months ago..... "
Yep. It works BUT in the context mentioned by previous posters.
That is, you can take the page out of whatever the current index is but it may reappear.
For example, I removed a number of supplementals and they are gone from the most recent index but, when Google for whatever reason reverts to...what? the old index, the "master index?" that never gets wiped, they are back in.
Dunno how long it takes for them to have a stake driven through their hearts.
I have an important question. I hope someone can reply.
My site is indexed, but in the Google search results, from the 47th page on, the results are only LINKS to my site. No description or title, just URL links. WHY? :(
There's always a solution.............my friend.
Just stop thinking the glass is half empty OK mate.
OK, let's get the link right this time.
It does not prevent the new page showing at all.
Yep, there is a history to the original page, but if you're mapping the internet there has to be. If you're mapping the internet, of course the original content and the dupe content will always be there in the index. By publishing the "page moved" notice and linking to it (that's vital), you eventually tell Google the new content of the page. I am not sure you can do this with a 301, as no new file has replaced the old one. Have you tried this method, g1smd?
"Go into a supplemental index" is like "go into a coma". Hope oneday it will wake up!
Are links coming off of Supplemental pages worth anything? Thanks
"I have had pages go supplemental within the last 12 months, then come back and rank."
Again, that means nothing in terms of the supplemental. The supplemental is still there, hidden from view by the freshly crawled version of the page.
Think parallel universes. Once pages go into the supplemental index they stay there forever (literally forever, so far as we know to date). You can get content on the same URL to show in the active index, but that does nothing to change the existence of the page (on the same URL) that is listed in the supplemental index.
>> Have you tried this method g1smd? <<
I have had the "luxury" of having six whole domains full of completely unrelated content (by ownership, by topic, by anything you care to measure) to play with for the last couple of years. None of them are my own sites.
For one domain, we showed Google a 404 error message for all of the pages that no longer existed, and did that for about 18 months, and they are all still showing as Supplemental Results even after all this time. The site content was combined with another site, and the original domain has now expired, but still the pages show as Supplemental Results, even six months on from that.
On another domain, once it was spotted that the site was being indexed, we put all of it behind a password login (as the site was a development server, with test content that was never meant to be indexed at all), and two years later all the pages still show as Supplemental Results. Did I mention that nearly a year ago, the domain was taken completely offline, and there isn't even a DNS record for it anymore? Yet there they are, all the pages shown in the Supplemental Index still (actually the number reduced by a few hundred a few months ago, but thousands are still there). We want them gone, because the content is incomplete, out of date, and just plain wrong. It was a test domain for testing scripting and stylesheet changes, etc, using old data for the content.
Another place that Supplemental Results occurred was a site where both .com and .co.uk accessed the same content, but the .co.uk was a 301 redirect to the .com using mod_rewrite for all pages.
All was well for about 18 months, until some moron at the hosting company deleted the .htaccess file. Three weeks later I noticed that Google had a bunch of .co.uk pages indexed, and so immediately looked to see what the problem was, found that the .htaccess file was missing, and then put the 301 redirect back online.
Nine months later, and Google still lists all those .co.uk pages as Supplemental Results. They will not go away.
Two whole sites were affected the same way, by the host meddling with the .htaccess files. On the other one, the .co.uk was then simply parked with the hosting company, as the owner is letting the domain expire and is only going to use the .com in the future. Yep. Those .co.uk pages are still listed, even though the URLs all point to the host's "Error Message" page. The .com site is perfectly listed, but we want the .co.uk to go away because they show a wrong phone number for the business and wrong pricing for the products.
(This next one is the site that I wrote about extensively, last Summer and Autumn, here at WebmasterWorld). On another domain, there was duplicate content at both www and non-www. The 301 redirect was added, and Google started to pick up more of the non-www pages and slowly delisted all the www pages. After about three months the listings were "perfect".
After another two months a whole pile of www pages suddenly re-appeared in the index as Supplemental pages and continue to show.
This is the freaky bit. The www pages show in a site:www.domain.com search but they do NOT show up in a site:domain.com search.
Lengthy correspondence with Google's HelpDesk showed that no-one understood the problem as described, and in fact the first four answers received from them were standard "cut and paste" crap that had nothing to do with the question that was actually asked. After two months of dialogue, they still didn't get it. At all.
Finally, a site that had loads of pages that didn't need to be indexed. They were all excluded using robots.txt. I just checked and the whole lot are showing as "Supplemental Results" with a cache date that is a few months before the date that the robots.txt file was added.
Google are seemingly saying to me "Heh, we know you don't want those pages indexed, so what we'll do is show our users what was on those pages at a time when they were allowed to be indexed. Here is what the page looked like a year ago."
And, if that is their policy, then it sucks. The reason the URLs were recently (if you can call a year ago, recent) added to robots.txt was simply "oh #*$!, we don't need any of that stuff crawled, or indexed at all".
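As a side note for anyone checking their own exclusions: a robots.txt rule only says what a crawler may fetch from now on; it says nothing about copies taken before the rule existed, which fits what is described above. A minimal sketch using Python's standard `urllib.robotparser` to confirm that a rule really does block a URL. The robots.txt text, the `/private/` path and the example.com URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, not taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://www.example.com/private/old-page.html",
                "http://www.example.com/products.html"):
        print(url, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

This only tells you whether the rule matches. It cannot tell you whether an older cached copy is still being shown.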
So, yes, I have tried a lot of things. You might want to follow some of the story last summer in the older update threads as well as have a look at Post #400 [webmasterworld.com] linked in that other thread, and related materials.
On several of the sites the Google Removal Tool has also been used. After the standard 3 or 6 months, all of the pages reappeared as if nothing had happened.
Google makes no check on the URL to see whether the page has changed, whether there is now a 301 redirect there, whether the page is now 404, whether the whole domain has gone offline, or even whether the domain has expired. They completely ignore the current status of the page, and just add it back into the Supplemental Results.
[edited by: g1smd at 12:40 am (utc) on Mar. 7, 2006]
"Think parallel universes. Once pages go into the supplemental index they stay there forever (literally forever, so far as we know to date). You can get content on the same URL to show in the active index, but that does nothing to change the existence of the page (on the same URL) that is listed in the supplemental index. "
Steve, I beg to differ in this instance. GG said himself that this is a glitch and will be corrected within a week or so. There are some major major sites losing all of their original content. This isn't the usual duplicate contents going to supplementals update. This is a glitch in the algorithm.
@steveb: Have you got a combination that I didn't cover in my post (#54) just above?
No mate, I see why you're down. But for me this proves the theory and the doubts I had on 301. Recently Brett banned all robots from this site, WebmasterWorld. Perhaps some of the domains you mention are so damaged that a similar action would be required. I think there is an underworld on the internet thriving on supplemental results, real scum that deliberately link their sites to it to try to trip up competing sites in the results and gain benefit, but hey, that's another story.
Clearly Google needs to give people who wish to preserve their original content for the internet some guidelines on the best procedure for controlling and managing our supplemental results, and ideally for bringing them back under the site owner's control safely.
Ok, other than site:domain.com, can I find these previously supplemental pages in Google's parallel universe?
If you have your www and non-www pages both accessible on your site (with 200 status) then you have "duplicate content". In this case you must use the 301 redirect. It will get the whole site properly listed and fully indexed as ONE version.
If you do not use the redirect you will end up with split PR and many pages that show as URL-only, and many pages that show as supplemental for both the www and the non-www version, and maybe quite a few pages that fail to get indexed at all under either version.
Once you install the redirect, one version (say, www) will rapidly become fully indexed. However, if any of the pages at the "other" version (say, non-www) are Supplemental then they will also hang around forever.
I don't think they pull down the rankings of the pages that you did want to be indexed and listed, but they do show old content in the snippet and cache, content that may be so out of date that you really do not want it showing any more.
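To make the www / non-www redirect concrete, here is a small sketch of the decision a server has to make for each request. On Apache this normally lives in an .htaccess mod_rewrite rule; the Python below only illustrates the logic. The host names (`www.example.com` and its aliases) and the function name `redirect_for` are assumptions for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hosts: one canonical name, and the aliases that serve the same content.
CANONICAL_HOST = "www.example.com"
ALIAS_HOSTS = {"example.com", "example.co.uk", "www.example.co.uk"}


def redirect_for(url: str):
    """Return (301, target_url) if url is on an alias host, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ALIAS_HOSTS:
        # Already canonical (or not one of our hosts): serve as normal.
        return None
    # Keep path and query so every page maps to its exact counterpart.
    target = urlunsplit((parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, ""))
    return 301, target


if __name__ == "__main__":
    print(redirect_for("http://example.com/widgets.html?id=3"))
    print(redirect_for("http://www.example.com/widgets.html?id=3"))
```

The point of mapping page to page, rather than sending everything to the home page, is that each alias URL then has exactly one canonical equivalent.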
This is the sort of thing that concerns me. No issue with www. http till black hats started linking and hijacking it. No issue with supplementals till black hats started using them. Who's controlling the results, Google or the scum?
I think I know more than a few black hat tricks. Maybe if we start posting them here we can teach the engine.
I understand that it now appears we are talking about two different things here.
1. We are talking about sites that just a few days ago had all pages go supplemental except the homepage. GG says that Google is aware of the problem and working on it.
2. The other issue we are talking about has to do with old pages that have been replaced by new pages or just don't exist at all anymore.
Excluding the first issue above, Google's approach to supplemental pages is very dumb in my opinion. If a page is removed, and you get a true 404 error when you try to pull it up, wouldn't it make the most sense to just remove it from Google's index after about 30 days? And what is the point of having a removal tool that only hides the page? Again, if someone asks for it to be removed and the page serves up a true 404, why not just remove it from the index altogether?
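The "drop it after about 30 days of true 404s" rule suggested above is easy enough to sketch. To be clear, this is only the policy as proposed in this post, not a description of how Google's indexer works. The function name `should_drop` and the shape of the crawl history are invented for the example.

```python
from datetime import date, timedelta

# The "about 30 days" from the post above, taken literally.
GRACE_PERIOD = timedelta(days=30)


def should_drop(observations, today):
    """Decide whether a URL should leave the index.

    observations: list of (crawl_date, http_status), oldest first.
    Returns True only if every crawl for at least GRACE_PERIOD up to
    today has returned 404.
    """
    first_404 = None
    for seen_on, status in observations:
        if status == 404:
            if first_404 is None:
                first_404 = seen_on  # start of the current run of 404s
        else:
            first_404 = None  # page came back, so reset the clock
    return first_404 is not None and today - first_404 >= GRACE_PERIOD


if __name__ == "__main__":
    history = [(date(2006, 1, 1), 200),
               (date(2006, 2, 1), 404),
               (date(2006, 2, 20), 404)]
    print(should_drop(history, date(2006, 3, 6)))
```

Resetting the clock whenever the page answers with something other than 404 keeps a short outage from wiping a live page.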
Google seems to be confused about its mission to index the web and I.A. Archiver's mission to create an archive of the web. I mean, how hard is it to comprehend that if I ask (via the removal tool or just by deleting the page) for a page to be removed, and I am indeed serving a true 404, I want that page gone from the index? It really doesn't take a PhD to understand that the page should be deleted, because it is no longer useful to me and thus certainly won't be useful to Google.
Another way to look at it is like this. If I am in a restaurant and they serve me something I don't like and I tell the waitress to take it away, would it make any sense for her to bring it back in 30 minutes to see if I am once again interested in it? Do you think anyone else would want it after someone has already picked it over? I doubt it, at least not if they are a paying customer rather than some bum rummaging through the garbage for scraps.
Could an old domain alias that I forgot about (until 1/2 hour ago) cause a dup penalty?