Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 205 message thread spans 7 pages: < < 205 ( 1 2 [3] 4 5 6 7 > >     
Supplemental Club: Big Daddy - Part 2
larryhatch

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 33386 posted 10:39 am on Mar 6, 2006 (gmt 0)

< continued from [webmasterworld.com...] >

One thing to watch for: HOURLY fluctuations.

After years of long, slow advances, my main page reached #13 in Google for the main single KW.
Suddenly it dropped to #16, then 4 hours later it was right back at #13.

Same thing with a 2-word key-phrase. From #2 or #3 I fell back to #6.
4 hours later, like the above, it was right back.

Some of this is data center switching I'm sure.
Then again maybe they use old data while they polish up the new.
All in all, Big Daddy has not hurt my site yet (knock on wood). -Larry

[edited by: tedster at 6:48 pm (utc) on Mar. 6, 2006]

 

quarryshark

5+ Year Member



 
Msg#: 33386 posted 3:23 am on Mar 7, 2006 (gmt 0)

Could an old domain alias that I forgot about (until 1/2 hour ago) cause a dup penalty?

Grinler

10+ Year Member



 
Msg#: 33386 posted 3:31 am on Mar 7, 2006 (gmt 0)

Hey GG, how about a quick update for us floundering souls here? Even a quick "we are here, we have not forgotten you, we are working on it" would make us all feel a heck of a lot better.

lammert

WebmasterWorld Senior Member lammert is a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 33386 posted 3:46 am on Mar 7, 2006 (gmt 0)

Again, if someone asks for it to be removed and the page serves up a true 404, why not just remove it from the index altogether?

Database design could be the reason for this. It is often much easier to write a fast database which allows inserts only than a database that allows inserts and deletes. In the latter case you need an administration system of the empty holes and a system to fill in those holes with new entries (pages). In the old days DBase .DBF files were a common way to store data. You couldn't delete records from a .DBF file. You could only flag it as deleted, but DBase wouldn't reuse the space again. The PACK command was needed to convert the old database file to a new one.

<speculation starts here>
The search engine index is primarily designed to store pages, pages and even more pages. The rate at which new pages appear is higher than the rate at which old pages disappear. This was at least the situation a few years ago. As a programmer I wouldn't be surprised if Google designed the database to be add-only, and solved the delete problem just as DBase did, by marking unwanted records with a flag. The way the removal tool works could be some evidence for this. For a few years now there have been scrapers capable of throwing up tens or hundreds of thousands of pages on disposable domains. Google deletes them when they are reported, but the spammers just create a dozen new sites with the same content.

This could have made the mark-to-delete system unstable or overloaded, causing the supplementals problem. I would say that the supplemental problem appeared almost at the same time as large scraper sites appeared on the internet.

Maybe the BigDaddy roll-out is a large PACK operation to compress the current index to one without the holes, or a roll-out with another database structure with easier ways to delete content.
</end of speculation>
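
To make the mark-to-delete idea above concrete, here is a minimal Python sketch of an add-only store with a delete flag and a separate pack step. It is only a toy model of the .DBF-style behavior described in this post; the class and method names are hypothetical and nothing here is known about how Google actually stores its index.

```python
class AppendOnlyStore:
    """Toy add-only record store with delete flags (hypothetical model)."""

    def __init__(self):
        # Each record is [url, deleted_flag]; slots are never reused.
        self._records = []

    def insert(self, url):
        """Append a record and return its slot number."""
        self._records.append([url, False])
        return len(self._records) - 1

    def mark_deleted(self, slot):
        """Flag a record as deleted; it still occupies its slot."""
        self._records[slot][1] = True

    def live(self):
        """URLs of all records that are not flagged as deleted."""
        return [url for url, deleted in self._records if not deleted]

    def size(self):
        """Number of slots in use, including flagged ones."""
        return len(self._records)

    def pack(self):
        """Rebuild the store without the flagged records."""
        self._records = [rec for rec in self._records if not rec[1]]


store = AppendOnlyStore()
a = store.insert("example.com/a")
b = store.insert("example.com/b")
c = store.insert("example.com/c")

store.mark_deleted(b)
size_before_pack = store.size()  # flagged record still takes up a slot
live_before_pack = store.live()

store.pack()
size_after_pack = store.size()   # holes are gone only after packing
```

In this model inserts and deletes are both cheap, but the store keeps growing until someone runs the pack step, which is the trade-off the post is speculating about.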

Web_speed



 
Msg#: 33386 posted 4:09 am on Mar 7, 2006 (gmt 0)

There is not much going to be left of Google once this update/algo change/index rebuild/page rank intensive care(?) is finally over.....boy what a mess.

This has been going on for almost a year now.... so they killed a couple of spam MFA sites, and a couple of link farms, BUT almost 85% of all other sites online got (or sooner or later will get) killed in the cross fire. Great job!

Give us back the good old 2003-2004 Google, roll back to the good old spidering/algo/filtering system....what we have nowadays is a mere shadow. A supplemental results swamp. And I am not happy saying this :(

CainIV

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 33386 posted 6:05 am on Mar 7, 2006 (gmt 0)

So basically we have a situation that Google is not crawling/indexing pages in a site.

Supplementals are often created from dupe content - titles and content - and do not depend only on whether pages are indexed daily or not. Simply put, they are what steveb has pointed out: a massive set of black hole garbage which continues to linger and affect sites in all the wrong ways. Google creates them when for some reason it does not deem them useful to the set of pages which comprise any given site.

For some reason lately either (a) a threshold has been hit, allowing many more pages to enter supplemental hell than usual, or (b) there is a massive glitch at Google. From what we hear, (b) is most likely.

To date I have never had a page leave supplemental status that I know of.

As per the Google removal tool, this is another black hole tool designed to make you think your page has been cleanly removed from the index, when in fact, it now also resides in the other realm with little to no chance of further being removed or 404'ed. Pages removed 7 months ago using that tool now show in the Google index for me.

Just my 0.02

Kirby

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 33386 posted 6:20 am on Mar 7, 2006 (gmt 0)

>To date I have never had a page leave supplemental status that I know of.

Perhaps what I am seeing today is from a different database than what I saw last week. I don't know because I didn't think to check the DC last week, but I had pages go supplemental last week that are back today.

Whether or not Google has copies marked "Supplemental" in some black hole is beyond me, but I do know that when I corrected the dupe content issue, the pages were no longer marked "Supplemental" in site:domain.com query on a BD dc today.

Halfdeck

5+ Year Member



 
Msg#: 33386 posted 6:30 am on Mar 7, 2006 (gmt 0)

I think the point that some are making here is not that supplementals can't return to the main index but that Google has a separate database of docIds, urlHash, cache dates, and HTML code, and that once a record is inserted into this database it's never deleted.

Ride45

10+ Year Member



 
Msg#: 33386 posted 7:01 am on Mar 7, 2006 (gmt 0)

Yeah, it's 2 separate issues:

1. The fact that Google has a supplemental DB altogether with pages that should no longer exist in any index.

-> this one has been an ongoing conversation for 2 years

2. The fact that many webmasters with a high number of unique pages, who follow Google's webmaster guidelines, have suddenly found those pages part of the supplemental DB and out of the main index.

-> This has caused the most immediate concern and sparked this thread, but it's one that Google is now aware of and, as MC states, will be addressed within a week or so.. so you may as well just relax until you hear back or see the BD results updated to have your pages back in the main index.

It's a temp thing.. not a penalty. If you have valuable content that is worthy of the main index, it will appear back in the main index.. It's just a matter of time. Obviously we don't know how much time and that is what creates the most anxiety.. but time will tell.
It's not a duplicate content thing either.. Many legitimate sites, the source of great content and with 301 redirects, are in the supp club right now too until Google looks at and solves the glitch.

steveb

WebmasterWorld Senior Member steveb is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 33386 posted 7:19 am on Mar 7, 2006 (gmt 0)

"but I had pages go supplemental last week that are back today."

As stated above several times, that means nothing. Your current listings may be "back" but the supplementals are still there too. You can't get rid of them (minor note, there are two sets of supplementals on different datacenter groups, so depending on what datacenter you hit you could see a different batch).

We aren't talking about different things, although some people don't understand the issue. Regardless of what happened the other day, you now have supplemental listings, and that is poisonous (often only mildly so though). Getting freshly crawled pages back does nothing at all to eliminate the supplemental listings you now have. (In fact, getting freshly crawled pages back is generally easy, but it is basically irrelevant to the problem.)
==

<g1smd, I'd have to think about it some more some other time>

Whitey

WebmasterWorld Senior Member whitey is a WebmasterWorld Top Contributor of All Time 5+ Year Member



 
Msg#: 33386 posted 7:46 am on Mar 7, 2006 (gmt 0)

Sorry to sound simple, but looking through these threads, am I right to say that Google's supplemental index isn't working as it should?

Why would pages with robots.txt or 301's on them still show in the results ...ya ...de ....dah...etc ....

I can see the same with some of our old sites.

Since this is an overall quality issue for users [ at least ] wouldn't it be wise of Google to come clean and tell us what's going on and when/if they are going to fix it?

arubicus

10+ Year Member



 
Msg#: 33386 posted 8:00 am on Mar 7, 2006 (gmt 0)

Steveb

You know, sometimes your posts just tick me off. Maybe it is because there is some, or rather a lot of, truth behind what you say? Maybe I just wish you weren't correct.

From my experience even when seeing stuff get refreshed it seems once these pages go supplemental they just keep coming back to haunt us. You know I don't understand it. I don't understand why old deleted pages keep showing up 1-2-3 years later. I just wish the crap would stop and things can just get back to normal.

Enough is enough already! I just wish sometimes that a bunch of huge mega branded sites would take a huge hit and maybe put a little press and investor pressure on to get a stable fix going. Then again, if a huge mega site gets hit, at least their site would get fixed in no time flat. As for us small players...we are out in the cold except for a few prayers and luck.

I believe deep down that G is trying to fix all of this. It just seems that this is taking way too long. Maybe it is just too sophisticated for a quick fix. Heck if I know. Many of us have been in supplemental hell for at least 6 months to a year.

Dayo_UK

10+ Year Member



 
Msg#: 33386 posted 9:22 am on Mar 7, 2006 (gmt 0)

Hmmmz

People still getting confused.

SteveB saying that it is a parallel index is fair comment - but people still assuming that these pages are bad/banned/duplicate just because they got a supplimental tag are wrong IMO.

These pages have been crawled and cached in the supplimental index but not been crawled or cached in the normal index.

A page can fail to be crawled or cached in the normal index for a number of reasons - depth of crawl, lack of backlinks etc.

This is why a lot of supplimentals are junk as they are found by what seems a more extensive (and possibly flawed) crawl than the normal crawl - eg you get the // and the non-www still appearing, the normal crawl may not fetch these pages - therefore you only see them as supplimental results.

The greatest page in the world can go supplimental if all links are removed and if it no longer gets fetched by the normal crawl.

Stick a few links on it and it will get crawled by the normal Googlebot and all will probably seem good again - but that supplimental page will still be there too as has been proved by tests by G1smd.

The question that people should be asking themselves is why Google is no longer listing their pages from the normal crawl, as it is these that have disappeared, rather than why the pages are going supplimental (as a supplimental copy was probably already there).

The site is not entirely put on suplemental, it's rather, that ONLY the suplemental are still in the index.

Bingo!

Ellio

5+ Year Member



 
Msg#: 33386 posted 10:02 am on Mar 7, 2006 (gmt 0)

Dayo,

Based on your post above, why have the "supplementals" suddenly appeared in the site: searches?

I agree they were probably already there in a separate index, but prior to the loss of all pages other than the home page the supplementals DID NOT show in a site: search. In our case anyway.

Ellio

5+ Year Member



 
Msg#: 33386 posted 10:05 am on Mar 7, 2006 (gmt 0)

Ledfish,

"1. We are talking about sites that just a few days ago had all pages go supplemental except the homepage. GG says that Google is aware of the problem and working on it. "

Sorry if I am not keeping up but where did GG post the above?

Thanks

Armi

10+ Year Member



 
Msg#: 33386 posted 10:27 am on Mar 7, 2006 (gmt 0)

@ellio

GG wrote

""I'm fine to deny this, because docids and their size has nothing at all to
do with what people have been describing on this thread. I've been reading
through the feedback, and it backs up the theory that I had before I asked
for feedback.
Based on the specifics everyone has sent (thank you, by the way), I'm pretty
sure what the issue is. I'll check with the crawl/indexing team to be sure
though. Folks don't need to send any more emails unless they really want to.
It may take a week or so to sort this out and be sure, but I do expect these
pages to come back to the main index."

Ellio

5+ Year Member



 
Msg#: 33386 posted 10:29 am on Mar 7, 2006 (gmt 0)

>>>>>>>>>>>>""I'm fine to deny this, because docids and their size has nothing at all to
do with what people have been describing on this thread. I've been reading
through the feedback, and it backs up the theory that I had before I asked
for feedback.
Based on the specifics everyone has sent (thank you, by the way), I'm pretty
sure what the issue is. I'll check with the crawl/indexing team to be sure
though. Folks don't need to send any more emails unless they really want to.
It may take a week or so to sort this out and be sure, but I do expect these
pages to come back to the main index." <<<<<<<<

Was this on Webmaster World or Matt Cutts blog?

Ellio

5+ Year Member



 
Msg#: 33386 posted 10:33 am on Mar 7, 2006 (gmt 0)

incorrect post

angiolo

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 33386 posted 10:39 am on Mar 7, 2006 (gmt 0)

It is on WebmasterWorld.

have a look at GG posts:

[webmasterworld.com...]
[webmasterworld.com...]

Do not send emails anymore to GG....

yves1

10+ Year Member



 
Msg#: 33386 posted 10:43 am on Mar 7, 2006 (gmt 0)

The site is not entirely put on suplemental, it's rather, that ONLY the suplemental are still in the index.

This is exactly what I am seeing (except for the home page).

In the case of one of my sites affected by this problem, the supplemental pages are old URLs that were indexed almost one year ago by my mistake (query string added at the end of the URLs in internal javascript links). These query stringed URLs displayed the exact same content as the "normal" (non query stringed) URLs, and this is probably why they were put in the supplemental index.

Of course the javascript links to the query stringed URLs were removed a long time ago. But it was too late: once URLs are in the supplemental index, there is nothing you can do to remove them.

The weird thing is that up to now, these supplemental pages were not a problem for the site. They existed but the normal (non query stringed) pages were in the index and ranked well on their keywords.

The new thing is that now all the normal pages except the home page are gone from the index. Only supplemental pages are present.

So in my case what I am seeing is just the canonical URL/ duplicate content problem worsening.

Armi

10+ Year Member



 
Msg#: 33386 posted 11:13 am on Mar 7, 2006 (gmt 0)

"incorrect post"

Why?

Dayo_UK

10+ Year Member



 
Msg#: 33386 posted 11:24 am on Mar 7, 2006 (gmt 0)

Ellio

Good point - Google seems to have thrown more supplimentals into the mix at the same time that the normally crawled pages problem arose.

There was obviously some update/change that resulted in this happening - but we need to make sure that people don't think that supplimental pages are replacing normal pages - it is just normal pages disappearing and only supplimental pages left.

Armi - did you send your site to GG?

I think the incorrect post comment was in reference to Ellio question of where GG said it. Probably found the post :)

I still think the root of the issues is the allocation of PR - as said earlier we have about 3/4 sets of PR floating around the DCs - and if <rk> is PR then Big Daddy has a further different calculation.

Grinler

10+ Year Member



 
Msg#: 33386 posted 12:03 pm on Mar 7, 2006 (gmt 0)

The new thing is that now all the normal pages except the home page are gone from the index. Only supplemental pages are present.

Same here... when going through the supplemental pages I see old link styles that I used about a year ago. They deserve to be in supplemental as most do not exist anymore.. The problem is that my site's normally indexed pages are missing from the index altogether, like the others here. Before BD, I had 452k pages indexed..now I have 174k, all supplemental. Where's the rest of my content?

This is obviously a glitch. I think people are spending WAY too much time harping on the supplementals. Forget them. People are saying once they are in the supp index they are not coming out. Fine..looking at them I don't really care about these links anyway. This is probably legit and there is nothing we can do about it.

The real problem is where did our legit crawled/indexed pages go? That's the most pressing question...

webspud

5+ Year Member



 
Msg#: 33386 posted 1:23 pm on Mar 7, 2006 (gmt 0)

On Big Daddy servers our site too is now only listed by its home page (and one other) and the rest of the site has gone supplemental.

Dayo_UK said:
>>>... - but we need to make sure that people dont think that supplimental pages are replacing normal pages - it is just normal pages disappearing and only supplimental pages left. <<<

I have just checked this and it is not the case with our site.

On non-BD servers -- 495 pages are listed (which is probably about right).
Of these 18 are supplemental.
Of these 5 no longer exist and return 404 header responses.
10 are listed without the www. (since last autumn we have had a non-www. to www. 301 redirect in place in our .htaccess file).
3 are listed with the www. (so should have been listed ok)

On the Big Daddy servers -- 77 pages are listed.
The home page and one other are ok, all the rest are supplemental.
I haven't cross checked them all, but a random check confirms that pages that were listed ok on the non-BD servers are now listed as supplemental on the BD servers.

Also interesting is that after about the first 19 entries, all the supplemental listings are shown without the www. (even when they had the www. on the non-BD server).

I'm not sure what this shows, but I hope it helps someone fathom out what is going on.

Dayo_UK

10+ Year Member



 
Msg#: 33386 posted 1:33 pm on Mar 7, 2006 (gmt 0)

>>>I haven't cross checked them all, but a random check confirms that pages that were listed ok on the non-BD servers are now listed as supplemental on the BD servers.

Yes, but this is not the case that the supplimental page has replaced the normal page - it is the case that the normal page is no longer in the index so the supplimental page is now visible.
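
The distinction being argued here can be shown with a small Python sketch. It is only a toy model of the claim in this post, assuming two separate lookups where the main index is preferred and the supplemental copy shows only when the main entry is missing; the function name, the dictionaries and the URLs are all hypothetical.

```python
def visible_listing(url, main_index, supplemental_index):
    """Return (source, cached_copy) for a URL, preferring the main index."""
    if url in main_index:
        return ("main", main_index[url])
    if url in supplemental_index:
        return ("supplemental", supplemental_index[url])
    return (None, None)


# Both indexes hold a copy of the same page from the start.
main_index = {"www.example.com/page": "fresh copy"}
supplemental_index = {"www.example.com/page": "old copy"}

before = visible_listing("www.example.com/page", main_index, supplemental_index)

# The normal page drops out of the main index; nothing is moved or replaced.
del main_index["www.example.com/page"]

after = visible_listing("www.example.com/page", main_index, supplemental_index)
```

In this model the supplemental entry is never created or changed by the drop. It was there all along and simply becomes the only thing left to show.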

pgillman

5+ Year Member



 
Msg#: 33386 posted 1:58 pm on Mar 7, 2006 (gmt 0)

Dayo_UK, you are correct.

Using the site:domain.com -www.domain.com command on non BD centers will return all of the non www prefixed pages, whereas the simple site command only shows www prefixed pages. The non www pages are there in supplemental prior to BD - we're just not seeing them with the normal site command. My suspicion is that with large sites (>1000 pages), people were not seeing supplementals because Google stops displaying after 1000. Our site counts have been inflated for the last several years.

Current problem in BD appears only to be that index contains just HP and all other pages have been dropped. Appearance of supplementals does not mean pages have been moved from main index to supplemental. All of OUR supplemental entries appear to be related to canonical issues (non www, tracking code, double slashes, etc, with some of our pages appearing 4 or 5 times in the index). These were present prior to our experiencing this current BD problem.

We first wrote to Google about this in May 2005, then again in October, and finally in November. A Google engineer wrote to us in January 2006 saying they were looking for a permanent solution to fix some of these problems.
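
For anyone auditing their own URLs for the canonical issues listed above (non www, tracking code, double slashes), here is a rough Python sketch that collapses such variants to one form. It assumes the www host is the preferred one and that the whole query string is tracking code, which will not be true for sites where the query string selects the content; the function name and example URLs are made up for illustration.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url):
    """Collapse common duplicate-URL variants into one canonical form.

    Assumptions (hypothetical, adjust for your own site):
    - the www. host is the preferred one
    - the query string carries only tracking codes and can be dropped
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host
    # Collapse runs of slashes in the path, e.g. //widgets/ -> /widgets/
    path = re.sub(r"/{2,}", "/", parts.path) or "/"
    # Drop the query string and fragment entirely.
    return urlunsplit((parts.scheme, host, path, "", ""))


variants = [
    "http://www.example.com/widgets/",
    "http://example.com/widgets/",
    "http://www.example.com//widgets/",
    "http://www.example.com/widgets/?track=abc",
]
unique = {canonicalize(u) for u in variants}
```

Running a list of indexed URLs through something like this gives a quick count of how many listings are really the same page under different addresses.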

webspud

5+ Year Member



 
Msg#: 33386 posted 1:58 pm on Mar 7, 2006 (gmt 0)

>>>Yes, but this is not the case that the supplimental page has replaced the normal page - it is the case that the normal page is no longer in the index so the supplimental page is now visible.
<<<

But that supplemental page was not visible on the Non-BD server, are you saying that there are heaps of supplementals on the Non-BD servers that just do not show?

rookiecrd1

10+ Year Member



 
Msg#: 33386 posted 2:09 pm on Mar 7, 2006 (gmt 0)

This is my exact problem as well. The pages that were once indexed (probably 300,000 or so) are all gone besides the homepage, therefore only supplemental pages are showing. These pages are so old (11-16 months old) and are from when I used entirely different forum software (IPB instead of VB). These pages deserve to be in the supplementals and I have no problem with that; however, where the heck are my 300,000 active live pages now?

That's the question I think everyone has.

oddsod

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 33386 posted 2:34 pm on Mar 7, 2006 (gmt 0)

Off topic: Has anyone noticed that when using Google for your usual, everyday searches you're getting more and more results which are marked supplemental (it's NOT supplimental)?

If you do this search [google.co.uk], for example, almost 50% of the results on the first page are supplemental. Interestingly, Google does have non-supplemental results on the subsequent pages but is insisting that of the top hundred most relevant results almost 50% are from the supplemental index.

2007 News Flash: Webmasters all strive to get into supplemental index? :)

webspud

5+ Year Member



 
Msg#: 33386 posted 2:35 pm on Mar 7, 2006 (gmt 0)

Using the
site:domain.com -www.domain.com
search on a non-BD server as suggested by pgillman, I now get 23 pages listed.

This does not account for the other 50 odd pages that show as supplementals on the BD servers, so are there other supplementals on non-BD servers that I can't see?

maha

10+ Year Member



 
Msg#: 33386 posted 2:44 pm on Mar 7, 2006 (gmt 0)

Are you guys seeing more and more BD Data center results (with supp pages) now or just me?

I'm seeing Big Daddy results slowly migrating to non-BD data centers. Soon all DC will have the BD results - with all supp. pages... :-(

pgillman

5+ Year Member



 
Msg#: 33386 posted 2:46 pm on Mar 7, 2006 (gmt 0)

Webspud,
My example just shows the www versus non www index entries in supplementals. Another test is to look at one of your other supplemental pages (as shown on BD) with an old cache date (one that differs from your current page in content) and search on a snippet from this page (unique to the old dated page) using the non BD center. That's how we've found the other pages that now appear on the BD center and also appear prior to BD.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved