Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 265-message thread spans 9 pages (page 8 shown).
Big Daddy Part 5
GoogleGuy




msg:705709
 5:43 pm on Mar 13, 2006 (gmt 0)

continued from:
[webmasterworld.com...]


Okay, quite a few people should see their pages coming back. If you haven't seen any change (that is, if your pages are still supplemental), I'd like to look into that too so that I can see if there's any common factor remaining.

So: if your pages are still supplemental, feel free to write to sesnyc06 [at] gmail.com with the subject line of "stillsupplemental" (all one word), and I'll ask someone to check the emails out.

Hope that helps, and I'm glad that lots of people are seeing a full recovery,
GoogleGuy

[edited by: Brett_Tabke at 5:20 pm (utc) on Mar. 22, 2006]

 

texasville




msg:705919
 7:25 am on Mar 23, 2006 (gmt 0)

Has anyone else seen this? Checking my referrer logs, I look at the ones from google and the snippet is my meta keywords tag. I have not seen this before. An anomaly?

Liane




msg:705920
 7:32 am on Mar 23, 2006 (gmt 0)

Yes, I saw it once in my logs (though it wasn't my keywords tag) and I can only assume it was the company who decided to steal my "Description" metatag. Several weeks after the search appeared in my logs, my description tag showed up on a competitor's home page as the first line of text! Very annoying to say the least.

hvacdirect




msg:705921
 8:08 am on Mar 23, 2006 (gmt 0)

See Matt Cutts's latest blog entry (you all know where): down to 1-2 data centers according to him, plus an assertion that it's only a software upgrade.

ScottD




msg:705922
 8:21 am on Mar 23, 2006 (gmt 0)

Bigdaddy is a software upgrade to Google’s infrastructure that provides the framework for a lot of improvements to core search quality in the coming months

I understand that as saying that SERP changes will be introduced on BD in the short term, and seeing our positions lately, I think it's fair to infer that changes are already incorporated.

I think this is going to be the longest update saga in the history of long update sagas.

Liane




msg:705923
 8:23 am on Mar 23, 2006 (gmt 0)

I think this is going to be the longest update saga in the history of long update sagas.

LOL! I think it already is ... isn't it?

Ellio




msg:705924
 10:27 am on Mar 23, 2006 (gmt 0)

Matt Cutts has just posted a new thread dedicated to the "Gone Supplemental" issue.

http:// www.mattcutts. com/blog/gone-supplemental/ (spaces added)

Some site owners over at WebmasterWorld have been discussing an issue where on Bigdaddy data centers, the site wouldn’t be crawled as much in the main index. That would result in Google showing more pages from the supplemental results for that site. GoogleGuy requested feedback with concrete details, and several people responded with enough details that we identified and changed a threshold in Bigdaddy to crawl more pages from those sites.

I checked in that email queue tonight to see how the “gonesupplemental” feedback looked. I looked at an emergency responder site, a truck site, a ticket site, a karate site, a silver site, a T-shirt site, a site about memory, a site selling a type of document, a boating site, and a jewelry site. All were getting more pages crawled, and I expect over time that we’ll crawl more pages from these sites and similar sites that people mentioned. The biggest site that I saw had 711K pages reported, and I saw other sites with 40,400 estimated pages and 52,700 estimated pages for a site: search.

So the upshot is that if you’re one of these people who was paying attention to this issue, I think it has already improved quite a bit, and I would expect to see more pages indexed in the coming week or two. Some sites may see improvements earlier than others because of where a site happens to be in Google’s crawl cycle.


Porter5Forces




msg:705925
 10:44 am on Mar 23, 2006 (gmt 0)

Does that mean a 100% crawl is needed before the indexed pages can be found in the SERPs?
If that's the case, a large site may suffer and smaller sites can see the results in the SERPs earlier.

broker_boy




msg:705926
 10:50 am on Mar 23, 2006 (gmt 0)

We're getting crawled by both 2.1 and moz bot but the pages aren't being added at all

Go figure

BB

HiltonHead




msg:705927
 2:29 pm on Mar 23, 2006 (gmt 0)

Matt Cutts has just posted a new thread dedicated to the "Gone Supplemental" issue.

Ok, so more are getting crawled, but why did millions go "Supplemental" in the first place?
What is the glitch and has it been eliminated? What's up with the Supplemental index anyway?
Google should fully index sites or delete them, not throw them into some cesspool assuring they
will not be accessed by searchers.
Matt's thread is appreciated but it is half a loaf.

frakilk




msg:705928
 2:41 pm on Mar 23, 2006 (gmt 0)

>>> GoogleGuy requested feedback with concrete details, and several people responded with enough details that we identified and changed a threshold in Bigdaddy to crawl more pages from those sites.

Judging by this it was a wrongly set threshold that was causing the problem. But that could indeed be spin.

gford




msg:705929
 3:45 pm on Mar 23, 2006 (gmt 0)

Yeah I still have two in SH - emailed googleguy as requested. See what happens I guess.

The odd part is that both are being crawled at 1000 hits a day, and the sites aren't that big (<10,000 pages for the big one and <2,000 for the small one).

cgchris99




msg:705930
 4:02 pm on Mar 23, 2006 (gmt 0)

Is there a way to tell how many pages are supplemental and how many are regular indexed?

Is there a google search we can do to tell this?

g1smd




msg:705931
 4:09 pm on Mar 23, 2006 (gmt 0)

You need to try these three searches:

site:domain.com
site:domain.com -inurl:www
site:www.domain.com

to see which pages are listed as www and which are listed as non-www first.

Add &num=100 to the end of the Google search URL to get 100 results per page.

Next, check how far you get before you see the "Repeat search to show omitted results" message, as that gives you a clue that all pages NOT shown are classed as duplicates (just having the same title and/or meta description on multiple pages is enough for them to be hidden in this search).

Finally, click that link and look at all the listings. At 100 results per page, you'll only have 10 pages to look at for each search.

Of course, you need the site to have fewer than 1000 pages for this to be useful.

If the site is larger than that, then add folder names to the search to restrict it, like site:www.domain.com/widgets/ or exclusions using "-inurl", like site:www.domain.com -inurl:widgets.
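The searches described above can be generated rather than typed by hand. The following is only a rough sketch: the function names are made up for illustration, and the search endpoint and its `q` parameter are assumptions (the post itself only mentions adding `&num=100` to the search URL).

```python
from urllib.parse import urlencode

# Assumed search endpoint; the post only says "the Google search URL".
SEARCH_BASE = "https://www.google.com/search"


def site_queries(domain):
    """Return the three site: searches from the post for a given domain."""
    # Work from the bare (non-www) form of the domain.
    bare = domain[4:] if domain.startswith("www.") else domain
    return [
        f"site:{bare}",              # everything Google lists for the domain
        f"site:{bare} -inurl:www",   # pages listed without www
        f"site:www.{bare}",          # pages listed with www
    ]


def restricted_query(domain, folder=None, exclude=None):
    """Narrow a site: search to a folder, or exclude a URL fragment."""
    query = f"site:{domain}"
    if folder:
        query += "/" + folder.strip("/") + "/"
    if exclude:
        query += f" -inurl:{exclude}"
    return query


def search_url(query, num=100):
    """Build a search URL that asks for `num` results per page."""
    return SEARCH_BASE + "?" + urlencode({"q": query, "num": num})


if __name__ == "__main__":
    for q in site_queries("domain.com"):
        print(search_url(q))
```

For a larger site, something like `search_url(restricted_query("www.domain.com", folder="widgets"))` would cover one folder at a time, which keeps each search under the 1000-result limit mentioned above.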

soapystar




msg:705932
 4:29 pm on Mar 23, 2006 (gmt 0)

If you add &num=100, it turns some sites from single-listed to indented. Did it always do that?

avalanche101




msg:705933
 4:48 pm on Mar 23, 2006 (gmt 0)

Hi,
Okay I entered:
site:website.com -inurl:www
and got a list of 104 pages from our site with the supplemental page tag. Thing is, they are all very different from each other.

Our site's dropped in ranking from page 1 to somewhere between pages 5 and 10. Is it because of this supplemental results thingy-me-bob?

ClintFC




msg:705934
 2:30 pm on Mar 23, 2006 (gmt 0)

It seems to me that there is a fundamental bug somewhere in the new Big Daddy infrastructure. For some sites this results in huge numbers of pages being erroneously categorized as supplemental, while for others it results in huge numbers of pages not being indexed at all.

It also seems to me that, thus far, Google remains entirely oblivious to these problems. None of Matt Cutts's feedback, for example, demonstrates any grasp of the actual problems. Instead, his responses seem to go off on a tangent, assuming for example that the problem is simply a matter of sites not being crawled.

From all that I've read, and based on my personal experience, my guess is as follows:

1. All Big Daddy datacentres were kick-started from a December 2005 index.

2. Any pages that pre-date January 2006 are likely to have survived Big Daddy, although many have now been erroneously marked as "supplemental".

3. Any pages that are new since December 2005 are being completely ignored. The crawlers crawl them, but they fail to make it into the index at all.

4. These problems are present on ALL BD datacentres and absent on all non-BD datacentres. The bug is therefore somewhere in the Big Daddy code.

5. Nothing so far has managed to bring these serious issues to the attention of either Google, or the press. Unless this happens, this could theoretically go on for ever.

Armi




msg:705935
 5:18 pm on Mar 23, 2006 (gmt 0)

Matt Cutts wrote:

"pgaz, I’m not trying to minimize that this affects people. But some of this happens in the crawl/index cycle and I can’t force that to run differently. T2DMan, that was the best estimate I would have made at the time. There were some things about the Bigdaddy crawl/index cycle that I wasn’t aware of that made it take longer. I was in a meeting yesterday and re-emphasized that I thought it was important to get more pages from those sites crawled as soon as we could, because I know that this is stressful for the webmasters who were affected.

Steve, the Mozilla Bot is what fetches pages for the Bigdaddy data centers."

Dayo_UK




msg:705936
 5:21 pm on Mar 23, 2006 (gmt 0)

>>>> the Mozilla Bot is what fetches pages for the Bigdaddy data centers

And therefore Big Daddy has a different calculation of BL/PR etc.

Has this been applied to the serps ranking though - I wonder.

[edited by: Dayo_UK at 5:22 pm (utc) on Mar. 23, 2006]

lesjkajski




msg:705937
 5:22 pm on Mar 23, 2006 (gmt 0)

3. Any pages that are new since December 2005 are being completely ignored. The crawlers crawl them, but they fail to make it into the index at all.

This is not true. I started after December and my pages are indexed, but they have dropped to the bottom of the SERPs since 8-3.

4. These problems are present on ALL BD datacentres and absent on all non-BD datacentres. The bug is therefore somewhere in the Big Daddy code.

Rankings lost on all DCs.

5. Nothing so far has managed to bring these serious issues to the attention of either Google, or the press. Unless this happens, this could theoretically go on for ever.

Very true, Matt doesn't realize what is going on.
I heard today from a lot of other people that when they search they have to dig deeper in the SERPs to find what they are looking for (Dutch and English results).
These people don't know SEO and just use the SE; they also said that this was not the case before.

Google is going to lose market share (and we lose income) if they don't take quick action and give webmasters the appropriate info.

marketingmagic




msg:705938
 5:35 pm on Mar 23, 2006 (gmt 0)

Where is Google pulling titles from lately?

We're ranked well for a particular KW but where they are pulling the title from has me completely stumped!

Normally it's the META title, then the description is either from DMOZ, or the META, or from content.

In this case we don't even have the same words on the page that's listed, nor are they in the META tags.

Anyone have any ideas on where else they might be pulling this title from?

dramstore




msg:705939
 5:37 pm on Mar 23, 2006 (gmt 0)

"3. Any pages that are new since December 2005 are being completely ignored. The crawlers crawl them, but they fail to make it into the index at all. "

This seems to be exactly what's happened to me, thousands of pages since December, seems like none or very few in the index now, although heavy crawling.

Also, the keywords for referrals nowadays are very similar to pre-December.

jrs_66




msg:705940
 5:41 pm on Mar 23, 2006 (gmt 0)

After reading this thread for days, I have to throw in my two cents...

In my opinion the BD index is at least as good as it was prior to BD... I don't think the 'press' is going to write about the collapse of Google any time soon.

Also, I have, to date, seen many supplemental pages in my google travels... I can't think of a single one that I thought was put there wrongfully... In my opinion it would serve Google well to send MANY other auto generated 'filler' pages out to pasture.

Has it crossed anyone's mind that maybe this is just another, quite intentional, step in Google's algorithmic evolution? Could it be time to reevaluate your strategy too?

I sincerely hope the days of the 500,000 page two month old site are gone for good...

ClintFC




msg:705941
 5:55 pm on Mar 23, 2006 (gmt 0)

jrs_66

Hmmm. Nice sentiments, but somewhat wide of the mark I'm afraid. According to Google's own statements BD contains no algorithmic tweaks. Instead it is a fundamental change to their infrastructure, new bot, new indexes, etc. Trouble is, their new bot and/or index contains a serious bug that indiscriminately discards perfectly good pages (possibly even all pages that are new since January-ish).

Yes this may well get rid of some Spam, but it will also get rid of all of the quality content too. I guess you are just lucky that you aren't affected (yet). Maybe you should just count your blessings instead of coming over all smug?

Pico_Train




msg:705942
 5:57 pm on Mar 23, 2006 (gmt 0)

g1smd

Empty your inbox!

Trying to send you a sticky and it is full!

obviously very popular, I wish I was popular.

g1smd




msg:705943
 6:05 pm on Mar 23, 2006 (gmt 0)

Yeah, I get a lot of stuff.... all "301 redirect" questions....

Ellio




msg:705944
 6:18 pm on Mar 23, 2006 (gmt 0)

We have lost 429 pages out of 430, all of which pre-date December 2005 (most are years old), so I cannot verify that theory.

soapystar




msg:705945
 6:30 pm on Mar 23, 2006 (gmt 0)

<edit>

g1smd




msg:705946
 6:34 pm on Mar 23, 2006 (gmt 0)

<answered question that has now gone>

wheelie34




msg:705947
 6:38 pm on Mar 23, 2006 (gmt 0)

I have lost about 30% of my pages. When I run site: and go through the results, I see pages that are both old and new. I ran the saturation tool today to see which IP still had the full count; it was still on 216.239.59.99, so that's a non-BD IP? I then did site: on UK Google and went through page by page to see what's missing and what isn't. There are folders in BD that were added after New Year, and some from January 05 seem to be missing, so the date thing doesn't work for me. Old and new are gone?

Will they return? Has anybody's page count started to increase? Mine has gone up by around 5 pages, which isn't even 0.5% of what's missing.

g1smd




msg:705948
 6:52 pm on Mar 23, 2006 (gmt 0)

Even the WebmasterWorld, SEF and IHU forums are suffering from decreased page counts in the index.

I just looked at a search that has been returning about 680 results from just those forums alone, and now it returns only about 250 results.

[google.com...]

jrs_66




msg:705949
 7:15 pm on Mar 23, 2006 (gmt 0)

--- According to Google's own statements BD contains no algorithmic tweaks.

I believe they said no 'major' changes, I don't think most people see any 'major' changes... if any at all

--- Instead it is a fundamental change to their infrastructure, new bot, new indexes, etc.

I guess they would create a new bot and build a new index to have the same results?!?!?!

---(possibly even all pages that are new since January-ish).

You've got to be kidding... 80% of my site has been built since then... all there!

---Yes this may well get rid of some Spam, but it will also get rid of all of the quality content too.

I suppose everyone should expect to see '0 results found' pretty soon! That would be a bad business move...

---I guess you are just lucky that you aren't affected (yet). Maybe you should just count your blessings instead of coming over all smug?

I just try to build quality, useful websites... a strategy that's worked for me for almost 10 years...


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved