Forum Moderators: Robert Charlton & goodroi
Now that the new BigDaddy infrastructure has been rolled out, it's about time to make some assessments (they could be damage assessments for some publishers).
Though Google's data centers are in everflux (or possibly some kind of update), signs of improvement or deterioration should by now be becoming reasonably clear.
To judge whether the new infrastructure is a success or a failure, we might use the following checklist and see whether the items have been met yet, or can be expected to be met very soon:
- improve search quality (otherwise, why should it be there at all)
- reduce spam to a large extent
- fix the canonical issue
- possibly deal with the supplemental issue
- more accurate indexing of sites (again, otherwise why should the new infrastructure be there at all)
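For anyone new to the "canonical issue" in the list above: Google could treat the bare domain, the www host, and the index page as three separate URLs, splitting link credit between duplicates. A rough sketch of the kind of URL normalization involved (the hostnames and rules here are illustrative assumptions, not Google's actual logic):

```python
from urllib.parse import urlsplit, urlunsplit

def canonicalize(url: str) -> str:
    """Reduce common duplicate-URL variants to one canonical form.
    Illustrative only -- a real search engine uses far more signals."""
    scheme, host, path, query, _ = urlsplit(url)
    host = host.lower()
    # Treat the bare domain and the www host as the same site.
    if not host.startswith("www."):
        host = "www." + host
    # An index page and its directory are the same resource.
    for index in ("/index.html", "/index.htm"):
        if path.endswith(index):
            path = path[: -len(index)] + "/"
    if not path:
        path = "/"
    return urlunsplit((scheme.lower(), host, path, query, ""))

# All three variants collapse to a single URL:
for u in ("http://example.com/index.html",
          "http://WWW.example.com/",
          "http://www.example.com"):
    print(canonicalize(u))  # -> http://www.example.com/
```

Until an engine does this reliably (or the webmaster 301-redirects the variants), each form can accumulate its own PageRank and its own copy in the index.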
Thanks for feedback.
One can expect that it will take a lot more time for the tools to be set up and refined. Their implementation plan is their business secret and we are surely not in any position to judge how they stand on that plan.
I'm sure it will eventually play some sort of role in some aspect of getting a site to rank well organically in Google; but so far it hasn't affected us, so we're not in any position to comment on whether it has performed as designed.
I understand that this data refresh isn't Big Daddy, but Google really put a lot of honest white hat webmasters out of business in my sector.
Big Daddy may not be a failure in the long run but the preparation for it is.
Pure speculation on my part: this change at Google could have been an upgrade of the servers, of the OS running on them, or a combination of the two. It could be that G upgraded their boxes and the OS (operating system) to take advantage of the processing capabilities of today's processors.
I personally have not seen much difference before and after BD in my niche. I have noticed a slowdown of G-bot visits and indexing on my sites; IMO this is a function of the G-bot itself and not a result of the infrastructure change (unless they classify G-bot as part of the infrastructure feeding the servers that run the algos).
I have read MANY complaints of pages being dropped from the index, but people have complained of this issue for quite some time; it just seems a bit more frequent right now.
I do not believe that the BD infrastructure change itself was meant to fix any of the problems listed in the opening comment; rather, it was implemented to create a platform that will, in the future, allow changes to applications addressing some of those problems.
Back to Watching
WW_Watcher
"The opening post in this thread is a classic example of the lack of understanding of what infrastructure is as compared to the tools that will be able to be used on that infrastructure."
No lack of understanding, I guess :-)
In fact, the items in the checklist in my first post are just about the same as those our good friend at the 'plex, Matt Cutts, included in his post of 4th January 2006 asking for feedback. Matt also did a great job of explaining a few aspects of the new infrastructure in that post, "Feedback on Bigdaddy data center", and of course mentioned more about the new infrastructure in later related posts.
Around three months after that date, it wouldn't be wrong at all to do some assessment, in an effort to see at least whether there are signs that things are moving toward the new infrastructure's declared goals.
Please don't get me wrong. I'm not inviting negative feedback, rather some assessment of the signs of improvement or deterioration.
============================================
Feedback on Bigdaddy data center
January 4, 2006 @ 12:44 pm · Filed under Google/SEO
=======================================
........
Q: What’s new and different in Bigdaddy?
A: It has some new infrastructure, not just better algorithms or different data. Most of the changes are under the hood, enough so that an average user might not even notice any difference in this iteration.
Q: I noticed some ranking changes across all data centers. Was that Bigdaddy?
A: Probably not. There was a completely unrelated data refresh that went live at every data center on December 27th. Bigdaddy is only live at 66.249.93.104 and 64.233.179.104 right now.
Q: Is there specific types of feedback that you want?
A: We’d like to get general quality feedback. For example, this data center lays the groundwork for better canonicalization, although most of that will follow down the road. But some improvements are already visible with site: searches. The site: operator now returns more intuitive results (this is actually live at all data centers now).
Q: What else can you tell me about Bigdaddy?
A: In my opinion, this data center improves in several of the ways that you’d measure a search engine. But for now, the main feedback we’re looking for is just general quality and canonicalization.
Okay, now let’s get to the meat of this post: how to give us feedback on Bigdaddy. I’d be delighted to get webspam feedback, but I’m most interested in hearing feedback about canonicalization, redirects, duplicate urls, www vs. non-www, and similar issues. Before you send in a report, please read my previous posts on url canonicalization, the inurl operator, and 302 redirects. Now here’s where to send feedback:
......
[mattcutts.com...]
=========================
"lays the groundwork for better canonicalization, although most of that will follow down the road."
Not seeing that, yet. Further on down the road, I guess.
"The site: operator now returns more intuitive results (this is actually live at all data centers now)."
Not sure what Matt was pointing to here. Page counts are still inflated many times -- and I can't find duplicate url problems in every case, either. Supplemental results show in the first iteration, without clicking on "omitted results" which is what I would expect (though I'm usually happy for the hint this offers).
For the record, I just wish to list here some of the other recent threads which might reflect the success or failure of the new infrastructure so far.
Search for discontinued product - 50% bad pages
[webmasterworld.com...]
Deteriorating google Search
[webmasterworld.com...]
Googlebot isn't crawling
[webmasterworld.com...]
Pages dropping out of the index - in two months time it will be 0
[webmasterworld.com...]
Pages Dropping Out of Big Daddy Index
[webmasterworld.com...]
Very Odd - Big Drops in SERPS Today April 26, 06
[webmasterworld.com...]
dropped pages returning
[webmasterworld.com...]
Google PR Update - some unexpectedly high/low PR is being reported
[webmasterworld.com...]
Google algo moves away from links, towards traffic patterns
[webmasterworld.com...]
Matt Cutts Confirms Random Google's Serps... Google's Democracy!
[webmasterworld.com...]
I hope this helps.
But if we look at the current number of postings regarding Google's lack of crawling, losing data, or dropping it, whatever you want to call it, I'd say that so far this has been a complete failure. Like most, though, I'm hoping that now that Google has its new toy in place, it will start tweaking it a bit, and these fine tweaks will hopefully resolve the problems within the SERPs.
Time will tell if it's a success or a failure, but that time is not now. Given the rate at which things have moved with Google in the past, I don't really think we can have that conversation for at least another 3 months, maybe even longer.
The problems that come with scale can be enormous.
"how can we discuss if this has been a success or failure, when in theory it's still ongoing."
In practice, and according to Matt, the switchover to the new BigDaddy infrastructure should have been finished around the end of last month, March. Accordingly, we might as well just talk about Google's new infrastructure from now on.
Matt also talked about improvements to core search quality (smarter redirect handling, improved canonicalization, etc.) which are expected to land on the new infrastructure in the months after the switchover is completed. April should therefore be one of the months where we start witnessing improvements. Hence this thread.
Btw, any improvement you wish to report ;-)
I truly believe that google is trying everything they can to keep this secret until they can recover from this fiasco.
Agree 100%! Why does Matt ignore all the posts here and on his own blog complaining about drops? Because he knows it's a Google mistake.
BigDaddy was a BigMistake... I'm sure we have more spam than ever now... SERPs are really crazy!
Matt is but one small cog in a big wheel, and as much as he may want to share more info, he is restricted by the Google machine.
End result, as in any similar situation, is that people speculate, place blame, and make assumptions, because of their frustration at not getting anywhere with the facts.
As for the success or failure of BD, who knows? We certainly don't, as it is too early to say. But yes, some people see early changes that they don't like, and it's a failure. Others see early changes that they like, and it's a success. One thing is for sure: the world is round!
"lays the groundwork for better canonicalization, although most of that will follow down the road. Not seeing that, yet. Further on down the road, I guess."
Very very long road it seems.
You are right. Matt has been silent about reporting progress of the new infrastructure.
He is just playing it cool I guess.
I think he probably feels that if he lets certain things out and they dont materialize in quick order then it is probably best to keep quiet until there is something concrete to comment on.
As for my little niche: words like "netto" and "brutto" still rank fairly high in my Google Sitemaps site analysis. Since these words are quite common on a B2B commercial site, I think there's still a lot for Google to do in that area.
I think it's too early to say anything yet.
1. In my niche, some of the spammers that really annoyed me have been wiped out. But it also hit some good sites that may have been detected as spammers but aren't. Some improvements should follow soon.
2. Big Daddy raised most of the price-comparison sites to the top. Very annoying.
3. Most of the pages of those sites reside in supplemental hell.
4. Many duplicate-content pages are showing up in the top twenty of the SERPs.
5. G* has thousands of 404 pages listed for my domains, even though they were deleted 2 years ago.
6. Big Daddy SERPs are being taken in by keyword stuffing.
Result: I think there is a light at the end of this update tunnel, but there have to be some improvements, mostly in that G* should look at the white hat sites that have been punished too heavily. Maybe it's tweaking time for them now. At the moment Yahoo and MSN have their noses a little bit ahead of Google. We will see how long that lasts.
At least there's the new sitemap feature. It's nice, but it's not doing justice to all the webmasters.
OMO LuckyGuy
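On the 404 point above: long-deleted URLs tend to linger in an index when the server answers 404 ("not found, perhaps temporarily") rather than 410 ("gone for good"), which engines can treat as a stronger removal hint. A webmaster auditing crawl results might sort old URLs by response code with a sketch like this (the URLs and the suggested action for each code are illustrative assumptions, not documented Google behavior):

```python
def cleanup_action(status: int) -> str:
    """Suggest a cleanup step for an indexed URL, given its HTTP status."""
    if status == 200:
        return "keep"
    if status in (301, 302):
        return "check redirect target"
    if status == 404:
        return "consider serving 410 or filing a URL removal request"
    if status == 410:
        return "ok: engine should drop it"
    return "investigate"

# Statuses as a crawler might report them (hypothetical URLs):
audit = {
    "http://example.com/old-product.html": 404,
    "http://example.com/moved-page.html": 301,
    "http://example.com/live-page.html": 200,
}
for url, status in audit.items():
    print(url, "->", cleanup_action(status))
```

The point is simply that two years of 404s in the index is as much a signal-to-the-engine problem as an engine bug: the server never said the pages were permanently gone.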
"He is just playing it cool I guess."
The whole aura surrounding Google is one of secrecy and not providing the competition with information, rather like Coca Cola has done such a fine job all these years in keeping the ingredients to their drink secret.
"Matt is but one small cog in a big wheel, and as much as he may want to share more info, he is restricted by the Google machine."
This has nothing to do with trade secrets or reporting progress or non-progress. This has everything to do with protecting stock prices and saving face. Their silence about their search problems is deafening. Being a public company is all about damage control and their history of "updates" and attempts at "spam-control" help cover one of the biggest problems Google has ever had. They don't want the general public to perceive that the giant has fallen. It would cause them to lose ground as the leader and could indeed signal the end of Google's dominance.
Billions of dollars are at stake here, and Google will never admit their problems. Meanwhile, millions of websites are being affected: some helped, and some seriously damaged. Most of their owners don't participate in forums, and we will never hear from them.
What I would like to see is everyone who has damaged sites chiming in to say so, and reporting how many of the sites they manage have been damaged.
Me: three sites, all belonging to clients with small businesses. All white hat and straightforward sites.
They may have flat-out lost a major chunk of data and can't get it back. I think they had a major meltdown.
Wow. Who would have thought that a multi billion dollar company like Google would be so silly as to change over to a whole new infrastructure without keeping a backup of the older results? ;)
LOL ... maybe they can get an assist from the WayBack Machine!
Sorry, but I disagree with your synopsis. It's all well and good to talk about shareholders, profits, etc., but where would Google be if it gave away all its projected ideas and plans?
The truth is that those who have had problems these last few months want to blame them on Google. Why on earth should Google take the blame head-on, when we have the choice to either find alternative sources or get out?
Nope, life isn't fair sometimes, but blaming a 3rd party sure ain't gonna solve it.
This has everything to do with protecting stock prices and saving face....They don't want the general public to perceive that the giant has fallen.
Since when does the general public read SEO forums or the Matt Cutts blog?
The general public judge Google by whether they can find what they're looking for in the SERPs. Whether Google chooses to give a running commentary to SEOs and Webmasters has nothing to do with how users or investors perceive Google.
I think they just need to do some extra spidering, and then run their algo's on the new infrastructure, and it will be all (?) good...
... at least that's what I hope :-)
I love Google and have great respect for both Matt and GG. In that spirit I'm writing this quick comment.
"The general public judge Google by whether they can find what they're looking for in the SERPs."
Exactly. But if, for example, the current tendency of deindexing big portions of sites, or dropping entire sites for no obvious reason, continues, would the public still have the same choice and diversity in what they are looking for? Would the public benefit from sites where only the homepages, or very small portions of them, are indexed?
Allow me to illustrate that one. Imagine a library where several of the books on the shelves contain only the cover or a few pages each. How would you rate the quality of such a library?
Now imagine a library that either changes its index and classification of books every day, or just displays books on the shelves at random. How would you rate the quality of such a library?
Granted, it's true that some searches don't yield many results at the best of times, and the user who's searching on "McGowan's Eyelid Syndrome" (an extremely rare, fictitious disease) needs every result he can get. But if he doesn't find a comprehensive Mayo Clinic article on McGowan's Eyelid Syndrome in a Google search because it's missing from the index, he'll probably assume there's a shortage of informative Web pages about McGowan's Eyelid Syndrome--not that Google has screwed up and dropped Mayo Clinic's pages about eyelid diseases from the index.