The Art of Google Datacenters Watch
Good morning Folks
It takes not only passion but also discipline to observe, analyze and post remarks about the DCs in general or specific DCs in particular. Patience and focus are the name of the game.
Mostly we are seeking predictions about what tomorrow's SERPs might look like.
And as you might have noticed, watching Google datacenters is a very educational process. Take a look at some posts on this thread and you will see important topics such as canonical issues, supplemental issues, 301 redirects etc. explained in detail.
In fact, this thread reflects the wealth of high-quality resources this great WebmasterWorld community has.
Keep those great observations, analysis and remarks coming ;-)
[edited by: tedster at 6:55 am (utc) on Jan. 23, 2006]
What you are seeing matches exactly my take on the current results I watch.
>>> I see an awful lot of pages that do not belong there. However, when you get beyond the pages that do not belong there, I see the ones that do.
Exactly. Granted the first page or two contains big guns, but then this is followed by unrelated results / spam and then 6 or 7 pages down you will see results that should be first page results.
It puzzles me. But here is my theory... Google wants to push crap to the top in order to receive spam reports that will help them clean up the index. However at the same time they want to keep users happy enough that they do not become frustrated with crap results, therefore the big guns on the first / second pages.
Big guns -> Spam -> Relevant results
The number of results returned in that search had been 110 million for a long time, inflated to 200 million late last week, went back to 120 million in recent days, and now shows as 277 million in some DCs (205 million in others). What is going on?
The cache is still out at 18.104.22.168 at least. It is back in many other DCs where it was missing yesterday.
>>But here is my theory... Google wants to push crap to the top in order to receive spam reports that will help them clean up the index.
I agree with this. It makes more sense for Google to let the spam and junk float up, because then they will know if they're right in their assumptions as to what is and what is not spam. Having a group of people submit spam reports and dissatisfied reports is what I believe to be a way of confirming algo changes and index updates.
However, it seems that you and I are the only people who have posted with regard to the December 27 data refresh, which in my opinion may have been run because they wanted to see how Big Daddy fished out the spam etc. with a much larger index.
I personally think it would benefit everyone who has anything to do with Google, whether webmaster or searcher, to have that spam go to page 2, and to let the people who do provide feedback know that is where it is and ask them to submit it there.
Fact is, not many people will even bother to go to page 2 on a search, unless they are hunting for something.
By having the spam on page one, Google may inadvertently be costing some companies thousands and thousands of dollars on a daily, weekly or monthly basis.
In my case, I only do organic SEO and I depend on Google for natural revenue. We do have PPC, and that has been covering the difference for us during this frustrating time, but the simple fact is, if the site is not ranking organically in Google soon, my head's gonna roll.
Don't get me wrong though, if I am doing something wrong, I will admit to it and also recognize it. However, since BD and the DC refresh prior, there is something else going on beyond my control.
Lol - this has been anything but a quick process.
Although I do get your point about the data - I hope that a big crawl/data refresh and PR update follow the roll out relatively quickly.
300m & frakilk
I think what you are seeing is that sites which have been affected by Google problems (hijack, canonical etc.) seem to suffer a downranking effect. This downranking does not really affect the big, big players (kelkoo, amazon etc.) in the same way.
So you have SERPs which show:
- Big Players
- Decent results (Downranked due to problems)
So I think it is not the spam being promoted up as such.
I am not 100% in agreement with that. I say this because there is a lot of spam on page one.
For one particular term and its #1 result, I go to MSN and look at the link operator results, and what do I see? About 20,000 anchor text links from irrelevant forums. The term is not anywhere on the pages carrying those links either. Unless I am misunderstanding the concept of forum spamming, this is a clear-cut case of spam.
However, I can somewhat agree with the rest of what you said, but as far as spam floating to the top goes, in the field I am looking at, it is apparent.
Also keep in mind that for me, this did not start until December 27th. Prior to that, I saw the same results on page one on the small-index DCs and the inflated-index DCs. When I say same results, I mean the results that have historically been there for almost a year.
[edited by: 300m at 3:19 pm (utc) on Jan. 25, 2006]
I'm seeing some significant fluctuations as well. One of my tools produced the graph at the URL below for a key phrase for one of my sites across a large number of datacenters, including the two Big Daddy IPs mentioned on Matt Cutts' blog.
However, even with the spreading results, my quick test of traffic levels, AdSense stats, has not shown any increases. They are sending out payments though, which seems to cause a stat lag.
Yes, but look how old the cache dates are. I see some as old as last June.
Not an update yet. I have a feeling this is stage two of the new infrastructure, with phase three being Bigdaddy spreading to all datacenters, a deep crawl following that, and then... the update.
I think we will know it is happening when the caches update first.
I also note that URLs further from the homepage don't appear with as much detail as before ... and that my rankings are lower than usual.
Strangely, my traffic is actually up, and my analytics show it coming from the same mix as usual.
I'm guessing these are also symptoms of BD changes ... but as the BD DCs differ from each other, and the G SERPs fluctuate from showing BD to not, it's rather hard to be sure.
When y'all say definitively that "BD is showing on [insert location] right now [insert specific time]" ... how do you know? I'm new to the datacenter-watch habit, and for now it's over my head (unless you're referring to similar things as I am).
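One rough way to approach that question, offered only as a sketch and not as how anyone on this thread actually does it: save the top results each datacenter returns for the same query, then group the datacenters whose lists mostly overlap. DCs that land in the same group are probably serving the same data set. The datacenter labels, URLs, the 0.8 threshold and the function names below are all made up for illustration, and the code does no fetching; you would paste in result lists you collected yourself.

```python
def overlap_ratio(results_a, results_b, top_n=10):
    """Share of URLs common to both top_n lists (Jaccard overlap, order ignored)."""
    top_a, top_b = set(results_a[:top_n]), set(results_b[:top_n])
    if not top_a and not top_b:
        return 1.0  # two empty lists count as identical
    return len(top_a & top_b) / len(top_a | top_b)


def group_datacenters(snapshots, threshold=0.8, top_n=10):
    """Greedily group datacenters whose results overlap with a group's first member."""
    groups = []
    for dc, results in snapshots.items():
        for group in groups:
            representative = snapshots[group[0]]
            if overlap_ratio(results, representative, top_n) >= threshold:
                group.append(dc)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([dc])
    return groups


# Hypothetical snapshots: labels and URLs are placeholders, not real data.
snapshots = {
    "dc-a": ["example.com/1", "example.com/2", "example.com/3", "example.com/4"],
    "dc-b": ["example.com/1", "example.com/2", "example.com/3", "example.com/4"],
    "dc-c": ["example.com/9", "example.com/2", "example.com/8", "example.com/7"],
}

print(group_datacenters(snapshots))  # [['dc-a', 'dc-b'], ['dc-c']]
```

Overlap alone won't tell you which group is BD; for that you would still need a known reference, such as the two Big Daddy IPs mentioned on Matt Cutts' blog, to compare against.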