Why am I talking about this? Well, Kalman filters have a knob that blends between how much you believe your model vs. how much you believe each new data point. If you tweak the knob all the way in one direction, you always trust the model and any new input just gets ignored. On the other extreme, you can ignore your current estimates about the state of the world, and only trust each new data point as it comes in. If you set the knob too far in that direction, the object you're trying to model jumps all over the place each time you see even a hint of new info.
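For anyone curious what that knob looks like, here's a toy sketch in Python -- a bare-bones 1D Kalman filter with made-up numbers, and obviously nothing Google-specific:

```python
# A bare-bones 1D Kalman filter -- a toy illustration of the "knob,"
# not anything Google-specific. The gain is the knob: near 0 it
# trusts the model and ignores new data; near 1 it chases every new
# data point.

def kalman_step(estimate, variance, measurement,
                process_noise=0.01, measurement_noise=1.0):
    """One predict/update cycle for a constant-state model."""
    # Predict: the model says the state stays put, but uncertainty grows.
    variance += process_noise
    # Update: the gain blends the prior estimate with the new measurement.
    gain = variance / (variance + measurement_noise)  # 0 <= gain <= 1
    estimate += gain * (measurement - estimate)
    variance *= (1 - gain)
    return estimate, variance

# Crank measurement_noise way up and the gain goes to 0: new data is
# ignored. Crank it way down and the gain goes to 1: the estimate
# jumps to every new data point.
est, var = 0.0, 1.0
for obs in (0.9, 1.1, 0.95, 1.05):
    est, var = kalman_step(est, var, obs)
    print(f"estimate={est:.3f} variance={var:.3f}")
```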
Lots of people here are getting more stressed than they need to be--their knobs are turned a little too far toward worrying about the very last thing that happened: "Now my subpage is coming up higher than it should! Okay, now my index page is back and the SERPs look good. Gaaack! Now I'm showing well at DC but the subpage still shows up higher at FI! Too much pressure--I'm going to drink now, and start spamming every FFA I see tomorrow!" :)
If you look around, you'll notice not too many senior members posting here. They chime in every so often, but their knobs are twisted further in the other direction. They know that the index switchover takes a little time to settle, and they have the perspective not to get too worried about things right now, and in general.
I haven't posted much of my take lately, but if I could give advice, it would probably be: don't panic. Here's what I would expect. Probably about one data center per day will get switched to the Esmeralda index. You may see some improvements during the course of the switchover as ingredients get blended in as they're ready. I would expect another round of ingredient-adding after the index is switched over.
So: if you're really into Google-watching as a sport, I would check in once a day to see what data centers have been switched, and maybe to run 2-3 searches. Browse a little while, and then come back the next day. Find something fun to do at night besides poring over every last thing that GoogleGuy (or whoever) posts on WebmasterWorld. You'll feel better, I promise.
This is just my take. You're welcome to ignore it. But I mention it because during this index, I heard about a lot of good and bad searches from webmasters, and the more I dig, the more confident I am that things will turn out well.
Unlike updates of the past, it is simply not useful to analyze the results on any of the data centers yet. What you see today will probably be gone tomorrow.
While I must say that following the data centers is usually entertaining, this time around it's boring and a bit counterproductive. Since new results are showing up on www, it can be profitable to rank well for a day or two, but I'd prefer they just get on with it.
After this is done, there will be plenty to discuss in terms of the huge Google changes. Until then, heavy drinking is recommended for the weekend. :)
Before I even read your post, peterdaly, that is exactly what I saw and thought. I don't think -ex is even close to the new index-to-be, and the duplicate content filter is what it was a few days ago, with mydomain.com coming up instead of www.mydomain.com. On -fi, www2, and www3, 4 of my 5 main kw phrases started showing up with www.mydomain.com now. That's a huge improvement on that issue alone. I can't imagine that the new index is gonna be anything close to -ex. Maybe it's wishful thinking on my part, but the datacenters where my duplicate content issue was resolved so well (www2, www3, -fi) are also where I think we can best examine the new index.
Edited to add that I'm getting 100 more results from
"allinurl:yourdomainname.com site:www.yourdomain.com"
note to self: May be worth playing around with adding and removing www. from the domain name in the above query...
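A quick hypothetical sketch of that note-to-self -- just enumerating the four www/non-www combinations of the query (with "yourdomain.com" as a placeholder):

```python
# Toy snippet for the note above: enumerate the www / non-www
# combinations of that query. "yourdomain.com" is a placeholder.

domain = "yourdomain.com"
forms = (domain, "www." + domain)

for u in forms:
    for s in forms:
        print(f"allinurl:{u} site:{s}")
```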
It really is like waiting for the Godot update. Each new data group that appears has nearly laughable flaws. Still, from what GoogleGuy has said, hopefully those flaws will disappear once the new data spreads out to all the datacenters.
Somehow I suspect that a couple years from now that people are going to look back and laugh about the old days when Google put dung on a stick out to the public before (hopefully) fixing it.
"Further along" means nothing. If they add new pages, but don't add pagerank or spam filters, what the heck difference does that make? The junk added will be dropkicked to oblivion as more sauce gets added.
Aspects of what types of sites and things are doing well on a particular datacenter at the moment can be mildly interesting, but "further along" means less than zero.
This process we are seeing involves some steps forward and some steps back. Those steps back make the datacenter "further along", but they don't represent results that are closer to the ones we will see in several days... after the process GoogleGuy has outlined takes place.
The above should be mandatory reading before posting for now.
I'm generally guilty of enjoying and participating in much of the update chatter, as it USED to be an exciting time. With the SERPs completely changing every day, it seems to be a major waste of time...
I think the risk for Google is that if people sign on to their half of the 'contract': content-content-content, and it is good enough content for others to link-link-link to them, and they are cute enough to apply basic SEO logic to their design, they expect at least some return on their pro-Google philosophy.
If, in a niche, NONE of their competitors come ANYWHERE close to them on any of these criteria... and they are totally jerked around (e.g., dropped to page 10 on their key term) by Dominic, Esmeralda, or whatever...
Well... what are they to think? Answers on a postcard, please.
Actually, I'll tell you what they will think. They will think that the 'contract' with Google is worthless. That the content based philosophy gets you nowhere. That the only way forward is to find weaknesses in the algo and exploit them.
I don't comment on that at all... I simply point out the logic of where this might be heading.
I don't like it in the least, and I don't like that direction (I actually prefer writing my content to cracking algos), but it IS a risk and it is something Google should be taking on board very seriously.
If it is the only route left, everyone is going to take it. The current scenario isn't exactly filling anyone with much confidence that the existing defined 'pro-Google' content approach is going to work in the future (or indeed the present).
The risks therefore are high on all sides. Google may well be in transition in a technological sense. They may be trying to support their philosophy across technical boundaries, which no doubt is extremely challenging.
If that is the case, it is to be supported, as their core values may remain in place, and they may still be attempting to meet their half of the so-called contract.
On the other hand, I may be totally naive and may be being too kind to them (and I am admittedly a long-term Google supporter). They may just be taking every one of us on here for a ride.
However, I'm still prepared to believe the former... because... perhaps... there can be an acceptable face to capitalism. So much about Google is light years ahead of other organizations I have seen (how many other major SEs have staff on here, for example?).
I personally give them the benefit of the doubt for now (but ask them... don't keep stretching it... real people who have supported your philosophy and stance are obviously suffering... just take a look at the posts above). It is becoming clear, though, that the patience of some others is wearing much thinner than mine.
I simply hope for calmer waters in the short term!
I wonder if these have been pointed out before...
1. When you use the operator 'link:domain.com', it will give you exactly the number of backlinks to that URL, not the total number of links for the domain; you will get a different result if you use 'link:www.domain.com'.
2. Each URL has a unique count of backlinks, and this includes the www. form, the /index.html form, and every other form of the URL.
3. I think the closest you can get to the total number of backlinks is by using 'inurl:domain.com -site:domain.com'. I found this displays a more accurate count of 'total' backlinks for the site.
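If it helps, here's a throwaway sketch that just builds the three query forms above for a placeholder domain (the result counts still have to be read off the SERPs yourself):

```python
# Throwaway sketch of the three query forms above, for a placeholder
# domain. The result counts still have to be read off the SERPs.

domain = "domain.com"

queries = [
    f"link:{domain}",                  # backlinks to the bare-domain URL only
    f"link:www.{domain}",              # backlinks to the www form only
    f"inurl:{domain} -site:{domain}",  # off-site pages with the domain in the URL
]

for q in queries:
    print(q)
```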
Cheers
Using that on various datacenters, I believe even more firmly that -fi is closest to the new index. On some brand-new sites I've been optimizing, which do have links, -fi is the only datacenter that shows them. That tells me it is further along than the other datacenters.
You can find even more by using the inurl: format :) In fact, I discovered a few sites that have hijacked my title (with my domain name in it) and descriptions on some of my deep pages. They don't show up if I just use the 'link:' format.
Cheers
WebMistress, at least so far, I agree on all counts about the data centers and progress...
Anon27, yes, we have consistently observed differences between -fi and the other data centers. At any point in time, to steveb's point, they all seem to vary. But generally -fi has continuously looked unique in some respects (meaning the SERPs in some categories), while the others have typically been pretty similar, but not identical. We run a lot of sites, and for many, especially the oldest, we see negligible differences between all the data centers, including -fi.
*But* for our newer sites, we see that some are higher in -fi (and have been so for days), while some are lower in -fi. So knowing whether -fi is ahead of the others or behind would tell me something. Then, when all the filters and factors and knobs are twisted, we may be able to identify some of what happened. Again, we were able to do that with Dominic.
steveb, are you assuming that there is no progression or order to the roll out of data and filters? That may be true and I'm very curious to know if it's true or not.
If, on the other hand, the process is similar in each data center, so that once the index is in they add filter A, then B, etc. ... well, knowing that is very helpful.
Also, while things keep changing, and I agree that that will continue, if -fi is 90% done and -dc is 65% done, and they are both following similar update processes, then it is likely that -fi is closer to reflecting where the thing nets out... and if that's true, then I'm feeling a bit smarter about Esmeralda than I was three days ago. Certainly, last time, one of the data centers was more predictive than the others.
Of course, if all the updating is random, i.e., different filters being added in different orders at different data centers, and -fi has more total work done but the biggest filters aren't in -fi yet while they are at other data centers (which is possible)... then this isn't helping me much. But I'd rather watch and try to learn than assume there is nothing to be gained from observing... it is at times fun too, but that's not my main motive.
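For anyone who wants to do this watching semi-systematically rather than by eyeball, here's a very rough sketch. Big caveats: the per-datacenter hostnames (www-fi.google.com, www-ex.google.com) are the old ones from this era and may stop resolving, the "parsing" is a naive regex, and Google may block scripted requests, so treat it as a back-of-the-envelope tool only:

```python
# Rough sketch of datacenter-watching. Caveats: the per-datacenter
# hostnames are from this era and may not resolve; the "parsing" is a
# naive regex; Google may block scripted queries.

import re
import urllib.parse
import urllib.request

def top_hosts(dc_host, query, n=10):
    """Fetch a SERP from one datacenter and return the first n result hosts."""
    url = f"http://{dc_host}/search?q={urllib.parse.quote(query)}&num={n}"
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    html = urllib.request.urlopen(req).read().decode("latin-1", "replace")
    hosts, seen = [], set()
    for link in re.findall(r'href="(https?://[^"]+)"', html):
        h = urllib.parse.urlparse(link).netloc
        if h and "google" not in h and h not in seen:
            seen.add(h)
            hosts.append(h)
        if len(hosts) == n:
            break
    return hosts

fi = top_hosts("www-fi.google.com", "your keyword phrase")
ex = top_hosts("www-ex.google.com", "your keyword phrase")
union = set(fi) | set(ex)
overlap = len(set(fi) & set(ex)) / max(len(union), 1)
print(f"-fi vs -ex top-10 host overlap: {overlap:.0%}")
```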
Of course. These things aren't all over the place by accident. This -ex data has a lot of poor results that are "fresh" results. Fresh results will always tend to be poor quality, because nothing else in the algorithm is really ranking them besides their newness. Google doesn't even know how many pages from the parent domain are linking to a "fresh" page. Obviously, if a 1,000-page domain links to a fresh page one single time, that is different than if it links to that page 578 times. Google has no way of knowing that immediately.
So, my take is just that -ex has newer data, still without many obvious spam filters, and definitely still without the pagerank addition that GoogleGuy has said will come at some point after all the datacenters are synched.
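To make that concrete, here's a toy score-blending model (entirely made up -- not Google's actual scoring) showing why a fresh page can ride high early and sink once link-based signals get added:

```python
# Toy model only -- not Google's actual scoring. It just illustrates why
# fresh results look shaky until link-based signals are blended back in.

def blended_score(freshness, link_score, link_weight):
    # With link_weight near 0 (before the PageRank "ingredient" is added),
    # newness is nearly all there is to rank a fresh page with.
    return (1 - link_weight) * freshness + link_weight * link_score

fresh_page = (0.9, 0.05)  # very new, almost no known links
old_page   = (0.1, 0.80)  # stale but well linked

for w in (0.0, 0.5, 0.9):  # the link ingredient being blended in
    f = blended_score(*fresh_page, w)
    o = blended_score(*old_page, w)
    print(f"link_weight={w}: fresh={f:.2f} old={o:.2f}")
```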
"Nothing to see here folks, move along, move along..."