|Google June 2003 : Update Esmeralda Part 3|
Continued from: [webmasterworld.com...]
Has anyone here ever heard of a Kalman filter? It's a mathematical way of building a model of the world. The math is pretty complex, but basically you try to build a model of the thing you're trying to represent. When you get a new data point, you update your model's estimate about the state of things.
Why am I talking about this? Well, Kalman filters have a knob that blends between how much you believe your model vs. how much you believe each new data point. If you tweak the knob all the way in one direction, you always trust the model and any new input just gets ignored. On the other extreme, you can ignore your current estimates about the state of the world, and only trust each new data point as it comes in. If you set the knob too far in that direction, the object you're trying to model jumps all over the place each time you see even a hint of new info.
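For anyone who wants to see the knob in action, here is a minimal sketch of a one-dimensional Kalman filter tracking a single value. It is only an illustration: the function name `kalman_1d` and its parameters are made up for this example. In a textbook filter the "knob" is not set directly; the gain `k` is computed from how noisy you assume the model (`process_var`) and the measurements (`measurement_var`) to be.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Track a single value with a scalar Kalman filter (illustrative sketch).

    process_var     : assumed variance of the model's drift between steps
    measurement_var : assumed variance of each new data point
    x0, p0          : initial estimate and its variance (hypothetical defaults)
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, but uncertainty grows.
        p = p + process_var
        # Gain: the blend between trusting the model (k near 0)
        # and trusting the new data point (k near 1).
        k = p / (p + measurement_var)
        # Update: move the estimate toward the measurement by a fraction k.
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    noisy = [10.0, -10.0, 10.0, -10.0]
    # Measurements assumed very noisy: the estimate barely moves.
    print(kalman_1d(noisy, process_var=0.0, measurement_var=1e12))
    # Measurements assumed perfect: the estimate jumps to every new point.
    print(kalman_1d(noisy, process_var=1.0, measurement_var=0.0))
```

With a huge `measurement_var` the gain stays near zero and new input is effectively ignored; with `measurement_var` at zero the gain is one and the estimate jumps to each new data point, which is the "jumps all over the place" extreme.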
Lots of people here are getting more stressed than they need to be--their knobs are turned a little too far toward worrying about the very last thing that happened: "Now my subpage is coming up higher than it should! Okay, now my index page is back and the SERPs look good. Gaaack! Now I'm showing well at DC but the subpage still shows up higher at FI! Too much pressure--I'm going to drink now, and start spamming every FFA I see tomorrow!" :)
If you look around, you'll notice not too many senior members posting here. They chime in every so often, but their knobs are twisted further in the other direction. They know that the index switchover takes a little time to settle, and they have the perspective not to get too worried about things right now, and in general.
I haven't posted much of my take lately, but if I could give advice, it would probably be: don't panic. Here's what I would expect. Probably about one data center per day will get switched to the Esmeralda index. You may see some improvements during the course of the switchover as ingredients get blended in as they're ready. I would expect another round of ingredient-adding after the index is switched over.
So: if you're really into Google-watching as a sport, I would check in once a day to see what data centers have been switched, and maybe to run 2-3 searches. Browse a little while, and then come back the next day. Find something fun to do at night besides poring over every last thing that GoogleGuy (or whoever) posts on WebmasterWorld. You'll feel better, I promise.
This is just my take. You're welcome to ignore it. But I mention it because during this index, I heard about a lot of good and bad searches from webmasters, and the more I dig, the more confident I am that things will turn out well.
Tried that (jumping in it). But then the datacenters decided it wasn't there any more :-)
Same thing happens to my sites, one minute they are doing well (in top 5) then they disappear (not out of the index but certainly out of the SERPS). This has been happening since update Dominic.
I've also just noticed that some of my sites will do well whilst the others don't and vice versa...sounds like a similar situation to yours.
I am quietly confident that my 2 "love of my life" sites will end up #1, or at least high, for their respective (different) main keywords because:
1) They are both #1 in www2 right now, but not www.
2) www2 shows many more "found xxx sites" than www for Site A's search term - so I presume and hope that www has to catch up with the number of sites.
3) Strangely, www shows more "found xxx sites" than www2 for my Site B's search term - BUT, when you find my Site B, www has the old cache of it (www2 has the newest). So I hope www has to catch up with its HTML and other cache.
Anyway, I have hope. And I have been watching the dance and reckon I have found some lovely guidelines as to what Google want to see and by what balance.
I am also hopeful Google are getting this one right, as the sites competing with my Site A that link to their link pages via a hidden CSS "+" at the bottom of the page are drowning a bit on www2 right now.
Hey GG, getting back to the whole "knob" thing - you're absolutely right - the google index and update are just too "huge" to monitor 24 hours a day and realistically draw any conclusions from.
I was looking good in FI all day yesterday, today's definitely another day - not so good.
I feel like I've had this mass google hysteria for the last 30 days! Definitely time to chill out. Good post.
Did GoogleGuy really mean webmasters playing with their knobs, or Google?...because it makes more sense if it's the Google knob we see being tweaked, as we see large fluctuations while data is pulled in over a longer period of time than we are used to, continuing even after the update...the idea being that it is Google and not us playing with the knobs....for each small amount of data that gets added, a large change is seen during the update while the Google knob is flat to the board...then Google says "right, that looks good" and turns the knob the other way...at that point small data changes make hardly any difference to the SERPs and you are stuck where you were...or maybe we just see a settling of changes due to new data and more changes due to algo tweaks.....
I see big changes from -sj to va. Taking my own site, I'm placed well for keyword keyword, but on -va it's not found at all. Very strange. The difference between those two is too big, I think, and which counts for the most, -sj or -va? Well, who knows.
yes this is very cool - I wonder how long before things will settle then go live? I am already seeing some kind of new index not sure which dc it is?
soapystar, GoogleGuy's "knob" was a metaphor for webmaster behaviour. To read it as a metaphor for "algo tweaks" would be a little too far-fetched IMO.
[edited by: Giacomo at 11:51 am (utc) on June 19, 2003]
I think GG meant that the more experienced you become, the more you understand that playing with our knobs on a daily basis can be unhealthy - unless of course you work at Google, where there is a large amount of knob tweaking happening on a daily basis in their never-ending efforts to achieve perfect SERPS. :)
Hey Zeus -
|I see big changes from -sj to va |
Thought I'd mention:
GG mentioned that -sj is a bit of a 'google sandbox'.
And (this one he didn't mention) -sj isn't a 'live' datacenter, i.e. it's not hooked up to www.google...
(all that subject to change whenever in the future, but I have noticed that things have been like this since Cassandra)
just an fyi
mipapage, thanks. What server do you look at to see the future rankings? Before it was www2/www3, then sj, but which one do you use now, maybe fi?
zeus - I think www2 still has the latest index; it certainly has the most results found out of all the dc's.
Definitely some major changes in the SERPs I watch compared to when I went to bed last night, including one of my sites shooting up to #3 for a key single-word search term. Higher than I would have ever expected. And, on another search term, the main site has also hit #1 and #2, and for the right pages. Before, only a minor page was at the bottom of the page 1 SERP. Things really are dancing more this month than usual.
Zeus - (trying to be super clear below to avoid confusion)
Right now you can look at either of -fi, www2, or www3 for the latest state of the still-as-yet-not-finished-updating serps.
You can also check out datacenters that have the new data (-dc, -zu for example), but for simplicity's sake, use the above.
The data from -fi is being copied over to the other datacenters, but this data is still subject to change, as GoogleGuy has mentioned.
Fwiw... -fi, www2, and www3 are all the same place right now.
<<Things really are dancing more this month than usual. >>
Definitely. I wouldn't get too excited about anything seen yet, as there still seem to be some missing pieces. GG even said to expect changes after all the datacenters have the new index.
If -zu is showing the same as fi, is there more to come or is that it?
Has anyone else had their PR change today? Mine went down one since yesterday. I went to a PR2 but I still show up in other people's backlinks like it was a 3 or 4.
Sometime last night (after 11pm est) the fi index changed. I just started a topic on it, because there are some odd things with the change, hopefully my post will be approved....
I seem to have, but I dunno how, accidentally uploaded an index page from a different site than I meant to, for a few hours (while Google last refreshed that page), and now it's stuck - do you think that's it - or will Google go round again? :(
The strange thing is - the cached page is *not* the one the title/description tags showing are from - I thought these always matched?
mipapage, thanks again. That means -fi and www2 are the places to look for how the update is going.
zeus, www2 and www3, at least in this update, have been pointing to -fi, so you see identical results. -fi was identified as the site that first got the new index. -fi is one of nine data centers that Google users are exposed to, depending upon their geo location and the load on the various data centers.
the index is being slowly distributed to the other data centers. meanwhile, changes also keep happening at each data center, presumably as more filters and other data are blended in. there is some reason to believe, I guess, that since -fi got the index first, it may continue to be most reflective of how the update is progressing, but I'm not sure about this last point.
and as others have reminded us, GG has indicated that not only might we expect to see the index evolve at each data center until this update is complete, but also, new attributes and/or data may be expected to be stirred in shortly after the update is done...so even after the update is done, it's not *really* done...or is it...hmmm.
strap yourself in, enjoy the ride. it's a brave new world ;-)
"when to expect a "true" deep-crawl next"
thought he already said we might never see it again. Freshbot can do the same job.
Soapy, I realize it will come from deepfreshbot, but even though I've been getting thousands of pages spidered, new links are for the most part, like 90% not been crawled for 2 months. I've spoken to others who have the same experience. New sites may have gotten spidered in that time, but old sites with new content...I haven't heard anyone in that situation who felt there was a deep crawl from fresh or deep bot in that time. I could be wrong...
To recap things for everybody affected by the disappearance of their index pages from the SERPS.
This characteristic first emerged during dominic.
GoogleGuy acknowledged its existence with this comment in the Q&A:
June 12th - Q&A
Q: For many sites, the index page seems to be buried on search terms for which logic determines they should rank highly. Is this a transient feature, like some of the other recent issues, resulting from the changeover to newer data? Or is it due to a more fundamental algorithmic change?
A: I don't think it's a fundamental algorithmic change. I don't recall hearing about any changes that would bring about long-term behavior like this. I'm pretty sure that it's more of a transient issue, and I wouldn't be concerned about this.
Esmeralda arrives and this same characteristic is still present, again acknowledged by GoogleGuy:
June 15th - Esmeralda Thread
I did notice the tendency to return subpages rather than index pages for some queries. I'll check around with a few more folks than I did last time to find out if that characteristic will change over time.
I am not asking for counter-claims that this is not occurring or is due to domain.com or www.domain.com issues; this issue was and still is a major part of Google's SERPS for the past few weeks.
GG - I fully appreciate the time you take to participate in this forum and think that if you were able to shed any more light on this issue, it would ease a lot of people's tension around here, on the basis that A> It will sort itself out - so all chill out, or B> These characteristics are lasting, so let's all move on and not keep re-covering old ground.
I personally feel B is the more likely.
|Soapy, I realize it will come from deepfreshbot, but even though I've been getting thousands of pages spidered, new links are for the most part, like 90% not been crawled for 2 months. |
Deepfreshbot seems to do more of a depth-first spidering, still focusing on new links, as opposed to deepbot's more depth/breadth-balanced spidering.
In the process, it misses lots of updated pages that already exist in the index.
I found that deepfreshbot - more like freshdeepbot to me - did spider pages off minimum PR2 pages and added them to the index like freshbot used to. But from pages lower than PR2 it did not follow the links. And hence could not add them to the index.
I think that the PR after Esmeralda will determine which other links get indexed, and that the auto-index-and-follow threshold is somewhere around a PR2 for each page.
Then again I might be wrong.
wackmaster, thanks. It was a little easier to see in the old days when we just looked at www2 and www3, but OK, they still count, plus fi, and I think you are right that fi is the place where the update first happens with new results.
either you are wrong or my sites are an exception....
let's hope for the best
<<new links are for the most part, like 90% not been crawled for 2 months>>
Backlinks that were placed for one of my websites over two months ago finally showed up in the index last week. They probably would have showed up at the beginning of May if it weren't for that weird update. From my end, things look like they are back to normal.
Here yesterday, gone today!
I think I'll take that walk around the lake now!
Someone call me when it's over!
UK_Web_Guy, I think some people have already mentioned that for the searches where they noticed subpages, index pages are starting to come back again. So I think it's begun to some degree already.
Note: I am not making any promise/statement about the long-term ratio of index pages to subpages. ;) I'm just noting that several people, after waiting for a day or two, notice that things seem more normal or like what they expected.
As pardo said, if you grab a helicopter and get a broader view, you'll feel better and learn more than watching every bump or rise from grass-eye level. I still expect data centers to switch at a one-a-day rate, which is our typical switchover speed (Dominic took a few days longer, as some people recall. :)
[edited by: GoogleGuy at 3:51 pm (utc) on June 19, 2003]