Google June 2003 : Update Esmeralda Part 3

         

GoogleGuy

7:15 am on Jun 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Continued from: [webmasterworld.com...]


Has anyone here ever heard of a Kalman filter? It's a mathematical way of building a model of the world. The math is pretty complex, but basically you try to build a model of the thing you're trying to represent. When you get a new data point, you update your model's estimate about the state of things.

Why am I talking about this? Well, Kalman filters have a knob that blends between how much you believe your model vs. how much you believe each new data point. If you tweak the knob all the way in one direction, you always trust the model and any new input just gets ignored. On the other extreme, you can ignore your current estimates about the state of the world, and only trust each new data point as it comes in. If you set the knob too far in that direction, the object you're trying to model jumps all over the place each time you see even a hint of new info.
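The knob can be sketched in a few lines. This is a minimal one-dimensional Kalman filter, assuming the thing being tracked is a single number (say, a page's position in the results). The function name and the sample readings are hypothetical. In this form the "knob" is the ratio of `process_var` (how much you think the world really moves) to `measurement_var` (how noisy you think each new data point is).

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter: return the estimate after each measurement."""
    x, p = x0, p0                      # current estimate and its uncertainty
    estimates = []
    for z in measurements:
        p = p + process_var            # predict: uncertainty grows over time
        k = p / (p + measurement_var)  # gain: 0 = trust model, 1 = trust data
        x = x + k * (z - x)            # update: move toward the new data point
        p = (1 - k) * p                # uncertainty shrinks after the update
        estimates.append(x)
    return estimates


# Hypothetical noisy rank readings for one page, checked five times.
readings = [10, 30, 8, 28, 12]

# Knob turned toward the model: new readings are almost ignored.
print(kalman_1d(readings, process_var=0.0, measurement_var=1e12, x0=20.0))

# Knob turned toward the data: the estimate jumps to every reading.
print(kalman_1d(readings, process_var=1.0, measurement_var=0.0, x0=20.0))

# Somewhere in between: the estimate moves, but smoothly.
print(kalman_1d(readings, process_var=0.1, measurement_var=5.0, x0=20.0))
```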

Lots of people here are getting more stressed than they need to be--their knobs are turned a little too far toward worrying about the very last thing that happened: "Now my subpage is coming up higher than it should! Okay, now my index page is back and the SERPs look good. Gaaack! Now I'm showing well at DC but the subpage still shows up higher at FI! Too much pressure--I'm going to drink now, and start spamming every FFA I see tomorrow!" :)

If you look around, you'll notice not too many senior members posting here. They chime in every so often, but their knobs are twisted further in the other direction. They know that the index switchover takes a little time to settle, and they have the perspective not to get too worried about things right now, and in general.

I haven't posted much of my take lately, but if I could give advice, it would probably be: don't panic. Here's what I would expect. Probably about one data center per day will get switched to the Esmeralda index. You may see some improvements during the course of the switchover as ingredients get blended in as they're ready. I would expect another round of ingredient-adding after the index is switched over.

So: if you're really into Google-watching as a sport, I would check in once a day to see what data centers have been switched, and maybe to run 2-3 searches. Browse a little while, and then come back the next day. Find something fun to do at night besides poring over every last thing that GoogleGuy (or whoever) posts on WebmasterWorld. You'll feel better, I promise.

This is just my take. You're welcome to ignore it. But I mention it because during this index, I heard about a lot of good and bad searches from webmasters, and the more I dig, the more confident I am that things will turn out well.

Kirby

11:03 pm on Jun 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



<<Probably about one data center per day will get switched to the Esmeralda index. You may see some improvements during the course of the switchover as ingredients get blended in as they're ready. I would expect another round of ingredient-adding after the index is switched over.>>

As I read this, it appears that this 'index' will have ingredients added both during and after, so it should continue to fluctuate for some time. I'm with steveb, this work in progress is ongoing and not worth wasting a weekend watching 9 pots simmer.

illad

11:07 pm on Jun 20, 2003 (gmt 0)



Mfishy:
Here's the quote from GG saying PR is still brewing in the background.

"I just wanted to echo what Brett was saying about PR in flux. I've seen several searches, including a few that Napoleon was kind enough to pass on (thanks Napoleon!), and many of those are still affected by pending PR computation. I wouldn't worry about PR until things settle down a little more; indeed, you might want to wait a few days after people call the index switchover complete before you draw conclusions about what your PR is. Just wanted to chime in with that so that people know not to worry too much. Hope it helps people to know that some PR is still stewing in the back. In particular, rfgdxm1: looks like the search you told me about isn't done brewing, and Napoleon: the first search you mentioned to me also looks like it will settle more (both for the better, in my opinion).

Maybe it's pointless to advise people to bear in mind that results probably will change some (based on the digging that I've done today), but hopefully it will ease a few minds, too. :)"

WebMistress

11:17 pm on Jun 20, 2003 (gmt 0)

10+ Year Member



I just popped back to #2 from nowhere'sville for a month on -ex for one keyword phrase. This is exactly what happened when -fi made some tweaks (for the same exact kw phrase, while not for others). I wonder if they took a base...added it to -fi, tweaked it, and are now repeating that pattern with the other data centers one at a time. Seems odd, based on past indexes that smoothly move over as exact copies when they do move over. I have no clue....I want to stop looking and I can't....do they make meds for this?

WebMistress

11:35 pm on Jun 20, 2003 (gmt 0)

10+ Year Member



John22,

I don't know the answer to your question. But I did a little looking and I see that when I looked at the total for a kw phrase at random as follows, I got:

496,000 -ex
561,000 -fi

Add &filter=0
497,000 -ex
570,000 -fi

Based on these results, it doesn't look like -ex is quite as baked as -fi. It would make more sense with adding &filter=0 to make a 9,000 result difference than a 1,000 result difference.
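The comparison above comes down to simple subtraction. A quick sketch using only the four totals quoted in this post (the dictionary layout and function name are just for illustration):

```python
# Result totals quoted in the post, with and without &filter=0.
counts = {
    "-ex": {"default": 496_000, "filter_0": 497_000},
    "-fi": {"default": 561_000, "filter_0": 570_000},
}


def filter_delta(entry):
    """Extra results that show up only when &filter=0 is added."""
    return entry["filter_0"] - entry["default"]


for dc, entry in counts.items():
    delta = filter_delta(entry)
    share = delta / entry["filter_0"]
    print(f"{dc}: {delta:,} extra results with &filter=0 ({share:.2%} of total)")
```

The gap on -fi is nine times the gap on -ex, which is the basis for guessing that -ex has had less processing applied so far. It is one keyword phrase picked at random, so treat it as a hint, not proof.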

Anon27

12:09 am on Jun 21, 2003 (gmt 0)

10+ Year Member



I do not know, I am not really with everyone else on FI.

FI is still missing www.mydomain.com, but does have mydomain.com and a sub-page is coming up in the SERP's first when the index most certainly should.

EDIT: Actually, none of the data centers have it right, not just FI.

[edited by: Anon27 at 12:22 am (utc) on June 21, 2003]

Critter

12:15 am on Jun 21, 2003 (gmt 0)

10+ Year Member



Sooooo, webmistress, you're possibly warming to the idea of -ex being the newer index?

:)

Peter

Anon27

12:21 am on Jun 21, 2003 (gmt 0)

10+ Year Member



EX is moving over to CW right now. Looks like it might be a fun night.

WebMistress

1:05 am on Jun 21, 2003 (gmt 0)

10+ Year Member



Hey Critter, I hate the idea of -ex being the new index, UNLESS OF COURSE, -ex changes into the same index we're seeing on -fi except better...better enough to make us all happy campers again! Although -fi is AWESOME for me for 4 out of my 5 targeted kw phrases (as in #1 for all of them), I see other sites of people I know that are way down in SERP's for no good reason, except maybe FEB indexed, and I believe they should and will be pulled BACK up to page one when this is over--and it will be a lovely day if I also shine at some lovely place on page one....doesn't need to be #1...just reasonable.

But from what I saw as -fi brought my site back and what I am seeing happening on -ex, it looks like a similar process, so maybe -ex is just a little slower than -fi and that's ok 'cause I got time...heck, we all, obviously, have time if we're hanging out here...lol

Critter

1:09 am on Jun 21, 2003 (gmt 0)

10+ Year Member



I dunno...

It's funny, because all the datacenters (-va, -dc, -ab, -fi) seemed to get the same index, then -ex got this very different index all at once.

Now I'm seeing this same index, along with the -fi index, showing up in -cw. I think that whatever -cw settles on will be the index-to-be.

Peter

WebMistress

1:15 am on Jun 21, 2003 (gmt 0)

10+ Year Member



Critter,

Explain this:
"Now I'm seeing this same index, along with the -fi index, showing up in -cw. I think that whatever -cw settles on will be the index-to-be."

I don't know what you mean by seeing -fi showing up on -cw...do you mean slowly seeping into what -ex brought over to -cw?

Because I am not seeing any of my -fi results on -cw yet. I'm seeing on -cw right now EXACTLY what I saw on -fi 3 days ago. In fact, if I wanted to whine...I could just copy and paste my posts from 4-5 days ago about -fi and they would be an exact match for what I see on -cw. But I don't like to wear out a good whine! :)

Critter

1:16 am on Jun 21, 2003 (gmt 0)

10+ Year Member



No, I mean that the -ex index showed up on -cw...but -cw is flipping between the index on -ex and on -fi. -sj is doing this as well.

Unless -cw and -sj are both doing simple redirection I would assume one or the other of the index versions that -cw and -sj are showing would settle in to be the proper version.

Peter

rfgdxm1

1:19 am on Jun 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



WebMistress, what happened is that what seemed like a totally new index popped up on -ex, and that has now partially propagated to -sj and -cw. Keep clicking refresh on the latter 2 and you should see -ex results popping up at times. My expectation is that the -ex index will spread to even more datacenters. The open question is whether an even newer index than what is on -ex may pop up on one of the other datacenters?

rfgdxm1

1:23 am on Jun 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The reason you are seeing this, Critter, is that neither -sj nor -cw is one big computer; each is a large number of PCs running Linux. There is no way of updating every machine instantaneously. Thus, when a datacenter is updating, until it finishes you'll randomly see either the new or old index.
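A toy simulation of that flipping, assuming each refresh lands on a random machine and that some fraction of the machines already carry the new index. The function names and the 30% figure are made up for illustration; nothing here reflects how Google actually routes requests.

```python
import random


def query_datacenter(fraction_updated, rng):
    """Which index one request sees while a rollout is in progress."""
    return "new" if rng.random() < fraction_updated else "old"


def sample_refreshes(fraction_updated, refreshes, seed=0):
    """Simulate hitting refresh several times against one datacenter."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [query_datacenter(fraction_updated, rng) for _ in range(refreshes)]


# With 30% of machines updated, repeated refreshes mix old and new results.
print(sample_refreshes(fraction_updated=0.3, refreshes=10))
```

As the updated fraction approaches 1.0 the "old" results stop appearing, which is what a finished switchover would look like from the outside.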

Anon27

1:33 am on Jun 21, 2003 (gmt 0)

10+ Year Member



"The open question is whether an even newer index than what is on -ex may pop up on one of the other datacenters?"

I know of three sites that were published for the first time on Thursday, June 12, spidered by Google on Friday the 13th, went live on Google on the 14th, and they are in EX.

I hope this helps.

EDIT, Oh, I almost forgot: Thanks Google!

[edited by: Anon27 at 1:45 am (utc) on June 21, 2003]

steveb

2:05 am on Jun 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I see a lot of brand new pages on -ex that are not on -fi. So clearly -ex is more of what we used to call freshbot results. In other words, -ex almost certainly isn't important in terms of the update. All that new junk should be put in its proper place at some point.

Anon27

2:14 am on Jun 21, 2003 (gmt 0)

10+ Year Member



WebMistress:

Yes, they are in all 9 datacenters and generally appear about the same in the SERP's.

rfgdxm1

2:16 am on Jun 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No steveb. I have been looking at a number of SERPs on -ex where there is almost zero freshbot activity. -ex is most definitely a different index than -fi, and this index came later than -fi.

steveb

2:32 am on Jun 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"-ex is most definitely a different index than -fi, and this index came later than -fi."

Um, that is what I said.

The index has pages made in the past couple weeks. It's new material, as Anon27 previously posted.

<edit... new pages, not fresh versions of old pages>

wackmaster

2:56 am on Jun 21, 2003 (gmt 0)



Well this doesn't fit with my hypotheses about how the data centers evolve, but what it *looks* like to me is that the -fi index is now spreading to -sj, -ab, -zu, and -cw.

Remember that -fi was first to fly with the new index too. Perhaps logical that it would be the lead indicator...

Any takers on that? ( Besides WebMistress ;-) )

rfgdxm1

3:12 am on Jun 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No steveb. I meant the -ex index *without* freshbot activity is a later one than -fi.

Net_Wizard

3:48 am on Jun 21, 2003 (gmt 0)



ex and cw are pretty much on the lead with new filters implemented.

va and fi are pretty close to each other and following ex and cw.

dc, sj, ab, and zu: the index is pretty new but w/o additional filters.

in <- still holding on to the old index.

How's that? :)

johnnydequino

3:58 am on Jun 21, 2003 (gmt 0)

10+ Year Member



-ex and -cw are short on sites. Maybe these data centers have advanced filtering in place?

jd

kevinpate

4:27 am on Jun 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I thought a few days back GoogleGuy made a reference to -sj (this time around anyway) being sorta like a techiesatdaplex sandbox. I took that to mean -sj was a place where they can get all warm and fuzzy and play without worrying too much about the surf crashing down around them.

Oh well, maybe I dreamed it all. It's been a weird week offline as well, so that's certainly possible.

tigger

4:56 am on Jun 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>I thought a few days back GoogleGuy made a reference to -sj (this time around anyway) being sorta like a techiesatdaplex

Yes I thought he did too

Net_Wizard

4:58 am on Jun 21, 2003 (gmt 0)



You are missing your www.mydomain.com but you still have mydomain.com? Correct me if I'm wrong but I don't see anything wrong with it. Ideally, Google would only display either the www or the URL without the www but not both.

As to which one should be displayed, that depends on how you structure your site and how others link to your site.

As to ranking, only Google could give you an exact answer. For instance, if your query is a long phrase you might come up #1 but just remove a single word or even a stop word from your query and everything changes however slight that is.

Cheers

Anon27

5:11 am on Jun 21, 2003 (gmt 0)

10+ Year Member



"You are missing your www.mydomain.com but you still have mydomain.com? Correct me if I'm wrong but I don't see anything wrong with it."

Problem is, mydomain.com does not have any back links, i.e., no PR. www.mydomain.com does.

And why, when I search for my single most important KW, is mydomain.com the site found (way back in the results), but when I do a search for that KW with any other KW, i.e., a KP, www.mydomain.com is the result I receive, and in a respectable position?

Your thoughts?

BTW, all links are anchor with the one KW.

It truly is a #2 site, and was for 8 months until the last update.

gmoney

5:21 am on Jun 21, 2003 (gmt 0)

10+ Year Member



Probably about one data center per day will get switched to the Esmeralda index. – GoogleGuy

I haven’t engaged in the new sport of Google-watching but my stats seem to back up the general idea of GoogleGuy’s statement. I imagine there are other variables involved but, according to my example below, GoogleGuy's statement seems to describe the big picture of what is going on.

I recently launched a new site around the beginning of May. I never saw the deepbot, but the freshbot grabbed every page around the end of May. A number of hours after Brett’s post indicating the start of Esmeralda [webmasterworld.com] I noticed my first couple referrals from Google. In the days since then I have observed the following.

6/16 – I received X referrals from Google/Yahoo/AOL
6/17 – I received 2.8*X referrals from Google/Yahoo/AOL
6/18 – I received 3.5*X referrals from Google/Yahoo/AOL
6/19 – I received 3.8*X referrals from Google/Yahoo/AOL
6/20 – I received 5.2*X referrals from Google/Yahoo/AOL

It kind of seems like Google is adding roughly one of something a day to increase traffic to the site. If there are 9 data centers then perhaps this site will settle at around 9*X referrals a day.
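The arithmetic behind that guess, as a rough sketch. The referral multiples and the figure of 9 data centers come from this post; the assumption that each switched data center adds about one X of traffic is only a guess, and the function names are just for illustration.

```python
# Referral counts from the post, as multiples of X (the 6/16 count).
referrals = [("6/16", 1.0), ("6/17", 2.8), ("6/18", 3.5), ("6/19", 3.8), ("6/20", 5.2)]

DATA_CENTERS = 9  # figure used in the post


def average_daily_gain(series):
    """Average day-over-day increase, in multiples of X."""
    first, last = series[0][1], series[-1][1]
    return (last - first) / (len(series) - 1)


def projected_plateau(per_center=1.0, centers=DATA_CENTERS):
    """Assumption: each switched data center contributes about per_center * X."""
    return per_center * centers


print(f"average gain per day: {average_daily_gain(referrals):.2f} * X")
print(f"projected plateau: {projected_plateau():.1f} * X")
```

Five days of data from one new site is a thin basis for a projection, so the plateau figure is a ballpark at best.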

Edit - Thanks Dolemite and Anon27 for correcting my mistake. I changed April back to the correct month of June so as not to confuse anybody. Google isn't good enough to see into the future yet:)

[edited by: gmoney at 6:03 am (utc) on June 21, 2003]

Dolemite

5:27 am on Jun 21, 2003 (gmt 0)

10+ Year Member



4/16 – I received X referrals from Google/Yahoo/AOL
4/17 – I received 2.8*X referrals from Google/Yahoo/AOL
4/18 – I received 3.5*X referrals from Google/Yahoo/AOL
4/19 – I received 3.8*X referrals from Google/Yahoo/AOL
4/20 – I received 5.2*X referrals from Google/Yahoo/AOL

Awfully good to get such nice April traffic on a site you launched in May. ;)

Anon27

5:30 am on Jun 21, 2003 (gmt 0)

10+ Year Member



Wow, Gmoney, that is great.

Did you mean to post 6/16... vs 4/16... for the dates?

danielm

6:20 am on Jun 21, 2003 (gmt 0)

10+ Year Member



For the fans of -cw and -ex, I think I have some disappointing news. It seems to me that whatever is occurring on there is using index data from February.

Here's my reasoning:

I work for an educational institution of a kind that has one or two counterparts in most major cities in the world. It is known generically by a two word keyphrase (a close example would be "art gallery" although that's not what it is). Specifically, most are known by three or four word keyphrases, e.g. "Uncle's Art Gallery". It is not a very competitive two word keyphrase.

Our situation was such that before my arrival at our workplace, we had a web site that was run by a volunteer on an offsite server in a subfolder under her main domain. I changed things so that we now have our own domain name, own server and far more content on our new site. The old site has had most of its pages removed except for a gutted index page. I don't have control over the site, otherwise I'd place redirects.

However, for the longest time (until Cassandra) the new site was being beaten on both the two word generic and three word specific keyphrases. This is because until mid-Jan, I hadn't spent any time requesting that links be updated. Between mid-Jan and the deepcrawl that Cassandra was built on, I managed to change over several hundred old links and add new links, such that we went from having 50 links compared to the old site's 200, to almost 500 compared to the old site's 80.

What was interesting is that since the two word generic is a subset of the three word specific keyphrase, I could tell how close I was to surpassing the old site for the three word specific in each succeeding index by the relative placement in the SERPs for the two word generic.

I give all this background information, because there is no way any modern search engine index with up to date data would place the old site over our new site in the SERPs. This was reflected in Dominic, which did not incorporate data from that period, and the old site jumped ahead of the new one again.

For Esmeralda, the -fi server is serving what I consider to be respectable results. Roughly, for the two word generic, the new site has gone mid-70s, mid-60s, late-30s, early-30s, 15, mid-60s (Dominic) and 11 in -fi. In -cw and -ex, it is now back to mid-60s again. The old site in the same time has been late-teens, late-teens, early-20s, late-20s, mid-30s, late-teens and late-30s (-fi and -cw roughly the same).

Right now, -fi gives the new site the lead in the three word specific, and on -cw the old site has the lead. By every metric except age of page and age of links, the new site beats the old - so I can't see -cw or -ex being the final (unless something is really messed with Google).

This 395 message thread spans 14 pages.