| 10:17 am on May 14, 2003 (gmt 0)|
Thanks albert. One very recent post by GG I found the most significant of all - perhaps in the top 10 of all his hundreds of posts.
Basically he said that each data center is meant to do different things, and that sj was specifically meant to be good at finding TOPICAL pages (like those relating to SARS).
I think it's significant because here we all assume that playing with indexes has mainly to do with spam-killing or some form of "relevance".
Here GG is saying that at least sometimes there is more than one way to construct an index/algo, and it depends on the customer's motivation! So assuming that there is ONE best index, which we do here a lot, is wrong.
Not sure where this is going but it will be very interesting. Diff indexes for diff partners? Or a multitude of things I'm not even going to mention yet!
Let's see what pans out in the next few months...
| 10:24 am on May 14, 2003 (gmt 0)|
It would be nice if you quote that post you mentioned, or just link it.
| 10:50 am on May 14, 2003 (gmt 0)|
Excellent summary Albert...
I have been monitoring several dozen websites, and I quickly noted that SJ was [at first] bumping up the SERPs on my sites it shouldn't have (new, smaller sites), and lowering my stronger ones.
After noticing GG's post that we would probably "see SJ show up at other data centers first, and then to start applying the newer data/filters after that," I came to the conclusion that it was too early to get nervous, and wait it out.
In my case (overall), I can see things slowly improving, but I'm a little concerned that the quality of G's search results has been highly questionable for quite some time.
Fortunately, I have been buried with projects for the past several weeks and haven't had much time for analysis of what seems to be a very looooong and frustrating process.
Other than reading tea leaves - or scanning a 'gazillion' posts - the only thing we have to go by is GG's input as an indication for the future, so your summary is much appreciated.
| 1:04 pm on May 14, 2003 (gmt 0)|
Really helpful. Though I had read what he said, I hadn't realised he was saying the same thing all the time until now.
It does seem weird what they are doing though. Putting sj's historical results with new algorithm on the partners, and continuing to update and add backlinks on google....
I mean, if it's so easy to apply backlinks and spam filters, why not do it? Why bother to send out old results with new algorithm to the partners? Why update partners and not Google?
Just seems strange to me.
| 2:34 pm on May 14, 2003 (gmt 0)|
>>It does seem weird what they are doing though. Putting sj's historical results with new algorithm on the partners, and continuing to update and add backlinks on google....
I mean, if it's so easy to apply backlinks and spam filters, why not do it? Why bother to send out old results with new algorithm to the partners?<<
This, more than anything, is what confuses me. I would love to hear some clarification on why they chose to use February backlink data and why it takes so much longer than usual to add backlinks and recent crawl data.
| 3:26 pm on May 14, 2003 (gmt 0)|
Perhaps you could edit your original post and put this at the end (GoogleGuy's latest musing):-
|I wouldn't draw huge distinctions between sj and fi. When I say "I expect the sj index to spread to other data centers," that could be sj or fi. |
| 3:32 pm on May 14, 2003 (gmt 0)|
There's a lot of backlinks on the web. :) It will take some time to bring them all in. To clarify a couple things:
- it helps to think of sj and fi as similar. It's better conceptually to think of them as cut from the same cloth.
- chiyo, every data center has different machine characteristics. So similar/identical indices might look slightly different at different data centers. This goes back to the point above. Don't think of it as if we build a different index with a different theme for different customers. Our partners get the same scoring/data that we use. That said, the global index that we build can emphasize different things more, such as topicality or more diverse file types.
So sj/fi are different in several ways. I would expect that difference to spread to other data centers. Then things will resume moving forward.
Hope that helps,
| 3:32 pm on May 14, 2003 (gmt 0)|
Thanks - it's better these get appended to the original message.
PS I suggest you put the above up into your top message too - I'd be really interested to read these GG responses in one list.... not that I expect it to become any clearer than the mud it currently is!
| 4:01 pm on May 14, 2003 (gmt 0)|
GoogleGuy on May 5th:-
"backlinks are the sort of data that Google could bring back in over a relatively short time frame."
GoogleGuy on May 14th:-
"There's a lot of backlinks on the web. :) It will take some time to bring them all in."
GoogleGuy on May 5th:-
"albert, what you said, except I wouldn't be surprised to see SJ show up at other data centers first, and then to start applying the newer data/filters after that."
GoogleGuy on May 14th:-
"I wouldn't draw huge distinctions between sj and fi. When I say "I expect the sj index to spread to other data centers," that could be sj or fi."
"So sj/fi are different in several ways. I would expect that difference to spread to other data centers. Then things will resume moving forward."
I'm beginning to wonder if the posting of yesterday that suggested this may be a monumental c*ck-up by google, and they are now trying to pick up the pieces, is true.
I'm not having a go at google if that is the case. It happens. We've all experienced computer crashes, data loss etc etc - some of us (me included) on a large scale.
But what's the point of feeding people on here confused and contradictory information?
I do feel very sorry for those with sites who rely on google for traffic. It's not necessarily their fault that is the case - google created this monster, they know people rely on it, and they have a moral duty to control it.
[edited by: trillianjedi at 4:36 pm (utc) on May 14, 2003]
| 4:14 pm on May 14, 2003 (gmt 0)|
Chiyo / Googleguy,
Are you guys saying that Google is emphasising more "topical" or theme-based results now? I guess you won't answer GG but I'd like to know other people's opinion as far as Google identifying themes within link patterns and if that has anything to do with their new algo.
Chiyo - I would appreciate it if you could tell me where GG said the emphasis of SJ is more topical results, I can't seem to find it.
| 4:17 pm on May 14, 2003 (gmt 0)|
Thanks for your posts, especially GoogleGuy.
Concerning update of my opening post:
There's no 'owner edit' available in this thread ... hmmm.
If we keep it updated by posting only relevant quotes it might be a good overview nonetheless.
| 4:28 pm on May 14, 2003 (gmt 0)|
Yeah sorry, I'm probably just in a bad mood.
Too much coffee and too many postings of "I'm seeing -sj results on www - the dance has started!".
On that one, can someone please post something in the FAQ on this site about google's DNS?
<Rant is over>
| 4:37 pm on May 14, 2003 (gmt 0)|
Just a couple of thank yous here.
Thank you Albert for filtering out the good GG posts, and thank you GG for being so very polite and professional when dealing with the mad rants of confused webmasters.
| 4:37 pm on May 14, 2003 (gmt 0)|
trillianjedi, I'm saying that sj and fi are of a different nature than previous indices, and that I expect that different nature to spread to more data centers (ex, anyone? :)
I've been using "sj" to denote this different nature, but I appreciate the chance to clarify.
| 4:46 pm on May 14, 2003 (gmt 0)|
So GoogleGuy, more clarification please, if you can.
Our results on -sj and -fi are different. And we interpret your input to say they might indeed remain different.
That would lead one to believe that each datacenter will play by the same rules of the game, but the rules are applied to their own databanks, which will differ here and there. So results will vary from center to center indefinitely, and webmasters can expect varying results for Google searches as a general phenomenon over time, not as an exception. Which would be a change.
| 5:06 pm on May 14, 2003 (gmt 0)|
|he can really tell us much without getting in trouble |
But he gives us some information. Imagine how we would poke around without him.
| 5:09 pm on May 14, 2003 (gmt 0)|
abcdef, I think your assessment is correct that -sj and -fi are playing by the same rules. But the actual data (sites from the April crawl only on -sj so far) on -fi is different than on -sj.
| 5:12 pm on May 14, 2003 (gmt 0)|
On the SJ and FI indexes I see what I have been seeing.
The WWW and the -EX are both showing fresh tags for a site but the number of results is just very, very slightly different. Is that normal?
| 5:19 pm on May 14, 2003 (gmt 0)|
Good job Albert, well done!
As far as the length of the update, I hope it will be like good red wine: the longer we wait the better it gets...at least that's what the girls I dated used to tell me...
| 6:43 pm on May 14, 2003 (gmt 0)|
I didn't have time to read any of the summary, but I was wondering if someone could sum them all up so I don't have to do any real work.
A very nice 'from the horse's mouth' expose. For those finding contradictions, you should remember that the process is fluid, and they are trying out new things constantly, as it seems this is like no update they have done before, or at least this is the first time that we have been able to watch and ask about it to some extent.
If things aren't changing as they go to some extent, it wouldn't be like any dev environment I've ever worked in.
| 6:48 pm on May 14, 2003 (gmt 0)|
How 'bout denying access to sj and fi until all the datacenters get the same index? That would let a lot of people here get a rest and make this board return to normal. ;)
| 9:53 pm on May 14, 2003 (gmt 0)|
x_m, you could always do that yourself by just not querying sj/fi. :)
trillianjedi, just to clarify on the other point, I mentioned that backlinks could be brought in on a relatively short timeframe, but remember that we are talking about terabytes of data here--the web is a big chunk of data. I think I also replied to someone else at one point that bringing in those backlinks wasn't the sort of thing that could happen in a day or two. Hope that sheds more light on things. Phrases like "gradually" and "over time" are cues that some types of data definitely can't be brought in overnight.
| 10:06 pm on May 14, 2003 (gmt 0)|
There's talk of backlinks and filters being added and applied, but I don't see data from April's deep crawl, yet.
This to me is the most important part.
At what point will the new info be added to the index and update pagerank as well?
| 10:14 pm on May 14, 2003 (gmt 0)|
Well GoogleGuy, if you folks down at the Googleplex are too busy eating m&m's to deal with a few lousy terabytes of data, I will volunteer to take a couple kb off your hands and manually rank SERPs for some keyterms. Of course, I am only qualified to rank SERPs relevant to my own websites.
| 10:24 pm on May 14, 2003 (gmt 0)|
Albert, Thanks so much for that post...Dominic is getting so overwhelming and time consuming that I was resorting to quickly scrolling down just to read GG's posts. You've saved me a lot of time I don't have.
| 10:40 pm on May 14, 2003 (gmt 0)|
FI and SJ are very different IMHO. I have a site that is #1 in SJ and is not in the index in FI, and it is not a site that I do any aggressive SEO on to trigger any filters.
Given that FI is the source for AOL's google results, and that SJ has not materialized anywhere yet, I wonder if this means that the FI feed will become the main one?
It seems funny that they would release a version to AOL that was using old data, yet from my experience pages are still missing. The only page from my sites I can find in the index is one from several indexes ago, so I think FI is working on old data.
| 11:06 pm on May 14, 2003 (gmt 0)|
GoogleGuy's comments indicate to me that each data centre will become different to the others permanently, with each datacentre having its own theme.
That makes sense, at least to me. One of the biggest gripes against Google is that sites become totally reliant on them. If you now rank well in only half the Google data centres you'll receive half the traffic. This should go some way to relieving the webmasters' fear of being totally dropped.
It also makes sense from a theme perspective. You might want to conduct several types of search: academic, commercial, current affairs etc. Now there may be a chance to get that if Google clarifies which index serves which purpose.
Or of course it could just be wishful thinking?
| 11:11 pm on May 14, 2003 (gmt 0)|
Just to clarify:
I thought GG said that sj/fi were differently themed from the other data centers, not from each other - though it's not entirely clear, as different posts have picked up differently on what s/he said.
S/he referred to "sj and fi are different" - but s/he didn't mean from each other, if you re-read his posts.
(Hey, maybe it's actually GoogleGal :-)
| 11:25 pm on May 14, 2003 (gmt 0)|
sj/fi are of the same nature, which is different from the other data centers. At least for now..