SERIOUS Google update algo analysis thread (Dominic)

NO whining or cheering about how your site is doing in this one.

         

rfgdxm1

6:21 am on May 6, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This is a continuation of an idea for a thread that I started a few updates back. The topicality is listed below, and the expectation is that this thread will be restricted to just that. In the case of this Dominic update, GG has stated that other aspects of the update will be rolled out as the update develops. Thus, for this update it is possible that the observations made early on will not hold true by the end of the update. This is OK, because if patterns like this hold true for later updates, members here can use the search feature to find this thread and see how past updates developed.

----

I'm starting this thread because another member suggested such would be a good idea because the main Google update thread is cluttered with posts like "OMG, I've been dropped in the new index!" and "Yippee, I'm now #1 on a key SERP". This thread is ONLY for serious, generic discussion of changes that you are observing with the new algo in this update. As in things like "Looks to me like PR is less important this month, and anchor text of inbound links counts more.", etc. How your site is doing has no relevance here unless you can explain why you think so in terms of a general algo update.

biggles

7:03 am on May 9, 2003 (gmt 0)

10+ Year Member



Thanks rfgdxm1 - that makes sense and is very appropriate given the virtual storm of postings that follows a dance.

NeverHome

7:35 am on May 9, 2003 (gmt 0)

10+ Year Member



The Hurricane Name Model is an excellent way to refer to each Google Update, but after Dominic I reckon the Richter Scale [seismo.unr.edu...] would provide a more suitable paradigm to gauge the destructive force (or lack thereof) of each update.

I put this one as a 6.0

Powdork

7:44 am on May 9, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"I put this one as a 6.0"

I was thinking it was more like an F5 tornado, but then, it must be a hurricane, cyclone, or typhoon; for now we're in the calm of the eye. When we come out the winds will be blowing in the opposite direction. :)

NeverHome

7:47 am on May 9, 2003 (gmt 0)

10+ Year Member



Powdork... well I thought "At most slight damage to well-designed buildings. Can cause major damage to poorly constructed buildings over small regions." just about summed it up so far!

h_b_k

1:51 pm on May 9, 2003 (gmt 0)

10+ Year Member



I don't know what kind of filters Google is testing on www-sj and www-fi,

but I will describe what I have observed:

I have two different sites, neither of which is well optimized for a keyphrase that uses synonyms of my main keywords.

One site is a personal homepage describing different web projects; the other is the site of one of these projects.
The project site has a page about the project's motivation.
The personal homepage has a page describing the motivation and structure of the project's site.
Of course there are several similar paragraphs in these two pages.
And of course there are reciprocal links between these two sites, but not directly between these two pages.

Okay, what I have observed using this non-optimal keyphrase is:

on www, both pages are shown within the first 10 pages of the SERPs

on www-sj and www-fi, only one of them (the personal homepage with the project description) is shown in the SERPs

but, using the link "repeat the search with the omitted results included" after the very last search result, Google produces a SERP in which both pages (from both sites) are within the first 10 pages

is this interesting?

annej

12:38 am on May 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"because of a surge in new widgetting sites? It's a very popular hobby..."

Yup, that widgeting is sweeping the net. ;) I think it got started here at WebmasterWorld.

Back to -sj. Even though it seems likely the current sj results won't be the final ones, it would be interesting if we could figure out what it is doing right now. It looks to me like -sj is working with the same data now. Both www and www-sj seem to have the same fresh bot info. I don't know about the deep crawl though. Since my sites get fresh dated regularly it's hard to tell.

Does it look like any spam filters have come into play yet? I can't tell in my topic.

I haven't seen a change on www or www-sj in backlinks for some time.

I checked out a little more data on my two same themed sites. It seems like everything is backward there. I use the word sites but I mean the homepage of each site.

"Widgeting" has a higher but not unreasonably high word density of the word 'widgeting' while "Widget" has a much lower WD.

"Widgeting" has far more backlinks than "Widget". This is true in comparing both www & www-sj

"Widgeting" has a PR 6 while "Widget" has a PR5

"Widgeting" has the keyword 'widgeting' in the title, h1 tags, h2 tags and meta description while the "widget" site does not have 'widgeting' in any of them.

The writing style, page layout and types of backlinks are very similar for both sites.

It is totally beyond me why the expected one is ranking so much lower on the keyword "widgeting" than the widgetingless site.

There has got to be something here I am missing and it may be the key to it all.

Anne

mrguy

12:41 am on May 10, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



From what I see, it looks like they are going easy on repeat spam offenders.

A site that was banned for numerous things less than a month ago has now just popped back into the serps with their Doorway pages intact.

I would think a filter would see the doorway pages, but maybe they are OK now.

I give up!

sullen

10:19 pm on May 13, 2003 (gmt 0)

10+ Year Member



Anne - I'm interested in this. Does the widget site link to the widgeting one and / or vice versa? Is one particular backlink for the widget site being given extra weight or something?

Powdork

10:29 pm on May 13, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sullen,
Since it's already May 13 where you are, can you tell me when the backlinks will be added in?

<added>nevermind;)</added>

northweb

10:35 pm on May 13, 2003 (gmt 0)

10+ Year Member



I've noticed that many of my pages that were added in March and April are gone in sj, www2 and www3. Anyone else running into this? Any ideas if they will show back up shortly or will I have to wait till next month?

Aylah

10:36 pm on May 13, 2003 (gmt 0)

10+ Year Member



Wow, I was sure it was May 11...guess I slept in longer than I thought :)

percentages

2:42 am on May 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Before we can discuss serious algo changes on -sj or -fi I think we have to wait for data stability. At the moment it looks like algo changes are going to happen, but it is difficult to tell with only half a picture to look at.

Here is what I see:

1. Google's snippets seem to be from April 9th for most PR5+ sites I checked. Typical example is the Whitehouse site. - Snippet from 4/9/03

2. Google's cache is anywhere from April 9th to May 3rd and this seems to depend on PR/Freshbot activity. The Whitehouse has a cache from May 2nd, even though the snippet is from April 9th. Can't find any cache after May 3rd which is the BBC's site and they have a slight time difference to take into consideration. Yahoo, CNN and all other big players seem to be May 2nd.

3. There seems to be a problem with sub-domains and links. Try searching for webmasterworld.com (no www.) on -sj and it doesn't exist....add the www subdomain and it reappears.

This may explain the drop in backlinks for some sites. On www link:yahoo.com and link:www.yahoo.com produce the same results, not so on -sj and -fi.

I don't see Google passing these results to www until at least this issue is resolved.

I have no real idea what Google are up to on -sj and -fi. Not a lot seems to be changing and things seem to have stalled.

I have sites that have lost pages that were definitely available when the rest of the site was crawled and cached on April 11th. I have to conclude that either some of the April deepcrawl data has gone MIA or Google hasn't yet applied it.

If, as some have stated, we are seeing the update in progress and the "dance" or moving of data to www will happen shortly then I have two questions.

1. Surely the backlinks should be changing as new sites/pages are added to the equation?....I've only seen one change in backlinks in the last 5 days. I would expect to see changes occur more often if not continuously.

2. The lack of activity suggests that either Google has now hidden the process (so we can't see it) or they have halted it altogether for some reason.

I feel like I am watching a group of programmers debug their code...never a pleasant experience and typically takes incalculable amounts of time;)

I seem to remember reading GG say we would see the -sj and -fi results filter down to www during the next few days...how many is a few? Seems like it has been a lifetime already!

Once something shows up on www then maybe we can really start to get our teeth into what is happening, until then it seems too bizarre and for some painful.

The next probable problem is that with Google using deepcrawl data so soon after results are moved to www there is no time available to adjust for any algo changes. Can't adjust now with half-baked results, can't adjust later for the next update if the crawl again happens immediately!

If this update does ever happen it could easily be two more updates before any advantages can be taken or damage controlled....whichever way it turns out :)

cheater copperpot

2:55 am on May 14, 2003 (gmt 0)

10+ Year Member



btw i do see May 13th as the date here in the WW forum.. did i hit a time warp in this post?

Anon27

3:59 am on May 14, 2003 (gmt 0)

10+ Year Member



I do not mean to be a party pooper, but can we put this thread back on topic per rfgdxm1's start of the thread:

"SERIOUS Google update algo analysis thread (Dominic)
NO whining or cheering about how your site is doing in this one."

Period.

Thanks,

steveb

4:45 am on May 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"can we put this thread back on topic"

LOL, there has been no update! The topic is premature regarding an update. The only thing to speculate about here is the data shift and observations about it. Once an update occurs then people might be able to analyze the update. Those who can see the future of course are free to start without the rest of us....

rfgdxm1

5:02 am on May 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The problem is that this update is far different than any I have ever seen. Usually, the early results of the update aren't that much different from the final results. And, every update I have seen has been finished in less time than this one has gone on. It appears that for whatever reason, someone at Google hit the "pause" button. We really need to wait for this to all finish before doing serious analysis.

Anon27

5:23 am on May 14, 2003 (gmt 0)

10+ Year Member



I did not mean to come across as a hard A$$.

This is the only thread that I monitor intensely, and for good reasons. While I read many others, this one has a focus, and that focus looked like it was going off course.

It does not matter if this is a current update or not; whichever camp you are in is not important. This thread is important.

That’s all.

wackmaster

6:19 am on May 14, 2003 (gmt 0)



<Before we can discuss serious algo changes on -sj or -fi I think we have to wait for data stability. At the moment it looks like algo changes are going to happen, but it is difficult to tell with only half a picture to look at.>

Indeed! And just to be clear, we are STILL seeing numerous sites in -sj and -fi that show data from the Feb. index. So G is not using a complete recent index - maybe there's some new way of updating it on a rolling basis, as I think others have also speculated.

If Google plans on publishing the current -sj without updating the index fully, then we have a real mess on our hands...but that still seems highly unlikely, if only because those guys know something about quality.

So if the new index is not fully in -sj yet *and we see that it's not,* then, we're playing with less than a full deck...

Zapatista

8:01 am on May 14, 2003 (gmt 0)



This is turning into the eternal update.

I hope they never update and I am having serious doubts they will which is good.

Many companies die when they start trying to micromanage it to death in an effort to fix something that was never broken.

A thread a long time ago told Google of improvements they could make as suggested by webmasters here. None of them have even begun to be implemented and instead we get this alien.

I think the whole internet is over-rated. Thank god for LexisNexis and findarticles.

steveb

9:33 am on May 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"This is turning into the eternal update."

My evolving view... Google's March deepcrawl was so botched that they felt they had to revert to the February deepcrawl data first, before applying the April deepcrawl data. This explains why these old/trashy results are mixed in with as much freshbot data as possible. The update on the other hand will occur when it normally would have time-wise in relation to the previous one. Which would mean that now is about the absolute earliest we could have expected one, but more likely it will be in a week or ten days.

Lesson: an old deepcrawl plus freshbot data produces lousy search results.

cheater copperpot

10:34 am on May 14, 2003 (gmt 0)

10+ Year Member



>>This explains why these old/trashy results are mixed in with as much freshbot data as possible.

I would love to say that also steveb...

But unfortunately i have seen googleguy say multiple times that this is NOT an old index and argues that the sj results are fresher than www.

Scary..

Powdork

10:34 am on May 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Suppose we're not looking at a continuous update rolling in. Maybe it's just a new algo and it's not ready. It doesn't mean it's late. Since November 27 there have been 4 updates in 136 days. That's 34 days per update. On time would be May 15-16.

It seems that the deep crawl cycle is now every five weeks, or just over 10 per year, rather than twelve. Two fewer deepcrawl/dance cycles per year must represent a substantial savings.
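
For anyone who wants to check the arithmetic in the post above, here is a rough sketch. It assumes "November 27" means November 27, 2002 (the post does not give the year) and that the 136 days run up to the most recent update; the variable names are just for illustration.

```python
from datetime import date, timedelta

# Assumption: "November 27" refers to 2002; the post does not state the year.
first_update = date(2002, 11, 27)
days_elapsed = 136   # figure quoted in the post
updates = 4          # figure quoted in the post

# Average length of one update cycle, in days
avg_cycle = days_elapsed / updates

# Assumption: the 136 days end at the most recent update
last_update = first_update + timedelta(days=days_elapsed)

# An "on time" next update would fall one average cycle later
next_expected = last_update + timedelta(days=round(avg_cycle))

# A five-week (35-day) deep crawl cycle, as suggested in the post
cycles_per_year = 365 / 35

print(avg_cycle, last_update, next_expected, round(cycles_per_year, 1))
```

Under those assumptions the average works out to 34 days and the next "on time" date lands on May 16, 2003, which matches the May 15-16 estimate. A 35-day cycle gives roughly 10.4 deepcrawls a year.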

steveb

12:21 pm on May 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Don't misunderstand what he is saying. Obviously the backlinks are in fact older, and also the toolbar PR. "Fresher" means more freshbot pages. "Fresher" in terms of a percentage of pages isn't meaningful when old backlinks and PR are displayed, and presumably used.

wackmaster

5:27 pm on May 14, 2003 (gmt 0)



<But unfortunately i have seen googleguy say multiple times that this is NOT an old index and argues that the sj results are fresher than www.>

<Don't misunderstand what he is saying. Obviously the backlinks are in fact older, and also the toolbar PR. "Fresher" means more freshbot pages. "Fresher" in terms of a percentage of pages isn't meaningful when old backlinks and PR are displayed, and presumably used.>

Yes, still looks like an old index, plus input from Freshie...and not just older backlinks and PR...

As of Monday a.m., still true that we see numerous listings in -sj reflecting OLD crawls...no more recent than March, and in fact looking more like Feb...

Big outstanding questions IMHO include:
--Will the April deepcrawl find its way into -sj before -sj goes live (IF it's -sj)?
--How many filters are not yet turned on in -sj?

Sure hope there is an oven somewhere in the Googleplex that is baking the new algo into the newest index...

parabola

5:36 pm on May 14, 2003 (gmt 0)

10+ Year Member



>>Lesson: an old deepcrawl plus freshbot data produces lousy search results.

Steveb, you have it on the money.

I think some webmasters may be confused as fresh results are showing on sj/fi, but the fact still remains that this is an old index.

I can deal with algo changes, new filters, update changes, etc..BUT not getting credit for ANY links from the past 2 months, even links that are showing in google.com today, would be a tough pill to swallow.

wackmaster

5:44 pm on May 14, 2003 (gmt 0)



parabola,

<I can deal with algo changes, new filters, update changes, etc..BUT not getting credit for ANY links from the past 2 months, even links that are showing in google.com today, would be a tough pill to swallow >

We're assuming that the recent backlinks and the recent index go hand-in-hand. When we see the recent index show up, the newer backlinks should be there too.

We're just wondering if the new index will be baked in before -sj goes live. anybody's guess...the Googleplex moves in mysterious ways.

parabola

5:53 pm on May 14, 2003 (gmt 0)

10+ Year Member



wackmaster,

Google Guy has stated repeatedly that he expects sj to show up on more datacenters first, before we see any new backlinks added. This is what concerns me, because it could be some time before we see a somewhat current index after this occurs.

wackmaster

6:02 pm on May 14, 2003 (gmt 0)



This is what concerns EVERYBODY, I think.

I know this is not helpful empirical data, but I just don't believe they will go live with such an old index. They could have published by now, right? So, why haven't they?

There had to be a reason why Google let us in on the build this time. It had to be because they wanted the input. Now they've got it. Something is making them wait.

It doesn't affect us much one way or the other; we're OK in both www and -sj.

But GG is allowed to change his mind...or...others at the Plex may have changed their minds given all the input, and either way, eventually the newer index will appear and all will be well with the world again.

annej

7:21 pm on May 14, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks to Sullen I now know the big factor that brought my 'widgeting' page down in sj. It appears that sj is giving more weight to root directory index pages and my 'widgeting' site is the index page of a subdirectory (mydomain.com/widgeting). My 'widget' site is widgetdomain.com, so that explains why it popped up above 'widgeting' in spite of 'widget' having fewer backlinks, lower PR, etc.

So one factor with sj seems to be that preference is given to domain.com over domain.com/whatever.

gpmgroup

7:59 pm on May 12, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Looks like the wait is over... time for some serious analysis. The new index on www looks very much like www-fi