I also notice that Google seems to have 2 banks of datacenters. Only one of the 2 does the partial update; for the next partial update, the other bank of datacenters is used. Looks to me like -ex, -in, and -zu are involved this time, and possibly -va. However, it could be that all of these are being rerouted to one physical datacenter. Looking at traceroutes, this may be the case.
Strangest thing about the SERPS is that pages are #1 on some datacenters and #100-500 on others, so you wind up getting traffic on and off for the day. CW even has some pages that were buried showing up out of nowhere.
1. My index page disappeared for three days,
2. I could find it by searching for other keywords, although it appeared indented under the contact page.
3. On each of those three days it had fresh tags.
4. This morning there is no fresh tag, and it's back where it belongs.
Also, Marval, I think you pointed out that the missing index page can sometimes be found with other keywords. I noticed that too.
My title is five words. It was top 10 for "word1 word2" or "word3 word4". While it was missing, I discovered I could find it with "word1 word3 word5", although again it was indented under a contact page or something.
I also discovered that using the dance tool with 25 items was hiding a lot of the activity. I now use it with 10 and am catching changes.
1. Backlinks from guestbooks are not filtered
2. ODP listings weigh heavily on ranking, even to the point that sites that have changed theme, or retired domains, still rank on their old theme, sometimes at #1.
3. From time to time, no clustering of domains in the results. Meaning the SERP might show domain.com/page1.html at #3 and, further below, domain.com/page2.html at #12.
4. Google superbot doesn't go deep on sites with PR less than 5
Feel free to add your observations.
Maybe you're right, maybe there are exceptions :)
g1smd,
Just to humor you, I've done exactly what you said, and the result is...there are now 5 URLs representing 1 domain spread throughout the results. Tried it in -ex and -fi; same thing.
It used to be that when there were more pages from a given domain for a specific query, at most you would get 2 URLs and a link for 'More results from this site'.
Thought the index had settled... the index page was back and doing well for important keywords for 2-3 days.
Disappeared again 12 hours ago.
When my index page was showing about right for the most important keyword searches over the last couple of days, there were no fresh tags, although other sites unaffected during this unstable time were showing fresh tags.
At the same time, finding the index page through lesser searches, ones that stay stable and don't drop in and out every couple of days (as they do for the most important searches), did show fresh tags. So for the stable searches, fresh tags are shown on the index page; but when the index page comes back in for the important searches, it shows no fresh tags. Same index page, different search keywords. Consistently, if no fresh tags are showing, the index page for those searches will come and go.
This is the only consistent thing I have been able to see.
Thoughts?
Who knows, really? I think the dance seems to be over, but with MAJOR instability remaining for reasons unknown.
Pre update - index page #6
During update - index gone
6-26 - index reappeared at #21 (w/Ftag) but was stable all day
6-27 and 6-28 - index dropped further to #47 (w/Ftag) again stable both days at all DCs
Today - index completely gone again, replaced by my other pages ranking #70 and #71
I made NO changes during this time, so I see NO pattern. Last night I removed the H1 tags to see if it would help. Shortly thereafter, Freshbot crawled deep, so we shall see tomorrow if it does any good.
Since then, all of my sites have been solid; but a few others I have looked at have moved around a little, or been slightly different in the different datacentres.
Some people are reporting roller-coaster rides this last week, but I am just not seeing that here.
It's been almost two months now. Longest cold in history.
That is the best analogy I've heard so far.
They almost had it right, then boom - they revert back to some crazy idea they had that we call Dominic.
I strongly believe they are having SERIOUS issues with the data. There is no other reason for pages popping up at #1 on and off for days, then disappearing entirely and reverting to dominitis.
Wonder if Yahoo! and AOL appreciate showing their users new SERPS every 20 minutes? :)
Tonight I've just noticed significant ranking changes on older sites which have been very stable for more than six months.
I guess this means that new ingredients are indeed being added, and/or rankings are still being recomputed.
Still missing/upside-down topical/index results in places, but several positive moves from sites whose value Google has a way to judge, at the expense of the fresh piffle that has been tossed up in the past month.
Maybe I'm being too optimistic, but this is the first glimmer of positive development, although without the righting of the topical/index pages it is still scary.
Liquid PR? Big changes afoot? - Brett_Tabke, 10:44 am on April 12, 2003 (cst -5)
The Google Update describes the reindexing behavior of Google. Google has two modes of refreshing and adding sites to its massive 3 billion page index.
1) Full Update
Based on a full crawl of the web to acquire all the pages that it can. All pages are refreshed and the search results adjusted.
2) Continuous Daily Indexing
Google now has the ability to update the index based on continuous crawling. The crawling is done via a spider we have nicknamed FreshBot. The name comes from the fact that Google adds Fresh! next to pages that have been updated within the last 72 hours.
Full Update:
Google's full update has occurred approximately once a month for the last three years. We maintain a Google update history page.
When the full update occurs, results have historically floated back and forth between the new and the old results for a few days. This behavior seems to perplex many site owners and operators.
Search engine history buffs recognize the flip-flopping results behavior. It is the process that drove SEOs to the brink of insanity when Inktomi serviced Yahoo back in 1999. With data centers in California and in Virginia, Inktomi results would appear to switch at random as queries were routed between the two centers, which held different indexes.
Google is composed of more than 50,000 PCs sitting in six active data centers around the planet. There are two additional centers (simple offices?) that are unused and whose purpose is unknown. Google is not specific about how many computers it has. They say they do not know themselves; who can count that high?
The heart of the Google system is a heavily modified flavor of the Linux operating system running on standard-architecture PCs. Each PC reportedly has a single 80 gig IDE drive.
There are two key components that drive the business end of the Google operating system. The first is a custom file system that supports large files. When we say large, we mean HUGE, as in they span the entire 80 gig drive. The system then uses a proprietary random file access method across the entire index file. Computer buffs will recognize the technique as the same one employed by Commodore random-access file types, such as those supported by the classic Commodore 1541 floppy drive.
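To make the idea concrete, here is a minimal sketch of fixed-offset random file access in Python. The 64-byte record layout, the file name, and the read_record helper are all hypothetical illustrations; nothing is publicly known about Google's actual on-disk format.

```python
import struct

RECORD_SIZE = 64  # hypothetical fixed record width, in bytes

def read_record(index_file, record_number: int):
    """Jump straight to one record in a large index file.

    Random file access: instead of scanning the file, seek to byte
    offset record_number * RECORD_SIZE and read exactly one record.
    """
    index_file.seek(record_number * RECORD_SIZE)
    raw = index_file.read(RECORD_SIZE)
    # Hypothetical layout: 8-byte doc id, then 56 bytes of packed data.
    doc_id = struct.unpack("<Q", raw[:8])[0]
    return doc_id, raw[8:]

# Demo: write two fixed-size records, then fetch the second directly.
with open("index.dat", "wb") as f:
    for doc_id in (101, 202):
        f.write(struct.pack("<Q", doc_id).ljust(RECORD_SIZE, b"\x00"))

with open("index.dat", "rb") as f:
    print(read_record(f, 1))  # -> (202, 56 bytes of padding)
```

The point of the technique is that lookup cost is constant no matter how big the file grows, which is why a single index file spanning an entire drive stays workable.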
The second component of the Google system is a custom web server. Little is known about its origin. We have heard that it started out as an earlier version of Apache and was modified and optimized for pure speed.
There are currently in excess of three billion pages indexed by Google. If you take a web page, strip it of the HTML code, and then compress it with something such as Zip, the size of the file falls dramatically. The web page you are viewing right now could fit in under 1k of data. This is how Google is able to squeeze a three-billion-page index onto one single drive.
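You can check that arithmetic yourself: strip the markup, then compress what is left. This is only a rough sketch; the crude regex tag-stripper and the sample page are mine, not anything Google uses.

```python
import re
import zlib

def compressed_text_size(html: str) -> tuple[int, int]:
    """Return (raw HTML size, compressed size of the text alone)."""
    text = re.sub(r"<[^>]+>", " ", html)      # crude tag stripper
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return len(html.encode()), len(zlib.compress(text.encode(), 9))

# A made-up page: repetitive markup compresses away almost entirely.
html = "<html><body>" + "<p>Google update chatter.</p>" * 200 + "</body></html>"
raw, packed = compressed_text_size(html)
print(f"raw html: {raw} bytes, stripped+compressed: {packed} bytes")
```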
In addition to the large main search indexes sitting on the PCs, there are other computers that serve cached pages and services such as Image Search, Google News, Froogle, and the Google edition of the Open Directory Project.
When a user visits Google and performs a search, the search will be processed by one PC. The query is most often routed to the data center physically nearest to the user. What results the user gets back for any query will depend on which index that particular PC has stored on its local drive.
When Google updates its index, that index must be distributed to all those computers. The file is probably (Google has never stated) transferred down to a single computer at the data center, which in turn distributes it to a cluster, and finally down into each PC. The amount of data that must be transferred and retransmitted to update each PC runs into terabytes - possibly petabytes.
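That hierarchical fan-out might look something like the sketch below. The node names, data structures, and the send stub are assumptions for illustration only; Google has never described the actual mechanism.

```python
from dataclasses import dataclass, field

@dataclass
class Cluster:
    head: str
    pcs: list[str] = field(default_factory=list)

@dataclass
class DataCenter:
    head: str
    clusters: list[Cluster] = field(default_factory=list)

def send(node: str, blob: bytes) -> None:
    # Stand-in for whatever transfer Google actually uses.
    print(f"push {len(blob)} bytes -> {node}")

def distribute(index_blob: bytes, data_centers: list[DataCenter]) -> None:
    # One WAN transfer per data center, then fan-out inside each center.
    for dc in data_centers:
        send(dc.head, index_blob)
        for cluster in dc.clusters:
            send(cluster.head, index_blob)
            for pc in cluster.pcs:
                send(pc, index_blob)  # every PC ends up with a full copy

dc = DataCenter("dc-ex", [Cluster("rack-1", ["pc-1", "pc-2"])])
distribute(b"new-index", [dc])
```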
Remember that you don't know which data center you are going to connect with, or which index that PC will have on it. It could be the new index, or it could be the old index. This will make results appear to fluctuate to the user during updates.
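The resulting flip-flop is easy to simulate. In this toy model (the query, domains, and rankings are invented), each search lands on an arbitrary data center, some of which still hold the old index:

```python
import random

# Hypothetical ranked results held by different data centers mid-update.
OLD_INDEX = {"widgets": ["example.com/", "rival.com/", "other.com/"]}
NEW_INDEX = {"widgets": ["rival.com/", "other.com/", "example.com/"]}

# During an update, some centers serve the old index and some the new.
DATA_CENTERS = [OLD_INDEX, OLD_INDEX, NEW_INDEX]

def search(query: str) -> list[str]:
    """Each query lands on an arbitrary data center, so the same
    query can return either index's rankings from minute to minute."""
    return random.choice(DATA_CENTERS)[query]

for _ in range(3):
    print(search("widgets"))
```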
Last year, Google introduced a new method of continuously updating its index. We first noticed it when they started to add FRESH! and the date it was last indexed next to each search listing. Somehow, Google had built a system that could almost continuously update itself with fresh listings.
As you are aware, at the heart of Google's technology is a rankings algorithm based upon web citations (backlinks) called PageRank (PR). The calculation of PR takes a great deal of time, and this has led to the speculation that it is the reason Google only refreshes the full index once a month.
Theoretically, Google could calculate the PR value for any page entirely on the fly. The system could download a page, extract the links on that page, update the database entries for all those target pages, and finally recalculate the PR of the current page before moving on to the next page. Thus, PR would move from being a static monthly variable to an ever-changing Liquid variable that is continuously updated. We feel this is exactly what Google has been moving towards with its FreshBot indexing.
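A toy sketch of that crawl-and-recalculate loop follows. It assumes a simplified one-pass PageRank update; real PageRank iterates to convergence over the whole link graph, and Google's internals are unknown.

```python
DAMPING = 0.85

# Hypothetical in-memory "database": score, inbound links, outdegree.
ranks: dict[str, float] = {}
inlinks: dict[str, set[str]] = {}
outdeg: dict[str, int] = {}

def recalc(url: str) -> None:
    # Standard PageRank formula applied to one page at a time;
    # pages not yet crawled default to a rank of 1.0.
    incoming = inlinks.get(url, set())
    ranks[url] = (1 - DAMPING) + DAMPING * sum(
        ranks.get(src, 1.0) / max(outdeg.get(src, 1), 1) for src in incoming
    )

def crawl_page(url: str, outbound_links: list[str]) -> None:
    """One step of 'liquid' PR as speculated above: record this page's
    links, then recompute affected scores immediately instead of
    waiting for a monthly batch job."""
    outdeg[url] = len(outbound_links)
    for target in outbound_links:
        inlinks.setdefault(target, set()).add(url)
        recalc(target)  # targets gain a citation right away
    recalc(url)

crawl_page("a.com/", ["b.com/", "c.com/"])
crawl_page("b.com/", ["c.com/"])
print(ranks)  # c.com/ rises as each new citation is discovered
```

Under this scheme a page's PR would drift continuously as FreshBot works, which would explain rankings that shift between crawls rather than only at monthly boundaries.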
The phenom is simple: fresh pages rank higher than other pages. Although there appears to be a Fresh listing cheat factor that is placed on every listing, we don't think that is entirely what is occurring. Based on subtle clues, including the behavior of the recent index, we feel Google has moved towards Liquid PageRank that is continuously updated. It may mark the complete end of the monthly updates.
Brett also did an interesting post in rfgdxm1's thread too.
[webmasterworld.com...]
Looks like an endorsement from here. :)
Fresh, everflux, and superflux very likely are the path of the future, but it would be very wrong to draw conclusions about that based on the pattern of what has been occurring. What has been occurring was stated to be more in line with the past than with the future.
This is a very important consideration for webmasters sitting around considering how to make their sites more Google-friendly. You have to think about the future, not the past, even the recent past.
I tend to agree. GG likely can't come out and directly spill the beans about what is going on inside Google. However, he certainly isn't casting any doubt on this speculation about a continuous, rolling update; in fact, he seems to be hinting that this guess is right. Plus, I can see no explanation for what is going on at Google other than this. I am seeing all kinds of periodic shifts in SERPs that can't be explained by traditional everflux and freshbot action.