Google Updates and Everflux, the Monthly Mid-Cycle Changes
Google does one major update per month, which generally begins sometime between roughly the 19th and the 28th of the month. The update process continues for several days, with search results appearing to fluctuate as the update progresses. Once the update has been completed, the new data migrates to Google's partner sites.
The regular monthly crawl takes place at around this time, beginning at different times for different web sites. The results of this crawl are generally reflected at the time of the following update.
Since early Summer 2002, spidering of sites and index changes have been observed going on all month, in between the regular monthly updates.
This has come to be known as Everflux, and represents Google's continuing desire and efforts to keep their search relevant, of high quality, and "minty fresh." The following information has been gathered from our Google News Forum here at WebmasterWorld in order to provide a consolidated resource for our members, particularly those new to WebmasterWorld and our Google News Forum.
GoogleGuy Explains Everflux
In a thread entitled Everflux - Google in index out of index [webmasterworld.com], GoogleGuy has graciously given us a concise, accurate explanation of Everflux:
If your site is new, or hasn't shown up in Google for long, it may be because our "fresh crawl" (which runs each day) was finding your site instead of our main crawl (which runs about once a month). Our "fresh crawl" is a newer feature, and we're still experimenting with which pages to crawl, how deeply to crawl, etc. We even reserve the right to (gasp!) not do a fresh crawl on some days because we're doing tests or reviewing new code. Someone wrote in recently and said "my site got in Google three weeks ago, and you've dropped me four times!" Nope, it's just that we don't always crawl the same pages in our fresh crawl, and we don't always crawl to the same depth. As we do a full crawl of the web, we find most of the sites from our fresh crawl and put them in our regular index. My advice on our fresh crawl is to view it as a nice "bonus" on top of Google's deep index. Users can always search our full index, but sometimes we can serve up even fresher pages as an extra nicety.
What does this mean for the average webmaster? In the words of the great Hitchhiker's Guide, "Don't Panic." Just do the normal things you should do:
1. Create a great site.
2. Submit your site to Google on our "add url" form.
3. Get a link from the Open Directory Project or other directories (Yahoo, etc.).
4. Don't panic if your site takes a little while to show up in Google. Be patient, and start to look around the web--there's lots of great advice about improving your site for users and search engines.
Hope this helps,
P.S. If some moderator or Brett wants to lock this thread, we can just point people to it as a question that is pretty common.
Brett Tabke on Everflux
A concise, clear explanation, extracted from a recent thread entitled What is up on SERPs? [webmasterworld.com]
We call it Everflux: it can act mysteriously at times.
Here's the short story on it:
Google is constantly crawling and updating selected pages that meet some predetermined criteria. That may involve last-modified dates and PR values.
Google has many data centers and runs a distributed load-sharing system across more than 10,000 PCs running Linux with 80 GB drives, at last report. Somehow, the copy of the index must get transferred to all those hard drives in all those data centers. You ever transfer 80 GB across the net? And then distribute that 80 GB down onto thousands of hard drives?
All of that takes a great deal of time. It's a constant process for Google. More than likely, the daily updates only copy out those parts of the index that have actually changed. That's yet another point where new and old data could get mixed.
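As rough arithmetic behind that question, here's a quick sketch of how long a full 80 GB copy would take over a single link. The link speeds are assumptions chosen for illustration, not figures from Google:

```python
# Back-of-the-envelope: how long does copying an 80 GB index take over
# one network link? Link speeds below are illustrative assumptions.

INDEX_GB = 80
BITS_PER_GB = 8 * 1024**3  # bits in one gigabyte

def transfer_hours(link_mbps: float) -> float:
    """Hours to move the full index over a single link of the given speed."""
    total_bits = INDEX_GB * BITS_PER_GB
    seconds = total_bits / (link_mbps * 1_000_000)
    return seconds / 3600

for mbps in (10, 100, 1000):
    print(f"{mbps:>5} Mbit/s link: {transfer_hours(mbps):6.1f} hours")
```

Even at 100 Mbit/s that's roughly two hours per copy; multiply by many data centers and thousands of drives, and it's clear why propagation is a days-long, constant process.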
Load sharing works transparently. You do a search on Google and the request is routed via DNS magic to either the nearest data center or the nearest data center with the least load (we don't know their load-distribution criteria on that).
Lastly, they could be working on the index, rolling indexes back, switching parts of the index, backing up parts of the index, rewriting some offending part of the index, deleting parts of an index - or a multitude of other actions or problems that only Google could know about.
Combine not knowing which box you'll connect to and which index it holds with the possibility of daily updating going on at the same time, and results may be unpredictable. There could be dozens of different indexes floating around various data centers - we have no clue.
One minute you'll get one copy of an index during a search, and the next you'll get another. Sometimes that could be yesterday's crawl, last month's crawl, or a crawl from four months ago.
Regular Monthly Google Updates
You'll find information about Googlebot, Google spidering and crawls and monthly updates in our:
WebmasterWorld Google KnowledgeBase [webmasterworld.com]
Regular monthly Google updates have been documented and recorded since July 2000 and are made available to us in convenient chart form in Brett Tabke's Google Update History, which is updated each month to keep it current for us:
Google Update History [webmasterworld.com]
For a good summary, see lazerzubb's
Google Update FAQ [webmasterworld.com]
Everflux at WebmasterWorld
One of the first discussions on the Everflux phenomenon took place in July 2002 and was entitled Results switching - major changes in the last 2 days [webmasterworld.com].
Starting in September 2002, we began our attempt to combine the many questions and discussions relating to Everflux into one discussion, where a brief and simple explanation was offered, in order to make the information more easily accessible and helpful to our members. There was another combined discussion in October 2002.
FAQ: Everflux - mid-month Google spidering and minor updates - Sept. 2002 [webmasterworld.com]
Everflux - October 2002 mid-month Google spidering and minor updates [webmasterworld.com]
[edited by: Marcia at 2:33 pm (utc) on Nov. 7, 2002]
Wonderful, great job Marcia, as always ;)
Googleguy writes (as quoted by Marcia above):
> 2. Submit your site to Google on our "add url" form.
I had planned to post a question at the time and didn't get around to it. I always believed that submitting through "add url" was completely useless: if you have links, Google finds you, so "add url" isn't needed; if you don't have links, Google will ignore you, so submission isn't useful.
So my question is: What exactly does submission do to help you?
Marcia, that is excellent.
Brad, so many people post who get nervous because their sites appear and then suddenly disappear. I was hunting high and low for that post of Brett's. When it turned up in a bookmark, it had to be posted alongside GoogleGuy's, somewhere easy to find, so folks could know the reason for the ins and outs they're seeing.
Mohamed, I also used to think the submit was useless, and it probably isn't absolutely necessary; sites do get found through links. But now I think Googlebot will grab anything, no matter how it's discovered. To a degree it still won't help much with a new site if there isn't enough PR to rank, which is sometimes the case, but then again, at least Google knows they're around and will be back.
Some people are also concerned when they see no PR or PR1 with their sites, which doesn't automatically mean a penalty. It can just mean that's what the PR currently adds up to - not unusual for newer sites that don't have much in the way of links.
I've got a site that went up over a month ago that was submitted and has a couple of links, one from a page that gets Fresh crawls. It's not grey, it's toolbar guesswork PR4 because it's on my ISP space, but still not indexed; Google just doesn't seem to want it. :)
Marcia, nice overview!
I am not sure of the below, but I would not be surprised if something to this effect is in place:
I have a feeling there are two Everflux levels:
1. Certain pages within a site get Fresh crawling almost every day - mostly the pages with the highest PageRank within the site, preferably linked from pages that are already constantly Fresh.
2. Certain pages get short-term Fresh status - the pages that are newly linked (whether internally or externally) from the Fresh pages mentioned in 1.
The criterion for a page being promoted to permanent Freshness is probably:
- Consistently receiving new links from Fresh pages in the recent past, above a certain PageRank threshold.
- A PageRank credit:
PR6 - three permanent Fresh pages for the highest-PageRank pages of a site.
PR7 - twenty permanent Fresh pages for the highest-PageRank pages of a site.
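The hypothesized two-level scheme above could be sketched roughly as follows. To be clear, everything here - the thresholds, field names, and status labels - is invented to illustrate the guess, not confirmed Google behavior:

```python
# A sketch of the *hypothesized* two-level Fresh scheme. All thresholds
# and names are invented for illustration; none of this is confirmed.

from dataclasses import dataclass, field

@dataclass
class Page:
    url: str
    pagerank: int
    # PageRank values of Fresh pages that newly linked here recently:
    recent_fresh_inlink_pr: list[int] = field(default_factory=list)

PR_THRESHOLD = 5       # assumed minimum PR for a qualifying Fresh inlink
MIN_FRESH_INLINKS = 3  # assumed cutoff for "consistently receiving" links

def fresh_status(page: Page) -> str:
    qualifying = [pr for pr in page.recent_fresh_inlink_pr if pr >= PR_THRESHOLD]
    if len(qualifying) >= MIN_FRESH_INLINKS:
        return "permanent-fresh"   # level 1: crawled almost every day
    if qualifying:
        return "short-term-fresh"  # level 2: temporary Fresh status
    return "monthly-only"          # otherwise, wait for the main crawl

print(fresh_status(Page("/", 6, [6, 7, 5])))   # permanent-fresh
print(fresh_status(Page("/new", 3, [6])))      # short-term-fresh
```

If something like this were in place, it would explain both the stable set of daily-crawled pages and the pages that flicker in and out between monthly updates.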
This thread also covers part of the above:
Excellent. This should help a lot of people both asking and answering questions.
But I was wondering: if being the first to correctly identify an update is worth a mousepad, what does having started the first Everflux thread merit?
Do I have to send everybody else a mousepad?
Thanks for the link to that July thread, Marcia! I was bragging last week (as one tends to do when beer is involved) that I'd created a new word that was starting to spread across the web. I could show that the word was turning up on multiple webmaster sites, but I couldn't for the life of me find the thread where the word started. Now I get to drink for free this weekend. I owe ya a beer!
Submission gives a heads-up to Google that there is at least one change to go look at. Some say it's useless. But I use it whenever I want to bring a new page or site to Google's attention. Evidently GoogleGuy feels it has a role.
Excellent work! :)
Great post Marcia, very helpful.
Thank you, Marcia. Great post. The post DOES raise some questions, however, and I'm a bit surprised no one's at least asked. So here I go:
Can we have some good guesses at the factors involved in updating SERPs for the "Dance" vis-a-vis the "Everflux"? For example, we know that backward links are recalculated and rolled out during the "big" update, with varyingly significant impact. But what else happens ONLY then? An analysis of a site's theme or internal linking structure, perhaps?
Would that mean that SEO adjustments to titles, page kw's, meta's (even), etc. get counted during the ~daily tweaks? What's been everyone's experience? What changes can affect a SERP during the Everflux? Ideas/Answers...? GG? ;-)