|Why does the 'Google Lag' exist?|
Trying to understand its purpose.
I had some in-depth discussion this weekend with some friends about the sandbox. Every theory on how to beat it kept coming back to one central problem - no one is sure why it exists.
I feel very strongly that until we have a good grasp on why it exists, it will be very hard to beat.
I don't buy the explanation that it's intended to be a method of stopping spam. Why? One, it's doing too much collateral damage. Two, if you accept the 80/20 principle (20% of spammers are doing 80% of the spamming), and you realize that there are already multiple ways of beating the sandbox that all of those spammers are aware of, it doesn't make sense anymore.
So, why does the sandbox exist?
The most obvious effect of the sandbox is that it prevents new domains (not pages) from ranking for any relatively competitive term. So, start thinking like a search engine - what would be the benefit of this?
>Still, I lean toward thinking lag time exists to combat "fake quality", that is, sites buying high quality links to pretend to be of quality.<
So please explain how g can tell the difference between "quality" and "rubbish" according to PR?
not to mention if they are paying (ok some make it obvious)
jeez, nassa cut.. that new link to me please -;
Maybe g checks out all links to "rubbish sites"
I don't think so!
Take a look at all "big sites" most provide links to "rubbish sites"
"So please explain how can g tell the difference between "quality" and "rubbish" according to PR?"
That is the business Google is in!
|So please explain how can g tell the difference between "quality" and "rubbish" |
What if Google just began updating the index and PR continuously, and not once every n months?
That, by the way, explains why there is no major PR update. There may be no more major PR updates in the future; however, individual PR will still be updated.
Then Google can get reliable results for new sites only after some time, the time needed to crawl all sites on the Internet. If a new site happens to get too many inbound links at the start, Google sandboxes it until it has the whole picture.
It also explains why new sites often initially get some PR. If their PR does not deviate significantly from the average PR on the Internet, this PR may stay. If it deviates, the site is sandboxed until Google gets the whole picture.
What is this so called sandbox? im a noob... teach me.. =P
>>lag time was created in part to help stabilize the serps pre-ipo
I've heard this mentioned by some, but I don't get it. I'm clueless on stocks, but would one really use the current/then SERPs as part of a buy decision? If I were an interested buyer, I'd want to see the company act as it always had: "Stabilize...you mean Google was 'unstable' before?". Would Google really throw a wrench into a smooth system that's already a sure bet as an IPO?
It also seems this type of deliberate action would lead them to the slippery slope of potential SEC violations--stabilize the SERPs right before a split or shareholder's meeting? I can't see it.
|it might be evidence that Google is no longer exclusively looking at the web as a bunch of pages, but as pages that belong to sites. |
Interesting posts scarecrow, thanks.
I'm not sure if this fits with what you're suggesting, but I changed domain names 2.5 months ago on a well established, fast-growing site. I decided to switch before it got bigger, and to suffer now rather than later.
I 301'ed the old site to the new site. All the old content is deep in the sandbox re SERPs; however, most new pages and content I'm adding are not in the sandbox, and rank fine for their specialized terms, about the same as anything I put in before I switched domain names. Two possibilities that I can see:
1) If new content is not sandboxed and old content is, Google is comparing the old site to the new site somehow, letting in the new stuff but keeping out the old.
2) Or it's just the original pages that are sandboxed? But that doesn't fit with what many people are reporting: a full sitewide sandbox on a new domain name, all pages.
It's hard to see how that would happen to only old pages and not new ones without some reference to the site as a whole. If a sitewide sandbox exists, it doesn't seem to apply in my case; in other words, Google is differentiating between the group of old URLs and the new URLs despite the alleged sitewide sandbox effect. Many of the 'old' URLs have been extensively rewritten during this time but are still buried, so it's not the content per se of the pages.
<added>Another oddity: after 2.5 months, doing a site:olddomain.com on the old domain, which has been gone (301'ed) for that whole time, reveals 3 very old pages. Two haven't been physically present on the site for about 10 months; they're from a section of the site I took down, but which is still linked to, I guess.
Jake, how come then a site registered three weeks ago is ranking, and it's not a DaveN Special? It totally missed the sandbox, penaltybox, or googleslag or whatever you want to call it. I like GOOGLESLAG hehehe, but don't know her personally ;)
check your sticky ;)
There is so much speculation about why it exists. Perhaps it's just that Google is broken in some way?
Let's all start calling it the GOOGLE FLAW. We'll try to make this the industry name for it. The press will eventually pick it up and Google will be forced to come clean. :)
(I'll also say that I am pretty sure lag time was created in part to help stabilize the serps pre-ipo. Why it continues to exist is more puzzling.)
If that's the case, then it's probably because, as someone said, there is (or may be) a lock-up period in which certain people cannot sell their shares.
I don't believe the sandbox is a specific thing. It seems to me to be the result of a number of different factors.
The way I see it is that for a competitive search term, a page (or site) has to achieve a certain score to be considered. How the score is achieved is the key: number of links, age of links, on-site factors and so on.
I also think the value of the age of a link depends on the age of the search term. By which I mean if a search term has been around since the dawn of google, then a new, say 1 month old, link for that search term has minimal value. Whereas a comparatively new term, say "Widgets 2004" is much easier to rank for.
If you see what I mean.
I just don't believe that all new sites are shovelled off to some holding area. There is too much evidence to the contrary. Having said that, something is causing new pages / sites difficulty in ranking for competitive search terms. Non-competitive is easy.
Why does it exist?
|Watcher of the Skies|
My two cents:
I think Google simply raised the bar in the algo on the number of a.) local AND b.) expert documents required to link to you. Without focusing on this specifically, people simply do not get into the initial ranking group where then the subsequent re-ranking pays more attention to content.
Here's a point I don't think a lot of people are getting. I had one 3-word term I ranked somewhere over 500 for. If you did any of the allin: searches I was number one. Then "magically", on the weekend of May 10th, I jumped to number 1.
I should have moved up slowly, not have jumped to the top from nowhere.
|I should have moved up slowly, not have jumped to the top from nowhere. |
Not if ranking is a two stage process.
1) Establish the top set of N sites.
2) Rank the top set of N sites.
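That two-stage idea can be sketched in a few lines. To be clear, this is purely an illustration of the poster's hypothesis: the scoring signals (`links`, `relevance`), the site data, and the cutoff are all invented, and none of this claims to be Google's actual algorithm.

```python
# Hypothetical sketch of two-stage ranking: stage 1 selects a candidate
# set of N sites by a coarse score, stage 2 re-ranks only that set with
# a finer-grained signal. All signals here are invented for illustration.

def coarse_score(site):
    # stage-1 signal: raw link count (an assumption, not Google's formula)
    return site["links"]

def fine_score(site):
    # stage-2 signal: on-page relevance (again, purely illustrative)
    return site["relevance"]

def rank(sites, n):
    # Stage 1: establish the top set of N sites.
    candidates = sorted(sites, key=coarse_score, reverse=True)[:n]
    # Stage 2: rank only within that set.
    return sorted(candidates, key=fine_score, reverse=True)

sites = [
    {"name": "old-site", "links": 900, "relevance": 0.4},
    {"name": "new-site", "links": 100, "relevance": 0.9},
]
# With n=1 the new site never reaches stage 2, no matter how relevant:
print([s["name"] for s in rank(sites, 1)])  # → ['old-site']
# Once the cutoff admits it, it can jump straight to the top:
print([s["name"] for s in rank(sites, 2)])  # → ['new-site', 'old-site']
```

This would also explain the ">500 to #1 overnight" report above: a site that finally crosses into the candidate set doesn't climb gradually, it is suddenly re-ranked by the stage-2 signals.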
"I don't believe the sandbox is a specific thing. It seems to me to be the result of a number of different factors"
Hmmmmm... you're right... or wrong. For money keywords, new pages: goodbye. Generic topic sites (the battle of widget..., pictures of the Radio Veronica ship in Maidstone...) can be #1 with the good old SEO.
When you do things the same way for a long time and experience the same results, i.e. create a site, get links, see PR go up, see your positions improve, you're dealing with a known entity.
In March of this year that known entity changed. This thing exists, and if you have found a way around it you are in the extreme minority and should count your blessings, and pat yourself on the back.
For the rest of the SEO world, the great majority, the change has appeared, and it is quite consistent. The reason this is such a good thread is because the man is posing the question: WHY?
Why will a new site be crawled, indexed, and appear in the results correctly for obscure terms, but not for the significant key words you designed the site for in the first place?
Is it accidental or purposeful? To me that’s where to start.
|Watcher of the Skies|
thank you, leveldisc, exactly what i was saying...please take note, graywolf
Well to me, it's an algorithm / threshold change that makes competitive terms more, err, competitive.
It behaves as if there is a threshold based on the competitiveness of the search term.
For example, I have a 6 month old site called
It ranks as follows
brandname - #1
brandname competitive term - #30 (+/- 10)
competitive term - #350 (+/- 50)
This hasn't really changed for 4-5 months. New links are added and older links age. I think once the (number of links x age of links x link quality x unknown factor) hits the threshold, I'm in.
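A toy version of that threshold idea, just to make the hypothesis concrete. Every number and the multiplicative formula are made up for illustration; the poster's "unknown factor" is the whole point of the speculation, so nothing here should be read as the real algorithm.

```python
# Toy model of the hypothesis above: a page ranks for a term only once
# (number of links x average link age x link quality) crosses a
# threshold tied to that term's competitiveness. Purely illustrative.

def link_score(num_links, avg_age_months, quality):
    return num_links * avg_age_months * quality

def can_compete(num_links, avg_age_months, quality, threshold):
    return link_score(num_links, avg_age_months, quality) >= threshold

# A 6-month-old site with modest links stays out for a competitive term...
print(can_compete(50, 6, 0.5, threshold=1000))  # → False
# ...but the same link profile clears the bar for a niche term.
print(can_compete(50, 6, 0.5, threshold=100))   # → True
```

Under a model like this there is no separate "holding area" at all: new sites simply haven't accumulated enough aged links to clear competitive thresholds, which matches the brandname / competitive-term split reported above.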
So, anyway, I think the start point is to forget about a sandbox as a concept.
It's just harder to rank these days.
|thank you, leveldisc, exactly what i was saying |
Sorry, didn't mean to plagiarise!
|I just don't believe that all new sites are shovelled off to some holding area. |
Nothing's being shoveled off anywhere. The "Google Lag" is a side effect of an algorithmic evaluation.
Remember, these sites are still in the index - they're just not ranking.
|The "Google Lag" is a side effect of an algorithmic evaluation |
That's what I'm saying. That's all it is. A side effect.
It's one hell of a side effect, that's for sure.
Right now, and for the past 7 months, it's been virtually impossible for someone to launch a new web site and then monetize any amount of significant organic traffic from it.
You just might be right because it's hard for me to believe that is the master plan at the Googleplex.
>>I just don't believe that all new sites are shovelled off to some holding area. There is too much evidence to the contrary.
i'd like to know what evidence you have.
all the symptoms you're experiencing can be explained by the "sandbox" index (for want of a better name for this new index). it behaves like the supplemental index, meaning:
- shows in the serps for non-competitive terms, i.e. a low number of query results.
- does not appear at all for competitive terms, i.e. if there are sufficient results from the main index, then both the supplemental and the "sandbox" index are ignored.
- pages will show for site: etc. queries in the serps, just like the supplemental index.
- no pr, no links. again, like the supplemental index.
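The "second index" behaviour described above amounts to a fallback rule. Here is a minimal sketch of that theory; the query data, the `min_results` cutoff, and the dict-based indices are all invented to illustrate it, not how Google actually serves results.

```python
# Sketch of the "sandbox index" theory: serve from the main index first,
# and only fall back to the supplemental/"sandbox" index when the main
# index has too few matches. Everything here is invented for illustration.

def serve(query, main_index, sandbox_index, min_results=10):
    results = main_index.get(query, [])
    if len(results) >= min_results:
        # competitive term: enough main-index results, sandbox ignored
        return results
    # obscure term: pad out with sandboxed pages
    return results + sandbox_index.get(query, [])

main = {"widgets": ["old%d.com" % i for i in range(20)],
        "obscure widget term": ["old1.com"]}
sandbox = {"widgets": ["newsite.com"],
           "obscure widget term": ["newsite.com"]}

# the new site only surfaces for the obscure term
print("newsite.com" in serve("widgets", main, sandbox))              # → False
print("newsite.com" in serve("obscure widget term", main, sandbox))  # → True
```

That single rule reproduces all four symptoms in the list above without needing any link-aging machinery.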
no need for complicated theories/conspiracies such as pr lag, sandboxed links, aging links, maturing links, etc.
i'm pretty sure google is working hard to eliminate the deficiencies of the main index that force them to keep separate indices to hide this deficiency.
the big question is how they choose which pages/domains migrate from the "sandbox" index to the main index whenever room is created due to pages/domains being dropped from the main index.
hopefully, this aggressive bot activity is an indication that they are now testing a new index!
|(for want of what to call this new index. it behaves like the supplemental index, |
No one seemed to have agreed about seeing the effects of TSPR (msg #18). Well, here's my explanation... I hope it makes sense ;)
Reference: TSPR paper by Taher H. Haveliwala, (who now works at G).
The ODP-biasing personalization vector is the cause of the sandbox. Also important to note is the query-time importance score, which means it is necessary to be placed in the correct category in DMOZ to turn up in the SERPS.
So PR is still important; it's just diluted by the personalization vector. One way to beat the sandbox would be to have high PR. The other sure-shot way is placement in every relevant category of DMOZ.
The launch of personalized search and site-flavored search also points to the inclusion of TSPR in the google algo. Also, if you note the colored balls in front of the personalized results, they are all sites which are listed in DMOZ. Newer sites do not show the colored balls even though they are listed in DMOZ.
The last complete update of the personalization vectors seems to have occurred in early May, when many people reported coming out of the sandbox.
Since then something has changed, and I believe its the calculation of the query time importance score.
Also, another misfit in this scheme of things is that old domains don't seem to require a DMOZ listing.
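For what it's worth, the query-time score from the Topic-Sensitive PageRank paper referenced above can be sketched as a weighted sum of per-topic PageRanks. Only that weighted-sum form comes from Haveliwala's paper; the topic names, probabilities, and PR values below are made up, and whether Google actually deployed this is exactly what's being speculated about.

```python
# Sketch of TSPR's query-time importance score: mix per-topic (ODP-biased)
# PageRanks, weighted by how likely the query belongs to each topic.
# A page with no standing in the relevant topic gets its score diluted,
# however high its PR elsewhere. Topics and numbers are invented.

def tspr_score(topic_probs, topic_pageranks):
    # topic_probs: P(topic | query); topic_pageranks: biased PR per topic
    return sum(topic_probs[t] * topic_pageranks.get(t, 0.0)
               for t in topic_probs)

query_topics = {"shopping": 0.8, "computers": 0.2}

old_site = {"shopping": 0.6, "computers": 0.3}  # listed in the right topic
new_site = {"computers": 0.9}                   # high PR, wrong topic

print(round(tspr_score(query_topics, old_site), 2))  # → 0.54
print(round(tspr_score(query_topics, new_site), 2))  # → 0.18
```

If something like this is in play, it would explain why raw PR alone stops being enough: a new site missing from the relevant ODP-derived topic basket scores near zero for that query no matter how many links it has.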
|which means it is necessary to be placed in the correct category in DMOZ to turn up in the SERPS. |
I don't dispute that TSPR may be in play, but I find it hard to believe that google has made all new sites dependent on DMOZ.
Although, it would be a nice way of getting people to manually check new sites (i.e. stop spam) - for nothing.
|Although, it would be a nice way of getting people to manually check new sites (i.e. stop spam) - for nothing. |
That brings up a good point. Has google perhaps thrown in the virtual towel with trying to fight spam via an algo change, and instead decided to use the free spam-fighting nature of DMOZ? It would seem to be an inexpensive (although far from perfect - as it can take literally years to get listed in DMOZ) way to add the human-review element to the algo...
Having said that, and having had our site hit in march with whatever mysterious dampening effect (sandbox, penalty box, or whatever we're calling it now), and having had a DMOZ listing for about 90 days now, I'm still waiting for whatever boost such a change would give our (commercial) site, if any.
I have a couple of sites that are currently affected by this "lag", sites that have been in DMOZ for a couple of months. So I don't think DMOZ is necessarily the magic pill you suggest.
|Remember, these sites are still in the index - they're just not ranking |
not for money terms anyway, I have new sites ranking #1 for one word search terms that are not money terms.
The sandbox is a PR flaw, nothing else - when PR is updated, loads of problems will get sorted - I think. G can do away with the visible PR bar, but never the concept of PR.
|The sandbox is a PR flaw, nothing else - when PR is updated, loads of problems will get sorted - I think. |
Explain this - how can you even speculate that PR is the cause?