Why does the 'Google Lag' exist?

Trying to understand its purpose.

         

bakedjake

1:43 am on Sep 29, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I had some in-depth discussion this weekend with some friends about the sandbox. Every theory on how to beat it kept coming back to one central problem - no one is sure why it exists.

I feel very strongly that until we have a good grasp on why it exists, it will be very hard to beat.

I don't buy the explanation that it's intended to be a method of stopping spam. Why? One, it is doing too much collateral damage. Two, if you accept the 80/20 principle (20% of spammers are doing 80% of the spamming), and you realize that there are already multiple ways of beating the sandbox that all of those spammers are aware of, it doesn't make sense anymore.

So, why does the sandbox exist?

The most obvious effect of the sandbox is that it prevents new domains (not pages) from ranking for any relatively competitive term. So, start thinking like a search engine - what would be the benefit of this?

Rick_M

4:05 am on Oct 6, 2004 (gmt 0)

10+ Year Member



The re5earcher post was edited by the original author - it appears he originally claimed it was from "inside sources", since the first reply asked who the inside sources were.

I also think it was interesting that when people were asking for Googleguy to respond, Brett mentioned that the information was "mission critical" or something along those lines - insinuating there may actually be truth to it.

I may be naive, but I don't believe Googleguy would intentionally lie if the whole thing were true (I honestly believe the founders want to stick to the "do no evil" mantra). While I'm not a programmer, I don't see what the big deal is that Google has to adapt as the web grows and technology evolves. In other words, if it were true, Googleguy could have remained silent, or could have even confirmed that there is truth that the next algorithms will be dealing with limitations of a 32 bit system. Would it really have somehow hurt Google for people to know they had a capacity issue?

Interestingly, in this re5earcher's other post about helping Google develop a better algorithm, Googleguy responded to the person's suggestion by pointing out how people could spam it. One point (referring to re5earcher's proposed algo) involved people buying up new domains as a way to deal with old domains being penalized.

I personally believe that the Florida update would have encouraged people to buy new domains as a way to spam the algo - and Google put a stop to it with the sandbox. I can't think of how many times I've seen people comment, when talking about penalized domains, that the solution is "get a new domain and start over" - well, the sandbox would put a stop to that - and I think that is why Google is doing it.

rfgdxm1

4:10 am on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Originally he stated that the info was from an inside source and then later edited it to "this is just a guess"

Ahh...this explains why the person in msg #2 wrote: "Who are internal sources?" re5earcher realized he posted something he shouldn't have. However, note that re5earcher obviously doubted the believability of that source. Why else the subject of "I think google reached its ID capacity limit?" Surely no big shot at Google would leak something like this. It would have to be someone much lower. Perhaps sufficiently low down that maybe they didn't know the full truth.
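
As a rough aside on the capacity theory: if document IDs really were stored as unsigned 32-bit integers (an assumption taken from this thread, not anything Google has confirmed), the ceiling would be simple arithmetic. A minimal Python sketch, with hypothetical helper names `id_capacity` and `has_room`:

```python
# Illustrative arithmetic only. The 32-bit document ID is this thread's
# speculation; nothing here describes Google's actual systems.
ID_BITS = 32  # assumed width of a hypothetical document ID


def id_capacity(bits: int) -> int:
    """Number of distinct values an unsigned integer of `bits` bits can hold."""
    return 2 ** bits


def has_room(current_docs: int, bits: int = ID_BITS) -> bool:
    """True if one more document could still be given a unique ID."""
    return current_docs < id_capacity(bits)


print(id_capacity(ID_BITS))  # 4294967296
```

Under that assumption an index would top out at roughly 4.29 billion documents, and widening the ID by even one bit would double the ceiling.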

rfgdxm1

4:18 am on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>Would it really have somehow hurt Google for people to know they had a capacity issue?

Yep. Admitting problems isn't the way to go with an IPO on the horizon.

>I personally believe that the Florida update would have encouraged people to buy new domains as a way to spam the algo - and Google put a stop to it with the sandbox. I can't think of how many times I've seen people comment, when talking about penalized domains, that the solution is "get a new domain and start over" - well, the sandbox would put a stop to that - and I think that is why Google is doing it.

And that would be consistent with GoogleGuy denying they had a data indexing capacity problem. If GG knew the big bosses had approved the sandbox already, then he'd also know that current data indexing capacity would be more than adequate for Google's needs until such time as they could upgrade their systems.

steveb

4:38 am on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Suppose you had a 1200 page website, but for some technological reason only 1000 pages could appear on the Internet at one time. What would you do? Not publish the newest 200 pages, or not publish what you consider the 200 worst/weakest pages? If Google has a capacity problem, why not just remove all PR0 pages from the index?

This whole supplemental pages phenomenon is weird, the lag time phenomenon is weird, the backlinks and toolbar PR choices are weird. Honestly now, completely leaving aside the quality of the ranked serps, isn't just about everything Google is up to these days just plain weird? Even if you would acknowledge a capacity problem, there would seem to be a far better arbitrary way to deal with it than what they are doing.

Scarecrow

5:08 am on Oct 6, 2004 (gmt 0)

10+ Year Member



isn't just about everything Google is up to these days just plain weird? Even if you would acknowledge a capacity problem, there would seem to be a far better arbitrary way to deal with it than what they are doing.

Capacity problem + management problem + shifting priorities (from algorithms to ad revenue) = weirdness in the main index

gomer

5:14 am on Oct 6, 2004 (gmt 0)

10+ Year Member



I think the whole thing about re5earcher is very interesting, great find SlyOldDog. The information from re5earcher seems to carry weight or at least ruffle feathers. The fact that GoogleGuy even reviews the merits or problems of "Re5earcherRank" in detail is telling. There is lots of stuff that is bandied about here at WW that GoogleGuy simply ignores.

I would like to share some observations related to the sandbox effect. I am hoping others will share similar observations - perhaps we can find out more about the sandbox by sharing observations.

From my experience, there has only been one time when sandboxed sites were allowed out of the sandbox. This occurred about May or June of this year. Below is my experience on this and why I feel this way.

I helped put out a website in about February or March of this year. This site was indexed and displayed classic sandbox behaviour. It was in the index (for example, 'site:' showed all its pages), but the site could not rank for anything it should easily have ranked for. For example, a search for the site name put the site on the second page of results - after about 15 other sites which just linked to the sandboxed site.

The site was a small website built by hand, promoting the services of a professional in a local market. The site was clean; it had only on-topic links in addition to a DMOZ link. The links were gained quickly but quite naturally.

At the time, the sandbox phenomenon was quite new. When I read here at WW that many others were seeing the same thing happening to their sites, I was at ease. I realized that this was not my doing but something larger at play.

Then in about May or June (not exactly sure when), I remember reading a thread here at WW that said sites in the sandbox were allowed in. Sure enough, when I checked the site I helped put out, all was fine. The site had good rankings, basically top 10s for the search terms it was optimized for.

As I remember, there was no one saying that their sites were not allowed in at that time so I assumed that all sandbox sites were allowed in.

Do others agree that there was generally only one period in which sites were allowed in from the sandbox? If so, were all sites allowed in or just a few? Have there been other times when sites were allowed in?

If all sites were allowed in at intervals, this would tend to indicate to me that this is more of a capacity issue than it is a spam fighting issue.

BeeDeeDubbleU

7:46 am on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I may be naive, but I don't believe Googleguy would intentionally lie

"I did not have sex with that woman."

There is lots of stuff that is bandied about here at WW that GoogleGuy simply ignores.

Yes - everything lately. Didn't his absence also coincide with the appearance of the sandbox?

jaina2

7:48 am on Oct 6, 2004 (gmt 0)

10+ Year Member



Then in about May or June

I second that. It was by the second week of May that many sites came out of the sandbox. I haven't seen any site come out completely since then.
Also, I have not seen any site come out of it without a DMOZ link.

leveldisc

8:23 am on Oct 6, 2004 (gmt 0)

10+ Year Member



A few people seem to be able to beat the lag. I would love to see an example of that, as I've yet to see it happen (well, since May anyway).

steveb

8:38 am on Oct 6, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Didn't his absence also coincide with the appearance of the sandbox?"

No. You are off by about five months.

This 354-message thread spans 36 pages.