Forum Moderators: open

Message Too Old, No Replies

Has the Sandbox been Abandoned?

         

phantombookman

8:54 am on Nov 23, 2004 (gmt 0)

10+ Year Member



Sorry to start a new thread but felt it may warrant it.

I have been posting in favour of the Sandbox's existence and I have 2 sites firmly stuck in the sand!

However...
2 weeks ago I registered a brand new domain and started to build a new site. I knew it would be at least 6 months before anything happened but..

This morning it entered the index for the first time - straight onto page one for a one-word search (a town name; granted, only 194,000 matches). Nonetheless, the last 2 sites still cannot achieve similar results after 6 months.

Also, the preliminary early pages are ranking very well.
The site has only one incoming link, no AdSense, banners or anything, vanilla HTML etc.

Built as per my last 2 sites so clearly something has changed!
Regards and hope to all
Rod

lizardx

1:44 am on Dec 16, 2004 (gmt 0)

10+ Year Member



<<< It is NOT total exclusion in my experience so those that keep saying Google has no room left! ..... that throws their theory out of the window >>>

Google has plenty of room; they can just keep adding those 2^32-entry indexes one after another, as it looks like they've been doing. The trick is getting into the first one. For the math-challenged here, of which there appear to be a few too many, the current index count is roughly 2 * 2^32. It doubled overnight. They didn't find those pages overnight; the old junk in that index proves that. Do they teach math in high school anymore?
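A quick back-of-the-envelope check of that arithmetic, purely as an illustration of the 32-bit doc-ID theory being described here (the 8 billion figure is only the approximate page count Google was advertising at the time; nothing below reflects Google's actual internals):

```python
# Sketch only: assumes document IDs are 32-bit, as the theory above suggests.
SINGLE_INDEX_CAPACITY = 2 ** 32          # documents addressable with a 32-bit ID
ANNOUNCED_PAGE_COUNT = 8_000_000_000     # rough page count Google was advertising

indexes_needed = -(-ANNOUNCED_PAGE_COUNT // SINGLE_INDEX_CAPACITY)  # ceiling division

print(f"One 32-bit index holds about {SINGLE_INDEX_CAPACITY:,} documents")
print(f"Roughly {ANNOUNCED_PAGE_COUNT:,} announced pages would need {indexes_needed} such indexes")
```

Which is exactly the "2 * 2^32 roughly" figure: one full 32-bit index plus a second one to hold the overflow.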

That's what's so absurd about saying there is no sandbox. Of course there is no sandbox; it's a word used to describe the effects of what they are doing. It's pretty accurate, so it caught on. The fact that some sites can escape being downgraded in no way proves that this particular algo tweak doesn't exist; it proves that the thing can be bypassed in certain circumstances.

But because they clearly are having difficulty organizing the world's information nowadays, maybe it's time to change the company slogan to something like 'maximizing our IPO stock sales as the law allows us to begin selling off our ten times overvalued stocks before the market wakes up and stops valuing that junk the way it is'. Or, what it looks like when money takes over engineering as a primary force.

Or maybe they can say this: well, we liked the old results and sites we had indexed so much we decided to just stick with those and not let any new stuff in anymore, after all, everyone loves us, we have a good company name. Hmm. Lots of possibilities.

Re the overoptimization penalty:
Try it, I did, deliberately. The page dropped out of the SERPs, PR went to 0 from 4 or 5. I de-optimized, and after a while the PR returned and the SERPs returned. Over-optimize at your own risk. Only that page on the site experienced PR 0; nothing else changed. That's about as scientific as you can get with a sample of one, but it's good enough for me.

<<< It has also been repeated over and over again that this so called sandbox is not absolute. It does not, and never has, applied to all new sites. It does apply to a far greater percentage of new sites than ever, which all by itself should be very informative. IMO, a variation of it applies to older sites too...or maybe not. Hmmmm...what if it's the same rules? ;-)>>>

I've suspected the same set of rules, with the same internal requirement driving them, for quite a while. Only this requirement is not an improvement; it's a hack, and a way to boost income and maintain that income until at least the grace period passes and they can start selling their shares. This process started roughly last November from what I can see, and it increased in severity the closer to the IPO they got. Profits for the last quarter relevant to the IPO were record high. If this is confusing to you, maybe it's time to take a few business classes. Thus one set of rules was used both to deal with what probably was, and may still be, an internal mess-up, overload, etc., and to maximize income for this period. Both aims were reasonably successful. The reason people here are annoyed is that, despite all claims otherwise, this is pretty obviously not being done to improve the engineering of their product, so of course people get annoyed at how bad the product is getting. Pretty natural. If not here, where should this type of issue be discussed? None of my friends have any interest in the question, except when they notice Google SERPs degenerating randomly, of course.

caveman

2:00 am on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>it's good enough for me.

It was good enough for me shortly after Florida, when we did the same thing...though the details varied a bit.

And it's still good enough for me. Problem now is that it's a hecka lot easier to set the darned thing off.

A hecka lot easier.

And even worse, when you do set it off, you don't get to just pop right back in after you fix it. Now that really isn't playing fair. How are we supposed to figure out the darned algo when they pull stuff like that?

:-)

eyezshine

4:34 am on Dec 16, 2004 (gmt 0)

10+ Year Member



The problem I had in the beginning was that I didn't know about this new "sandbox" Google had when one of my sites was sandboxed. I just thought I needed to SEO the site better, so I SEO'd the site like you wouldn't believe.

So much so that 6 months later, when it suddenly came out of the sandbox, the traffic was so massive it crashed my server over and over again and my host shut the site down.

Then the site got instantly banned because it was shut down. Stupid Google! From one extreme to the other!

eyezshine

4:55 am on Dec 16, 2004 (gmt 0)

10+ Year Member



Did anyone notice that Google's allinurl: command isn't working right anymore? Did they change something?

irishaff

5:10 am on Dec 16, 2004 (gmt 0)

10+ Year Member



replied by mistake to old post..deleted.

eyezshine

5:19 am on Dec 16, 2004 (gmt 0)

10+ Year Member



Also, I don't see "Supplemental Results" anymore. Is Google updating? Or are they fixing their results so we don't see what they are up to anymore? Out of sight, out of mind?

Spine

5:36 am on Dec 16, 2004 (gmt 0)

10+ Year Member



I see supplementals all over the place still.

[edited by: Spine at 5:52 am (utc) on Dec. 16, 2004]

caveman

5:48 am on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



...world gone mad...

DerekH

7:23 am on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Also, I don't see "Supplemental Results" anymore.

Two sites of mine which were URL only for the last 8 weeks have reappeared in the last 12 hours, and every page is Supplemental.
Truly bizarre!
DerekH

BeeDeeDubbleU

7:51 am on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



However, it is déjà vu - just change the names and the dates.
[websitepublisher.net...]
[pcworld.com...]

Minnapple that is quite spooky :)

(Everyone should check these links BTW)

BroadProspect

9:36 am on Dec 16, 2004 (gmt 0)

10+ Year Member



Simply AMAZING ARTICLES. These should be emailed to EVERYONE at Google and posted on the walls! Learn from history!
/BP

BeeDeeDubbleU

11:48 am on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I wonder if GGuy would be prepared to comment on these :)

energylevel

12:31 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



lizardx ... if your comments have weight, I can only surmise that Google can't keep this up for much longer, because the Yahoo competition and the pending MSN Search are going to increase the pressure on Google from two fronts:

a) Disgruntled people who feel they've been treated harshly by the sandbox filter may concentrate more on optimising for Yahoo and MSN Search. In turn, this new focus could bring a move in the money spent on AdWords over to Overture, and it'll only be a matter of time before MSN Search ditches Overture for their own model (if anyone has heard any rumours I'd be interested to hear what people think may happen in the future).

b) Keeping good fresh content out of your index can detract from the quality of the search results immensely. You can only keep doing this for so long if the competition is getting better all the time... searchers could start moving en masse to other search options.

For my own personal moan, I've seen a few of my clients who are established companies have their websites stuck in the sandbox. Whilst the sites were new, the companies are well established and were just slow moving into the idea of a website and an internet-based side to their business. These would be valuable additions to the Google SERPs but can NOT get any rank of note, whilst I have to endure the biggest load of crap shopping sites (in my opinion) and the like all the time in search results!

Take all the technical chat away for a minute and let's just talk common sense and logic ..... surely the guys at Google can see the things we are seeing, and many have commented on it here! The money men have the power now, I guess!

MHes

1:21 pm on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Webfusion -"So, you're telling me if a highly respected researcher published a paper online that revolutionized "hydrogen fuel cells", and thousands of highly respected eductional sites linked to it of their own accord, it would not be worth ranking?"

I bet it would rank very well, new site or not. People say here that their site has good links in, the reality is that they forced the links and google can spot that.

Sandbox is a symptom, not a cause. The cure is credibility, which you can fake or naturally acquire. Eitherway, its credibility percieved by google's rules.

The AV comparison is misguided IMHO. The big factor was their page design/speed and cluttered image. The google model was far more appealing and still is.

petehall

1:30 pm on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



lizardx ... if your comments have weight, I can only surmise that Google can't keep this up for much longer, because the Yahoo competition and the pending MSN Search are going to increase the pressure on Google from two fronts

Have you looked at Yahoo! SERPs recently?

The recent update is yielding very good results indeed... I for one am very impressed (after months of annoying 301 redirect issues).

There is no sandbox there - they are picking up changes and ranking quite quickly.

MSN Beta is incredibly fast...

energylevel

1:50 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



was forced? define please .. do you mean paid for?

BroadProspect

3:39 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



I have been using Google for the last few years most of the time, but yesterday I was looking to buy a new CD title. I figured before starting the search that it was new content, so I simply went to Yahoo and looked for it (and found it).
I DID NOT EVEN TRY LOOKING IN GOOGLE! And I have the Google toolbar installed, not the Yahoo one.

It took me a week to mentally switch from AltaVista to Google. If there is a newer, fresher (content-wise) solution out there, people will use IT and stop visiting Google's site.

Is it JUST me?
/BP

Powdork

3:47 pm on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Sandbox is a symptom, not a cause. The cure is credibility, which you can fake or naturally acquire. Either way, it's credibility as perceived by Google's rules.

I have been considering this for the last week. We have long said that the best sites don't rely on Google for their traffic, or that they would still be able to stand on their own financially without Google traffic. Much was made of this during Florida. Sort of a 'don't put all your eggs in one basket' thing. Is it your opinion that Google monitors traffic to websites, and that traffic must reach a certain level before a site will be ranked according to its content relative to the competition for a search term or phrase?
If so, how would this be measured?
Via the toolbar? Alexa ranking? Would getting visitors to any part of the site help pull the entire domain out?

BeeDeeDubbleU

5:53 pm on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Sandbox is a symptom, not a cause.

What's in a name? The sandbox is the sandbox. I don't think that I am alone in not being concerned with semantics. These only cloud the issue, which is that there is something going on in Google (no, it has not been abandoned) that prevents the vast majority of new sites from being featured in the results.

Here's a definition if you want it. When you look at this, it puts it in perspective.

Sandbox: A name that has become associated with a particular function of the Google algorithm that prevents the vast majority of new sites from ranking highly in the SERPs for an indefinite period.

Consider this hypothetical situation. Let's say that it's December last year and I start a thread suggesting that in the new year Google will attempt to block spam by preventing ALL new sites from featuring in the results. Furthermore I suggest that the media will not consider this worthy of comment.

I think most of you would have dismissed me as a nutcase ;)

caveman

6:20 pm on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Some things never change. ;-)

renee

6:31 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



>>I think most of you would have dismissed me as a nutcase

I still think you're a nutcase ;) Sorry, I couldn't resist. The sandbox is strictly a main-index capacity issue. The sandbox:

- is not a function of Google's algorithm;
- is not an attempt by Google to fight spam;
- is not limited to new pages of new sites anymore, after Google announced 8B pages!
- is not an indefinite period! As Google removes pages/sites from the main index, it creates space for sandboxed pages/sites to move in.

The sandbox is nothing more than a secondary index separate from the main index. Whenever the number of results falls below a certain threshold, Google then does another query that includes the secondary (sandbox and supplemental) indices.

The sandbox will go away only after Google has solved its index/algorithm capacity issue.

phantombookman

6:57 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



Renee
without wishing to be controversial, almost everything in your post runs counter to my experience.

On the overall reason why there is a sandbox, I do believe the purpose is to deal with spam, and it does so in 2 ways:
it dissuades spammers from building sites with a short life and bombing Google, and
also buys them some time as they try to improve their algo to deal with spammy sites.

I still do not understand the technical aspects of the 'lack of capacity' argument. If they index the site and all the pages and also include them in the SERPs, then how does where they rank affect capacity?

steveb

7:56 pm on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The more credibility, the more sandboxed. Zero-credibility sites break out sometimes. Sites with just high-quality algo ingredients are doomed. These days you can either build a new authoritative site and get sandboxed, or build a piece of piffle and get sandboxed 95% of the time. I have to believe that breaking the sandbox is not worth hurting your credibility (unless you have a domain with a short shelf life).

renee

8:26 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



>>dissuades spammers from building sites with a short life and bombing Google

Unfortunately the sandbox applies to all sites, spammy or not, short life or long life, Google bomber or not. Would Google also, by choice, dissuade "good" sites from being built? How about all the spammy non-sandboxed sites? Why would Google let them exist at the expense of all the fresh, non-spammy websites?

>>also buys them some time as they try to improve their algo to deal with spammy sites.

I agree that the sandbox allows Google to buy time until they solve their capacity issue, not to "try to improve their algo to deal with spammy sites." Obviously solving the capacity issue is a lot more difficult than solving the "spam" issue.

>>I still do not understand the technical aspects of the 'lack of capacity' argument. If they index the site and all the pages and also include them in the SERPs, then how does where they rank affect capacity?

This is very simple. Google is unable (for whatever reason - 32-bit limits, algorithmic, matrix size, etc.) to add any more sites/pages to its main index. This first became evident when Google added a separate index and called it supplemental. The reason (marketing) given by Google was that it would enhance the capability to include results for "weird and obscure" queries! That is no reason to create a separate index if Google could accommodate all the data in one index, space-wise or algorithmically. As Google explained, a query is first performed using the main index, and if the number of results falls below a certain threshold, Google also performs a search of the secondary index and merges it with the main-index query. The same is true of the sandbox. If your site/pages are not in the main index (but are indexed!) then they are either supplemental or sandboxed. You are only able to rank for competitive terms if you are in the main index, since Google will never access the secondary index to augment the main-index SERPs. We know this to be true since this is exactly how the supplementals work.
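A minimal sketch of the two-index behaviour described above, purely to illustrate the theory (the Index class, the threshold value and the example pages are all invented for the demonstration; none of this is documented Google behaviour):

```python
# Hypothetical model: query the main index first, and only consult the
# secondary (sandbox/supplemental) index when too few results come back.
RESULT_THRESHOLD = 5  # invented cutoff, kept tiny for the demo


class Index:
    def __init__(self, docs):
        self.docs = docs  # {url: page text}

    def search(self, term):
        return [url for url, text in self.docs.items() if term in text]


def query(term, main_index, secondary_index):
    results = main_index.search(term)
    if len(results) < RESULT_THRESHOLD:
        # Only sparse/obscure queries fall through to here; competitive terms
        # never do, so pages stuck in the secondary index stay invisible for them.
        results += secondary_index.search(term)
    return results


main = Index({"old-site.example/page": "widgets for sale"})
secondary = Index({"new-site.example/page": "widgets for sale in widgetville"})

print(query("widgets", main, secondary))      # sparse query: both indexes consulted
print(query("widgetville", main, secondary))  # only the secondary page matches
```

On this model, being "sandboxed" is simply being stored in the second index, which would be consistent with how the supplemental results appear to behave.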

energylevel

8:42 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



You got me .. I thought these sites were labelled supplemental in the search results; all the sites I've seen 'sandboxed' weren't labelled supplemental?

eyezshine

9:31 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



That's because there are not 2 but 3 different databases.

One is for the main results;

the second is for the sandboxed/new sites/low-PR sites;

and the third is for the supplemental results, which are pages that were there when Google indexed them, but the spider got a 404 or a 301/302 when it recrawled them, or no other pages link to those pages anymore.

Google uses the supplemental results database so that if your site was down during their last crawl, people will still be able to find your pages. But all your pages go into the supplemental index until the next crawl.

To Google, this solves a lot of the problems they had before, where if your site was down when they spidered it, you simply got dropped from the index. Now the pages get a second chance at life. If at the next crawl the pages are still down or giving errors, they are dropped from the supplemental index completely. Otherwise they are brought back into the secondary or main index.

Somehow they think this solves their capacity problem, by creating extra databases to store sites based on PR. In reality they could create as many of these databases as they want, which makes sense, but if you're not in the main database you're not going to rank well.

I think we are in the beginning stages of this secondary-database thing, and I can see Google tweaking things around.

Hopefully one day Google will search both databases at the same time, rather than searching one and then the other only if there are no results from the main database. Combining the results from both databases and ranking accordingly would make it as if there were not 2 different databases, and our sites would get better rankings and traffic.

Right now it's black or white. Either you are in the main database and you get tons of traffic, or you are in the secondary/sandboxed database and get a small trickle of traffic from very obscure keywords. Or your site was down for some reason when Google crawled it and you are in the supplemental database, getting an even smaller trickle of traffic.
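A rough sketch of the routing just described, purely to illustrate the three-database theory (the status codes, the PR cutoff and the function are invented for the example; this is the poster's speculation, not documented Google behaviour):

```python
# Hypothetical routing of a crawled page into one of the three stores
# described above. All thresholds here are made up for illustration.
def assign_database(crawl_status, pagerank, site_is_new):
    if crawl_status in (301, 302, 404):       # fetch redirected or failed
        return "supplemental"                  # page kept so the URL stays findable
    if site_is_new or pagerank < 4:            # invented cutoff for "low PR"
        return "secondary (sandbox)"
    return "main"


print(assign_database(200, pagerank=6, site_is_new=False))  # -> main
print(assign_database(200, pagerank=2, site_is_new=True))   # -> secondary (sandbox)
print(assign_database(404, pagerank=6, site_is_new=False))  # -> supplemental
```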

renee

10:43 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



>>You got me .. I thought these sites were labelled supplemental in the search results; all the sites I've seen 'sandboxed' weren't labelled supplemental?

Please see eyezshine's very clear explanation.

Good job, eyezshine - not very many people understand or accept what is going on!

energylevel

10:58 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



renee ... why don't you kiss my backside and find somewhere else to be condescending .. who are you, what do you know and what makes you think you can make such impertinent remarks about others in these forums ....

eyezshine

11:06 pm on Dec 16, 2004 (gmt 0)

10+ Year Member



I'm sure he didn't mean it the way it sounded. We're all just trying to figure this thing out and everyone has their theories.

You have to admit that the multiple-database theory answers a lot of questions and makes sense. It is also hard to disprove.

MHes

11:51 pm on Dec 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't believe the capacity theory. I suspect Google has a team of people who look at sites that are in the top 100,000 Alexa rankings, or that have perhaps triggered a 'manual inspect' filter. Let's say 20 people are employed, and each one looks at a site per minute for 6 hours per day... that's 360 sites per day each. Average site size (big sites may trigger the filter) is 1,000 pages. Therefore, the 20 people together could potentially remove 7.2 million pages per day.

After a few weeks that would have an effect, wouldn't it?
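A quick sanity check of that arithmetic (the staffing numbers are the hypotheticals from the post above, not known facts about Google):

```python
# Back-of-the-envelope check of the manual-review theory's numbers.
reviewers = 20
sites_per_minute = 1
hours_per_day = 6
avg_pages_per_site = 1000

sites_per_reviewer_per_day = sites_per_minute * 60 * hours_per_day        # 360
pages_per_day = reviewers * sites_per_reviewer_per_day * avg_pages_per_site

print(f"{pages_per_day:,} pages per day")  # 7,200,000 - the 7.2 million above
```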
