Forum Moderators: Robert Charlton & goodroi

Message Too Old, No Replies

Nailing down the "sandbox"

How deep is the sand? Who has to play there?

         

suidas

10:51 pm on Jan 17, 2005 (gmt 0)

10+ Year Member



I've seen a lot of messages about the sandbox, but none of them are clear about how major the effect is. Recently someone responded to a why-isn't-my-site-number-one request with:

If your site is less than a year old you are likely sandboxed.

I can't believe most sites under a year's age are in some sort of penalty box. Google would be useless. So, I want to know:

1. Are all sites sandboxed, or do certain traits (like affiliate links, low content) trigger it?
2. How long does it last?
3. How variable is the duration?
4. How do you know your site is being sandboxed?
5. Does the effect taper off or is it a binary thing?
6. What gets you out of the sandbox? Is it merely time or do good links or whatever speed it up?

Thanks.

nzmatt

4:10 am on Jan 30, 2005 (gmt 0)

10+ Year Member



After years of learning and hard work, I’ve finally built a clean, well-organized site, and Google put the squash on it. It is absolutely sickening, and I am already recommending that friends and family use Y and MSN, because if my site is out, so are many other good ones. I used to feel warm and fuzzy when thinking of Google, but it now feels more like nausea.

Meanwhile we have spent over £30,000 so far on AdWords. Due to the lack of support we are now investing in marketing with MSN and Yahoo instead. We won't be spending a single penny with Google until we start seeing something in return.

I think these two quotes need repeating. They reflect how the majority of webmasters who have published a site in the last 9 months feel.

I for one also wrote to Google and stopped my significant AdWords budget a few weeks ago. We are doing just fine with Yahoo and MSN – probably better, as the traffic they forward seems better targeted, from more serious users. This is probably because the traffic doesn't come from the spammy click-fraud AdSense sites sitting high in the SERPs where my site should be. Now I realize I was just suckered into using Google AdWords because of the company's good name, and forced into it because of the sandbox. This is no longer the case.

Google has lost its good name, my business and its awe!

Europeforvisitors, I think Google is the entity 'cutting off its nose'! If you had a new site I'm sure you would understand.

pr0purgatory

5:11 am on Jan 30, 2005 (gmt 0)

10+ Year Member



Well, how's this for a laugh? My site ranked at #28 for "popular search phrase -asdf x 13"

I re-optimised my home page and now I rank at #10 for the same key phrase -asdf x 13!

Although my site is sandboxed, Google religiously updates its cache of my homepage every 48 hours.

I'm hoping my site will also rank better in MSN and Yahoo now as well...

Hey, at least I'm finding a use for the sandbox :-)
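For what it's worth, the query trick described above is mechanical enough to sketch. A minimal illustration (the helper function is hypothetical; "asdf" is just a token assumed absent from the index, so excluding it should not change which pages match):

```python
# Build an "-asdf x 13" comparison query: the original phrase plus 13
# negated nonsense terms. Because the nonsense token matches nothing,
# excluding it should not change the result set -- so any ranking shift
# is presumably a filter being bypassed, not a different query.
def exclusion_query(phrase, token="asdf", count=13):
    exclusions = " ".join("-" + token for _ in range(count))
    return phrase + " " + exclusions

print(exclusion_query("popular search phrase"))
# popular search phrase -asdf -asdf ... (13 negations in total)
```

Why 13 negations rather than one was never established in the thread; the number is simply what posters reported working.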

akjohnny

5:32 am on Jan 30, 2005 (gmt 0)



RitchTC: My site is heavily crawled nearly every day, and I think the majority is by Google. However, my web stats do not differentiate between robots, so I'm not sure (any suggestions for a good log-analysis tool that does?). G is indexing freshly updated information on my site.

However, I see a large number of 404 errors during the times I suspect Googlebot is crawling (aimlessly wandering?). This may be another mistake that led to my site's demise: following Google's recommendation that "removed pages will naturally fall from the index" (roughly quoted), I did not redirect moved or deleted pages after a major redesign.

I would also like to take this opportunity to revisit the very good questions that spawned this thread, because I hope that someone will benefit from the experience of others. I am limiting my responses to questions for which I can share some experience:

Question #1: Are all sites sandboxed, or do certain traits (like affiliate links, low content) trigger it?
Response #1: See my previous post + textual content on my site is low, because it is largely image-oriented.

Question #2: How long does it last?
Response #2: 10 months to eternity?

Question #3: How variable is the duration?
Response #3: Consistent. Has anyone heard of a site climbing from the litter box?

Question #5: Does the effect taper off or is it a binary thing?
Response #5: In my case, it appears to taper to some insignificant degree. I was below 1000 in the SERPs for many months and have climbed to between 70 and 500 for minor key phrases. However, for anything that is truly relevant, my site bounces somewhere between 500 and oblivion, but never better than 500. During the past 1.5 months, results for relevant phrases have varied almost weekly. Irrelevant and worthless phrases have ranked the same for about 6 months (I consider this bad news!).

Question #6: What gets you out of the sandbox? Is it merely time or do good links or whatever speed it up?
Response #6: My opinion (going out on a weak limb here): Time (anyone truly invested in a domain will wait – spammers will not); slow, quality linking; no radical changes; no tricks; and hopefully, Google realizing this is a very stupid mistake.

This silly sandbox idea, or whatever it is, appears to be a classic example of blind prejudice.

akjohnny

6:14 am on Jan 30, 2005 (gmt 0)



ONE MORE THING...

I mentioned 404 errors in the post above. I've seen about 450 errors for the month of January. What if the problem is related to Googlebot's aimless wanderings (improbable hope)? All 404 errors currently go to a custom error page. What would happen if all 404s were redirected to my home URL instead? I'm guessing either another penalty or, considering my 404 page contains links leading to other pages, no difference.

[edited by: akjohnny at 6:43 am (utc) on Jan. 30, 2005]
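Counting 404s by crawler, as described above, is straightforward from a raw access log. A rough sketch, assuming the common Apache combined log format (the regex and classification are illustrative, not any official tooling):

```python
import re
from collections import Counter

# Matches the request, status, and user-agent fields of an Apache
# combined-format log line, e.g.:
# 1.2.3.4 - - [30/Jan/2005:06:14:00 +0000] "GET /old.html HTTP/1.0" 404 512 "-" "Googlebot/2.1 (...)"
LINE = re.compile(r'"[A-Z]+ (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"')

def count_404s(lines):
    """Tally 404 responses by (crawler, path) so bot-driven errors stand out."""
    hits = Counter()
    for line in lines:
        m = LINE.search(line)
        if m and m.group("status") == "404":
            who = "Googlebot" if "Googlebot" in m.group("agent") else "other"
            hits[(who, m.group("path"))] += 1
    return hits
```

Paths with repeated bot 404s are the ones worth 301-redirecting to their new locations after a redesign, rather than letting them fall out of the index.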

thrguy185

6:32 am on Jan 30, 2005 (gmt 0)



My first post, and in a sad way it's a relief to read this thread. Our site is e-commerce; we followed all the rules, great content designed for users, etc. Nowhere to be found on g_ogle - but we do well on Yahoo and MSN for most keywords.

After just now reading these posts, I tried the -asdf x 13 trick for one of our keywords: 10.5 MILLION PAGE RESULTS and we are #2! What is so frustrating is that we wholesale our products to other high-quality e-commerce sites and ALL of them show up in the top 20 for OUR products and OUR brand name - except for us. And oh yeah - we spend a fortune on AdWords.

I guess it's a relief to commiserate with others here - I had thought our basic SEO was flawed - now I see it clearly is not. However, I have no clue as to why we are in the sandbox (the site went live in 2003).

suidas

6:34 am on Jan 30, 2005 (gmt 0)

10+ Year Member



04/04 – Acquired one reciprocal backlink.

With one reciprocal backlink, you are surprised to be floundering? (Pun unintended.) I would think a fishing charter business ought to be able to cobble together dozens of links, reciprocal and not: bait shops, fishing rod shops, places that will cure fish or mount them for display, local business councils and tourism information, ichthyological buffs, other fishing charter businesses not in direct competition, friends, family, customers, vendors. Link building will always be key in Google, and to virtually ignore it is sure death in the SERPs.

akjohnny

6:54 am on Jan 30, 2005 (gmt 0)



suidas: I did not have high expectations for the fishing site prior to acquiring incoming links, and it's up to my friend, the site owner, to seek these links. The fact is, it's first-page in Y and MSN, and "floundering" in Google.

Additionally, I included the site history hoping it may be useful to others.

BTW, my photo site was page 1 when it had only a few backlinks. It seems the more I work, the deeper I dig into oblivion.

What may be of further interest: I used to have a freebie site hosted by a phone company that had zilch for incoming links. Although I did not pay as much attention back then, it ranked okay (30-100) out of the gate. It eventually received just a couple of natural links and consistently did better in the SERPs than the two sites I'm messing with now, even with lower PR. I dumped the site last year - should have kept it. I doubt natural links were as big a deal then, but I'll bet natural linking is a key factor now.

Bottom line is (pun intended), after months, the fishing site should rank better than 1000 for the biz name, even without the links. Y and MSN are okay with it. I see this as additional evidence something Gooooofy is going on.

Scarecrow

10:11 pm on Jan 30, 2005 (gmt 0)

10+ Year Member



There's an -asdf x 13 comparison tool at s-c-r-o-o-g-l-e dot org (the sun don't shine here on that name).

Powdork

10:44 pm on Jan 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Took you long enough.;)

Powdork

10:52 pm on Jan 30, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That search shows only the sites that are filtered rather than the intermingled results you get when adding -asdf*13 on Google, correct?

brixton

11:47 pm on Jan 30, 2005 (gmt 0)



A 5-month-old site is in the top 20 of 1,000,000 results,

brixton

11:48 pm on Jan 30, 2005 (gmt 0)



(continued from previous message) It's only a one-word keyword, though, and not a money KW.

steveb

1:54 am on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Aside from the sandboxed sites, the big majority of the ones that appear are of miserably godawful quality.

Also, it definitely isn't working right in terms of the ranking of where they would be (the black number). Doing the 13x searches on Google, across all datacenters, reveals completely different results.

Scarecrow

2:23 am on Jan 31, 2005 (gmt 0)

10+ Year Member



The big difference between this situation and November 2003 is that if you search for terms that are noncommercial, such as terms having to do with cancer research, you will discover that a lot of edu, org, and gov sites are getting hit. That's a serious criticism of what Google is doing, in my opinion.

Powdork

8:11 am on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Also, it definitely isn't working right in terms of the ranking of where they would be (the black number). Doing the 13x searches on Google, all datacenters, reveals completely different results.

I may be mistaken, but I believe that's because it's only showing the sites that have been removed, dropped, held down by the man, etc. The 13*asdf search shows the SERPs as if there were no filter.
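Mechanically, a comparison tool like the one described needs nothing more than a diff of the two result lists. A hypothetical sketch (the function and list names are illustrative):

```python
# Given the normal SERP and the "-asdf x 13" SERP for the same phrase,
# report URLs that appear only in the exclusion version -- i.e. candidates
# for having been filtered -- along with their would-be rank.
def filtered_results(normal_serp, exclusion_serp):
    seen = set(normal_serp)
    return [(rank, url)
            for rank, url in enumerate(exclusion_serp, start=1)
            if url not in seen]

# Toy example:
normal = ["old-site.com", "big-site.com"]
unfiltered = ["old-site.com", "new-site.com", "big-site.com"]
print(filtered_results(normal, unfiltered))  # [(2, 'new-site.com')]
```

This matches Powdork's reading: the tool shows only the filtered sites, with the rank they would hold if no filter applied.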

Imaster

8:35 am on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Very useful tool; it clearly shows the number of good sites being excluded by Google!

BeeDeeDubbleU

9:34 am on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A very Tantalising Tool! If this is correct (can anyone prove it?), one of my sites has the potential to be top for several key phrases.

If only ...

zirak

11:32 am on Jan 31, 2005 (gmt 0)



@Randle,

Your post was interesting and stimulating (sorry I am VERY late with my reply).

Quoting:
-----------------------
Well, I won’t disagree with what you’re suggesting, i.e. the Sandbox is just a result of the “Hill Top” effect, because I don’t know what causes the sandbox. However, if that’s true then you have two radically different sets of rules going on at the same time.
-----------------------

That's probably my point.

Quoting:
-----------------------
I have sites stuck in the sand box, and I have sites not in the sand box. All that are stuck were launched after March 2004.
-----------------------

I've had exactly the same experience. Maybe, as somebody suggested before, it's better to talk in terms of "link age" rather than "site age".

Quoting:
-----------------------
I can safely say some of the sites not in the sand box would not qualify, per the paper you pointed us to, as “authoritative” sites, and they rank extremely well. Don’t get me wrong, they're nice sites, with good content, but I wouldn’t define them as authoritative.
-----------------------

Are you referring to recently created sites? If so, take into account that some key out-of-our-control factors, such as the number and quality of "experts" and "targets", can play a big role. Their availability or absence (no Hilltop triggering?) can lead to totally different results for any query.

Quoting:
-----------------------
So, if Hilltop is the answer, then right now if you search a key word the results you see are based upon an algorithm that combines;

All sites created prior to March 2004 without a “hilltop effect”

All sites created after March 2004 with a “hilltop effect”
-----------------------

I still have no clue how data can be handled depending on its "age", but the evidence seems to suggest this is done in some way.

Quoting:
-----------------------
How do you mix, or complete the algorithmic process, when everyone is ranked on different rules?
(please; no “there’s two indexes”, the 1,000 results displayed is what you get)
-----------------------

No mix is needed.
In my opinion we can think of 2 different algorithms:

- the original well-known algo working as usual and retrieving the "raw" results
- a second, separate algorithm possibly kicking in (and post-processing the original results) depending on MANY factors, such as:

- the number of available results - whether the query is "broad" or not ;)
- the number and quality of expert pages
- the number and quality of "targets" qualifying as "top" results, pulling down other results

Obviously these are only my speculations; I have no clue how things really work @google.
Thanks for your time, regards.

BeeDeeDubbleU

12:18 pm on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A site I launched last year on the 9th of February has been very successful for the client. I believe that this was the last of my sites to miss the sandbox. This would place the anniversary of the introduction of the sandbox probably within the next couple of weeks.

There has been some discussion about whether or not this is newsworthy and why the media seem to have avoided it like the plague. Now, with the anniversary imminent and no sign of any change to the situation perhaps we should not be using the term sandbox, which suggests that there is an escape. All the evidence so far tells us that (apart from a few who got lucky) there is NO escape. Doesn't that make it newsworthy?

NO MORE NEW SITES TO BE FEATURED IN GOOGLE RESULTS!

Since we don't know the exact date it was introduced, could we just assume it was February 14th and call it the St Valentine's Day Massacre II?

RoySpencer

12:31 pm on Jan 31, 2005 (gmt 0)

10+ Year Member



I've been thinking about the "sandbox" in terms of a filter, which might still explain what we are experiencing. What if Google, say back in March 2004, started penalizing sites that got too many inbound links too rapidly? This might occur with new, bigger sites, that have aggressive (but not necessarily black-hat) link building efforts.

Could this explain it?

BeeDeeDubbleU

12:58 pm on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



My own experience is that the SVDMII (St Valentine's Day Massacre II :o) does not have much to do with the rate of link acquisition. Several sites I created for small companies that have only a few naturally acquired links have also been penalised.

siteseo

6:04 pm on Jan 31, 2005 (gmt 0)

10+ Year Member



Addressing the question of whether or not G has two indexes...
This is *SOMEWHAT* related to topic-at-hand. I made an inquiry to our AdWords rep who passed my question on to User Support. My question was related to whether or not dupe content will make your site a "Supplemental Result." The response I received confirms that there are, in fact, two indexes:
"...supplemental sites are part of Google's auxiliary index. We're able to place fewer restraints on sites that we crawl for this auxiliary or supplemental index than sites that are crawled for our main index. For example, the number of parameters in a URL might exclude a site from being crawled for inclusion in our main index; however, it could still be crawled and added to our supplemental index.

The index in which a site is included is completely automated; there's no way you can select or change the index in which your site appears. Please be assured that the index in which a site is included does not affect its PageRank."

Interesting. Not very helpful, but interesting nonetheless.

2by4

8:51 pm on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Siteseo, thanks. Now maybe we don't have to listen to posters saying that self-evident facts are 'conspiracies' any more. Hopefully Google admitting this will be considered adequate authority for the posters who simply will not admit or look at what has become empirically obvious? Call me an optimist, I know.

So now you know. This was obvious last year, it was obvious this year.

But still it's nice to hear it from the source. And you can go beyond this: there are two indexes; a sandboxed site is probably in the secondary index; there's nothing you can do once it is sandboxed. Google is still running on the old 2^32 index scheme, which has not been updated and can't handle more: its main index is full, and new sites can only enter it when old sites or pages leave and make room. So there's not really any more point in talking about this problem; it's now up to Google to fix it, which they will do when properly motivated.

Obviously Google is still drawing its primary results from the primary index, ergo the sandbox, and the results listed up to a thousand are not in fact all drawn from the same index, although it looks like it if you don't apply any thought at all, since you keep clicking 'next page' and each page looks just the same... my logic teacher was right.

After this, it's just a matter of going back through these Google forums, looking at which posters consistently refused to see or admit any flaw or problem (especially the ones who like to call such observations 'conspiracy theories' or 'wild speculation') and, in the future, taking anything they say with large grains of salt.

BeeDeeDubbleU

9:20 pm on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



This supplemental index is not news and it is no secret. They actually refer to it in their Adsense troubleshooting pages.

"Your pages may be displayed on the Google search results through our supplemental index. Google currently does not syndicate certain beta features, including the supplemental results index. Supplemental results are triggered on a relatively small number of queries for which Google's main index does not provide many results. Because this index is still in testing, we do not feel that it is ready to be offered to our Adsense for Search partners. "

... its main index is full, new sites can only enter it when old sites or pages leave and make room.

So what about the sites that are getting through?

Scarecrow

9:28 pm on Jan 31, 2005 (gmt 0)

10+ Year Member



It has been very clear, from GoogleGuy and other sources at Google, ever since August 2003, that the Supplemental Index is a separate index. However, the spin from the 'Plex has steadfastly danced around every other question regarding the index. Their vague explanation for why it exists doesn't even begin to make sense. This in itself is evidence of a capacity problem.

What I don't understand is why everything we've learned is forgotten on this forum within a few months, and has to be argued all over again. Maybe it's the churn in membership. Maybe folks don't do enough research. Maybe old wise men get tired of newbies and wander off to greener pastures.

Try a search such as site:www.webmasterworld.com supplemental separate

Or get GoogleGuy's spin: site:www.webmasterworld.com supplemental googleguy

siteseo

9:31 pm on Jan 31, 2005 (gmt 0)

10+ Year Member



Yeah, I've pointed that out before too.

It's not only interesting to see the one thing G said might move you into the supp index (too many variables in URL), but also to note the lack of OTHER factors mentioned, especially in the context of my inquiry, which was related to duplicate content.

TaylorAtCTS

9:54 pm on Jan 31, 2005 (gmt 0)

10+ Year Member



My site has been up for almost 11 months, but in December '04 I redid everything. I think I might be stuck in the sandbox, and it hurts real bad. I get a lot of business from SEs and now I'm having to pay per click. Now I'm

1) losing money on PPC

2) losing to my competitors simply because their site is older... my competitor's website is god-awful but he's #1 on Google.

This hurts real bad. I'm hoping with the next SERP and backlink update things will change.

2by4

10:03 pm on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"What I don't understand is why everything we've learned is forgotten on this forum within a few months, and has to be argued all over again. Maybe it's the churn in membership. Maybe folks don't do enough research. Maybe old wise men get tired of newbies and wander off to greener pastures."

Amen, brother, this tendency has gotten really frustrating. Especially when the people who don't follow this, and steadfastly refuse to allow any type of reasoning to occur inside their brains or to follow any external sources of reasoning, insist on putting out streams of babble whose main source seems to be an unwavering faith that Google cannot by definition do wrong, has no technical issues (again, because they can do no wrong), or, heaven help us, actually follows sound business practices.

But then just once in a while somebody like siteseo comes up with a real gem [oh, the person who told you that is probably being tracked down as we speak for a severe Google tongue-lashing for actually telling a client the truth for once] and makes reading these forums worthwhile. But why on earth does there have to be so much dross in these threads? This issue was clearly explained two Novembers ago. The one and only counter-explanation offered was that Google engineers laughed out loud at the idea that the Google index was full. If you didn't know what spin means, now you do.

beedee: why do some sites get in? Because a few people have figured out how to bypass the initial trigger that sends your site in.

I'd also add: search terms must be separated when they enter the Google system. The obscure ones, the ones that are not sandboxed that is, apply to the full index set, and as some have speculated, the Hilltop algo could very well be a component of what assigns a search term to the full index set rather than the restricted index set. Which creates the illusion that Hilltop = sandbox. But again, any explanation that tries to bypass the dual-index and capacity issue is not going to be right, which is why it's imperative to look at the full problem set, not just part of it.

Now can we consider this issue adequately settled, finally, once and for all? And start looking at Google as a real thing, with real technical issues and real business requirements that have real effects in the world? I'm a dreamer, obviously.

siteseo, consider yourself lucky; they told you enough of the truth for you to share it with us, and it doesn't take a rocket scientist to figure out the rest: two indexes, one primary, one for results you won't get unless you dig into the SERPs, either by passing the cutoff number in the main results or by using supplemental results. All pages are indexed, but some are more indexed than others. Again, this is something we have all known for a year, but many have simply refused to admit what it actually means. Why? I have no idea. It's almost as bad as trying to have a meaningful conversation with somebody whose politics you don't agree with. But this isn't politics, it's just a stupid search engine.

[edited by: 2by4 at 10:14 pm (utc) on Jan. 31, 2005]

Powdork

10:11 pm on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here is what I found interesting.

part of Google's auxiliary index.
I take that to mean that "supplemental results" are part of a larger auxiliary index, which also contains results not marked as supplemental.

index in which a site is included
Sites are, or can be, included, rather than pages.

does not affect its PageRank.
Being very careful not to say "does not affect its rankings."

2by4

10:29 pm on Jan 31, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



And then we can move on to the really interesting question: when will Google move to a 2^40 or 2^48 index?

The sandbox will not end until the day they do, and that's why it's not going away: there are only 4.2 billion slots available for all non-trivial search terms to grab. A site will not get in unless a site leaves. Taken to the extreme, that would mean that if no website in the world ever shut down, no new sites would ever enter the primary index.

This means the sandbox ends when Google upgrades their systems completely, and not until then. This is why the period the sandbox seems to last keeps getting longer and longer.
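The capacity arithmetic behind this speculation is easy to check (whether Google actually used 32-bit document IDs is, of course, pure conjecture in this thread):

```python
# ID-space sizes for 32-, 40-, and 48-bit document identifiers.
for bits in (32, 40, 48):
    print("2^%d = %s document IDs" % (bits, format(2 ** bits, ",")))
# 2^32 = 4,294,967,296 document IDs  (the ~4.2 billion slots cited above)
# 2^40 = 1,099,511,627,776 document IDs
# 2^48 = 281,474,976,710,656 document IDs
```

So each extra 8 bits multiplies the address space by 256, which is why the jump from 32 to 40 or 48 bits would be decisive if the capacity theory were true.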

One other thing, regarding the whole 13 -skff thing: I'm going to assume that the person who identified this as a Hilltop filter remover is right, and that it is in fact Hilltop that determines where your query goes, the main index or the full index set. Thus, bypassing the Hilltop filter can show you your real non-sandboxed position, more or less, especially I think for less competitive sandboxed terms.

This 367-message thread spans 13 pages.