|Matt Cutts on the Google Sandbox|
Secrets of the sandbox revealed at Pubcon?
The existence of a new-site "sandbox" (which delays the site being ranked well for months) has been a topic of debate among SEOs.
In reply to a question from Brett Tabke, Matt said that there wasn't a sandbox, but the algorithm might affect some sites, under some circumstances, in a way that a webmaster would perceive as being sandboxed.
So, for some sites, in effect there IS a sandbox.
|Well, over to incrediBill to tell us how he launches sites... |
New web sites are like crops in a garden, you plant them, fertilize the heck out of them with keyword rich content, add water (or beer) and come back in a couple of months and see how your crops are doing.
Not sure I have any special magic but I just don't go for overkill in the early stages.
I think the old adage "everything in moderation" comes into play when sidestepping Google's Spambox. Try to keep it looking reasonable, something that a first-time visitor (or Google employee) wouldn't land on and go "HOLY CRAP!" and close the window. Not too many pages at first, just a few inbound and outbound links, just a few ads, no keyword stuffing. I just try to make it look like 'Joe Blow's average web site' and submit it.
The only thing I think is important to keep an eye on after Google first indexes your site is what's in the snippets, as that gives you a real good clue what Google thinks your web site is about. So I search "allinurl: mydomain.com", look at what all my indexed pages have to say, and correct them if the snippets aren't what I expected.
Last site I launched a couple of months ago is a nice organic PR 5 already and doing well, no top 10 keywords yet but they're in the top 100 and moving up.
Once it moves up in the SERPs, then I start working on the keyword overkill and world SERP domination :)
After Jagger I had a site that had been completely non-ranked in Google for nearly a year. I nearly fainted when I saw it actually up there for the first time for ONE of my 2 key phrases, i.e.
"Nearly widget related, but there may be a few searches for this"
I'm No2 and 3 for this phrase.
However, for the "COMPLETELY widget related and this is what most searchers will be looking for, hi there!" phrase, I'm still nowhere. So I got ranked, but only 'partial' it seems (No1 for allinanchor etc. etc.). Is this the norm for most? Ranked for a phrase, or a word or two, but not for others?
I thought that when you were, just to coin a phrase, 'released', that was it. It doesn't seem so for my site. White hat, no outward links, a year-old forum with a 20-page, regularly updated site around it, 2500 members and nearly 50,000 posts. The forum name is 'the' key phrase I'd like to rank for.
I've searched but while I can find lots of posts with 'whey hey, I'm ranking finally' I haven't found any with 'whey hey I'm ranking... but not really for all the keywords I was aiming for'.
Is this the usual experience on being 'released' or 'ranking' (however you want to look at it), bearing in mind the site was unranked for nearly a year previously for ALL terms?
|Since everything I've put on the net had value perhaps that's why I've never had a site relegated to the black hole of despair everyone posts about, don't know, I'll just keep doing what I'm doing and hope to avoid this filter in the future as well. |
The three sites I have referred to in this thread all have value and original content written by myself. One of them actually offers a free service. What they all have in common is that they were heavily optimised using white hat techniques. The smart money seems to be on the sandbox being related to the age of inbound links as opposed to the age of the site, but the inbounds on these sites have been allowed to develop naturally. They are getting inbounds because, as I say, they do have value.
IncrediBILL, when I try the Allinurl trick I see something that is strange. The page titles are displayed properly but the snippets on all of the pages show only my "All rights reserved" and copyright notice. This is the last text to appear on each page and it's almost as though G is choosing to ignore the real page content.
Does anyone else with a sandboxed site see this? (This may be one way of knowing that your site is in the box.)
I have now tried a little experiment. I placed my keyphrase within this text. The phrase is such that it allows me to do this and still read OK. I am going to watch this over the next week or two to see what happens. I will report back.
(Sorry, I don't want to stir up THAT much controversy.)
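As an aside on that copyright-footer snippet problem: a crude way to see which text a simple parser encounters first in your HTML source is to extract the visible text in document order. This is only a rough local proxy, written with the Python standard library; Google's actual snippet generation is unknown and certainly more complex. If your "All rights reserved" line comes before the real content in the source (for example via CSS positioning), a naive extractor sees it first.

```python
# Rough local check: extract visible text in source order, stdlib only.
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collects text outside of script/style/head elements."""
    SKIP = {"script", "style", "head", "title"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip_depth = max(0, self._skip_depth - 1)

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def first_visible_text(html_source: str, n: int = 80) -> str:
    """Return the first n characters of visible text, in source order."""
    parser = VisibleText()
    parser.feed(html_source)
    return " ".join(parser.chunks)[:n]

sample = ("<html><head><title>t</title></head><body>"
          "<p>All rights reserved.</p>"
          "<p>Real widget content here.</p></body></html>")
print(first_visible_text(sample))  # All rights reserved. Real widget content here.
```

If the footer text leads the output here, it at least tells you what a source-order crawler sees first.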
If I were designing some sort of spamsite killer I might base it on the ratio (and total rate) of links from associated sites to the links from non-associated sites. (By "associated", I mean cross-linked, similar whois data, etc.) In other words, sites that are launched with loads of links that have probably been added by a single webmaster are likely to be identified as spam and be sandboxed.
Well, that's how I would do it and it's likely that it works very much like this. However, the problem with penalties is that they affect the innocent as well as the guilty (because algos are never perfect).
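As a toy sketch of that associated-links ratio, here is roughly what such a check could look like. Every name and threshold below is invented for illustration; nothing here reflects how Google actually works.

```python
# Hypothetical "associated links" heuristic, as described above.
# Thresholds (0.8, 20) are invented numbers, not anything Google has published.

def is_associated(link_host: str, site_group: set[str]) -> bool:
    """A link counts as 'associated' if its host is in the same
    cross-linked group (same owner, similar whois data, etc.)."""
    return link_host in site_group

def looks_like_spam_launch(inbound_hosts: list[str],
                           site_group: set[str],
                           max_ratio: float = 0.8,
                           min_links: int = 20) -> bool:
    """Flag a new site whose inbound links come overwhelmingly
    from its own associated network."""
    if len(inbound_hosts) < min_links:
        return False  # too few links to judge either way
    associated = sum(1 for host in inbound_hosts
                     if is_associated(host, site_group))
    return associated / len(inbound_hosts) > max_ratio

# Example: 25 inbound links, 24 of them from the webmaster's own network.
group = {"mysite1.com", "mysite2.com", "mysite3.com"}
links = ["mysite1.com"] * 12 + ["mysite2.com"] * 12 + ["unrelated.org"]
print(looks_like_spam_launch(links, group))  # True
```

A site launched with mostly independent inbounds would pass the same check, which matches the idea that naturally developed links avoid the trap.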
If Google's algos were adequate, penalties would not be necessary and those at Google who believe penalties are necessary are not capable of designing adequate algos. In other words, to improve the SERPS, Google needs to fire all the engineers that think penalties are a good idea.
"everything in moderation"
I also suspect the secret to The Sandbox lies therein, but it's not exactly a common trait in this sector ;-)
As part of the moderation thing, I'd probably also keep away from AS for the first while....
My 'sandbox' story. I had a site which I made as a subdirectory of a large site. It is an online book about a scientific discipline and was ranking quite well for most of the important terms I expected it to rank for. Around June 2004 I decided to move it to its own domain and SEO it as much as I could. So, using the book divisions as keywords (booklet title, chapter, section, sub-section etc.), I put the keywords in the URLs, the title, navigation breadcrumbs, next/previous navigation, table of contents etc.
The new site was indexed, but where I had been on page 3 for the name of the discipline I was now not in the top 1000, with a similar loss of position for all the other main terms. I did get traffic from Google for lots of obscure terms.
Now it could be that my SEO was just bad and I just had the normal placement I deserved but here is why I think it was a 'sandbox'.
On Feb 7 my traffic from Google quadrupled and I was back up in the top 30 for many, many relevant terms. Up to that day the traffic had been very stable, and after the 7th the traffic stayed high and showed the normal 3-5 organic growth I had seen before. The 'content' of the site didn't change at all; it is about a thousand pages written by an expert in the field, and no changes to it were made at all.
The strange thing is I am associated with several large sites, and three new sites in that 'group', started around the same time as my move, showed similar behaviour and had the same 300%-400% jump on the same day. None of them had the same keyword SEO as mine did; in fact, other than just good keyword titles, H1 and description tags, they don't have much SEO at all.
The only things we have in common are that we are all on the same IP and we are heavily interlinked. In fact my site had run-of-site links in the footer of each page of the other sites. The other sites were interlinked with each other, but only lightly.
With this latest update I have moved up about 3-7 positions for my main terms and am now on page 1 for most of the big ones.
Just checked my logs and my blog that I launched a week ago got 10 referrals from Google yesterday. It has zero IBLs. I didn't even know it had been indexed.
My money is on the slow growth aspect. Launch with a few pages, send in a link or two. Add a few pages add another link and so on.
Most people previously didn't want to do this because they wanted to rank as soon as possible. Now, for Google at least, you aren't going to rank for a while anyway if you push it, so there's no reason not to slow-grow.
|I registered a new site 6 days ago, put up half a page of content and put a link from a PR3 page. Today I can find my site in the SERPs of Google. |
My experience with this penalty whatever it is, is that it doesn't always strike right away. You can have days or even a couple of weeks of rankings before being chucked out of the pile.
I have launched two sites in the last 9 months and both were in the SERPs within a couple of weeks. The so-called sandbox has never touched me. How come?
I would agree with IncrediBILL on this one. Both sites were launched with no more than 10-15 pages and just left to "bed in". Came back to each after those 2 months and added another 20 pages. And so on.
Too many webmasters launch 1000-page sites, aggressively inter-link with existing (interlinked) sites and wonder why they disappear into the "sandbox".
I have an established site (6 years). Some pages I have added seem to stay in the wilderness for many months. That's why I was asking earlier if whatever this inadvertent "thing" is that Matt Cutts was talking about could apply to pages as well as sites as a whole.
Other than that, I have never had a new site go into any sort of quarantine - several this year.
BeeDeeDubbleU, a couple of months ago you and I had an argument about whether the sb was a feature or a bug, remember? Now, according to MC it is a feature of the algo, deliberately implemented. Yet you keep bragging about being Mr. Sandbox, having known it from the beginning and such. I wish you wouldn't have such a short memory ...
I find it funny that you have the nay-sayers loudly proclaiming that "there is no sandbox ... it's just a matter of too many competing for too few spots ... why don't you just get it?" while the yay-sayers just as loudly proclaim that "there is indeed a sandbox ... I know, 'cause I'm in it, and don't you dare question me you inexperienced fool you"
If you are in the sandbox -- enjoy your stay, and play while you can
If you're not -- try to get in it, because obviously "the others" have more fun
Regardless of whether you think the sandbox exists, whether that so-called sandbox is indeed a penalty applied by Google or just an effect of circumstance, or whether you think the sandbox is pure bunk and there is a reasonable explanation for the effects explained away as a "sandbox" -- live and learn. We know the effects can be avoided, or at least limited, so that should be your goal, not trying to find evidence of the sandbox's existence or lack thereof. There is no point in arguing whether there is an actual sandbox, or whether the ill effects that many call the sandbox simply result from something else. Instead you should focus on getting your site to rank in the shortest amount of time. If you have found yourself "sandboxed" you have obviously run into something you should not have done. Now try avoiding it the next time.
This whole focus on whether the sandbox exists or not is a waste of time. Discussing the effects without discussing what results in those effects is a waste of time. It's like talking about different results of a car accident instead of going out there and actively avoiding an accident. One way gets you somewhere, the other doesn't ... You pick.
And welcome to the sandbox! Bring your own toys!
"I was at the Q&A and listened to Matt's response. The part that I thought was interesting was that Matt said when they (Google) first started hearing about the "sandbox" as the term is used by webmasters they had to look at their algo to see what was causing it and then look at the sites it was affecting. Once they studied it, they decided they liked what it was doing." - Idaho Message 21
According to this statement the Sandbox effect is an unintended feature. I would conclude it is a bug.
It is outside Google's design specifications. Therefore, if you put up an "organic" website based on Google's website design guides (specifications), as people here have suggested, you should be fine.
I need to check if my server supports the "If-Modified-Since" HTTP header. It probably does, but I have not verified it yet.
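One quick way to verify If-Modified-Since support yourself is to fetch a page, then re-request it echoing back the server's own Last-Modified date; a conforming server answers 304 Not Modified. This is a sketch using only the Python standard library, and the URL in the usage comment is a placeholder.

```python
# Check whether a server honors If-Modified-Since (stdlib only).
import urllib.error
import urllib.request

def supports_if_modified_since(url: str) -> bool:
    """Return True if the server replies 304 Not Modified when we
    echo back its own Last-Modified date."""
    # First request: note the Last-Modified header, if any.
    with urllib.request.urlopen(url) as resp:
        last_modified = resp.headers.get("Last-Modified")
    if last_modified is None:
        return False  # server doesn't advertise a modification date at all
    # Second request: send the date back as If-Modified-Since.
    req = urllib.request.Request(
        url, headers={"If-Modified-Since": last_modified})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.getcode() == 304  # many servers just re-send 200 here
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for non-2xx statuses, including 304.
        return err.code == 304

# Usage (replace with your own URL):
# print(supports_if_modified_since("https://example.com/"))
```

Whether this matters for ranking is anyone's guess, but a 304 at least saves the crawler (and your server) bandwidth on unchanged pages.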
I use a content management system. Could be trouble there...
I know I am NOT W3C compliant because of the management system I use.
So I know that this bug could look at a combination or combinations of these things and spit out my sites or put them in the sandbox as we say here.
In my opinion, before we complain about the Sandbox effect we need to go back and satisfy all the requirements laid out in the Google Webmaster Guidelines.
Having said all that, there may be issues not addressed by the guidelines, but you have to start somewhere...
"I have launched two sites in last 9 months and both were in the SERPS within a couple of weeks. The so called sandbox has never touched me. How come?"
It's no coincidence that most people who post here that there is no sandbox also post stuff like this that shows they don't even know what the sandbox is.
Sites can get into the SERPs the very first day they are online, which says nothing at all about the sandbox. You might as well say there is no sandbox because you watch TV.
|It's no coincidence that most people who post here that there is no sandbox also post stuff like this that shows they don't even know what the sandbox is. |
The problem is that even the most seasoned of SEOs can't agree on what the sandbox is. The term is being used as synonymous with "penalty", and no one denies that various penalties exist. If it can't be defined then it can't really be proven or disproven to exist.
I've talked to the Google engineers at length about the sandbox, and everything I've heard has been consistent from day one. It is not 'a thing'; it is a combination of things, a combination of different algos. Describing these algos using such an undefined blanket term is pointless. That doesn't mean that I think it doesn't exist, but it is more complex than just saying a site is sandboxed or not.
"It is not 'a thing' it is a combination of things, a combination of different algo's."
LOL, of course "it" is not a "thing". Of course it is a "combination" of different "algos". That description basically fits any event that has ever been named in the google search systems. I can see where google is going with this method: nothing exists, everything is 'just a combination of algo components', that's how they will answer every meaningful question given to them. And they won't ever be lying.
Webmasters are the ones who noted this behavior and named it, what they noted was of course 'a combination of different algo functions', that's what we always note here, and give names to. Give yourselves more credit.
We're all composed of matter, and matter is energy, so nothing exists per se, it's all just various combinations of electrons and protons and various other little things. Next question please.
By the way, did you know that Windows 2000 will be a fully secure Windows? I read that repeatedly; the Microsoft guys said that all the time, and that's how I know it's true. I even had a friend who worked there, and he told me the same thing... come on, please.
Happy to see drdoc slipping in to wet his toes in seo land, have to start somewhere I guess.
I don't know what the sandbox is, other than hearing about a perceived (negative to some webmasters and positive to Google) symptom of something that Google have apparently admitted they are partly responsible for... hic
What I would like to know is whether this thing which Google are talking about (or at least Matt Cutts) affects whole sites only, or whether it has been thought by webmasters to apply to new pages on some established and otherwise well-performing sites as well. After all, if it is an algo glitch that was originally unintended, it could perhaps glitch pages too.
"Too many webmasters launch 1000-page sites, agressively inter-link with existing (interlinked) sites and wonder why they disappear into the "sandbox"."
-Google sandbox issues aside, what's wrong with launching a 1,000 page site? Sure if it's 1,000 pages of junk it shouldn't rank when it's launched or 6-8 months later either. But when did launching a large site become a bad thing? Oh that's right, as soon as Google's algo didn't like it.
-If the site is doing massive bad linking it also shouldn't rank when it launches OR 6-8 months later when it comes out of the sandbox so to speak.
That's the rub. Many of these sites are ranking just fine with no ill effects months later. If what they were doing was so bad, why would the "sandbox" effect stop?
If the method you use to build a site happens to bypass the effects of the sandbox that's great. But don't assume that anyone that has or is feeling the effects is a poor designer or cheating SEO. Or that your method is superior in any way other than happening to line up well with Google's algo.
|…affect some sites, under some circumstances… |
‘Some sites’ = larger sites which have quickly gathered inbound links, those that would have ranked well immediately in Google had there been no sandbox (as they do in MSN).
‘Some circumstances’ = competitive money industries & keywords.
The Google Sandbox is a feature of the algorithm which, in effect, increases the likelihood that new sites or businesses which have spent money or expertise on a quality site will also spend money on Google AdWords. They are initially starved of traffic despite the quality of the site. It is a clever and reasonable strategy to introduce new players/sites/businesses to Google advertising and get them hooked into its culture and ongoing dependency.
If you launch a small site without many links in an uncompetitive (non-money) industry or keywords, you will not be targeted for the Sandbox by the Google algorithm, because you more than likely don't have any money to spend on AdWords and won't be making money with your site to spend on AdWords. These are the sites launched by the "there is no sandbox" webmasters.
The Sandbox is a clever business strategy and algorithm automation and streamlining at its best.
The big question is: will a Sandboxed site come out quicker if its webmaster indulges Google with AdWords income, or if he doesn't touch it and acts poor...?
|Yes there is a sandbox and there has been one since spring 2004 but someone will come along here and refute this again. It always happens :) |
No there has never been a sandbox but someone will come along here and refute this again. It always happens :)
That's right boys and girls, let's all play nice and not launch any large profit-motivated sites because Mr. Google says that's rude. And as we all know, the self-proclaimed organizer of the world's information knows right from wrong better than we do.
For those that assume that the box is solely inhabited by scrapers and link cheats, I can tell you that I have never built a scraper site, never purchased a link, and start my sites with less than a dozen targeted links. They all end up in the box. And thankfully, they all come out--so far.
Google has some sort of an anti-sporadic growth filter in place. They have developed a model as to what the "natural" growth rate of a site should be and heaven help anyone who strays from it. That's why you don't have to do anything to get out of the box, just sit there and wait for the model and your site to match up.
This differentiates it from other penalties. Like the common cold, once you've got it there is no cure, all you can do is wait it out. One day you're grabbing your ankles and the next day you're the king of the world.
I still get a kick out of the fact that the mere mention of it puts marbles in Matt's mouth.
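That "natural growth model" idea could be caricatured like this. To be clear, this is a pure guess with invented numbers, not anything Google has confirmed: flag any site whose week-over-week page or link counts jump far beyond a modeled organic curve, and let the flag clear on its own once growth and model line up again.

```python
# Toy "anti-sporadic growth" filter, as speculated above.
# The doubling factor and grace threshold are invented for illustration.

def sporadic_growth_flagged(weekly_counts: list[int],
                            max_weekly_factor: float = 2.0) -> bool:
    """Return True if any week's count (pages, links, ...) more than
    doubles the previous week's, once past a small-site grace threshold."""
    for prev, curr in zip(weekly_counts, weekly_counts[1:]):
        if prev >= 10 and curr > prev * max_weekly_factor:
            return True
    return False

print(sporadic_growth_flagged([10, 14, 19, 26]))    # steady growth: False
print(sporadic_growth_flagged([10, 14, 300, 320]))  # sudden spike: True
```

Under a model like this, "just sit there and wait" really would be the cure: the flag depends only on the growth history, so it clears by itself as the curve flattens out.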
Disclaimer : only Google knows how it works.
We need to know the real reason for the sandbox in order to avoid it.
According to Idaho msg #31
|Matt said when they (Google) first started hearing about the "sandbox" as the term is used by webmasters they had to look at their algo to see what was causing it and then look at the sites it was affecting. |
Guess: the special, more complex algorithm to check sites that deviate from the average behavior was slow, and it caused the delay. It is still slow sometimes.
|Once they studied it, they decided they liked what it was doing. |
Guess: they decided in *some cases* to add the delay (sandbox) explicitly, as a way to check whether the site is as good as it appears or is just over-optimized. Then the history patent appeared.
According to rogerd (msg #1)
|Matt said that there wasn't a sandbox, but the algorithm might affect some sites, under some circumstances, in a way that a webmaster would perceive as being sandboxed. |
Guess: "some sites" and "some circumstances" means that the affected sites deviate from the average behavior of similar sites. In particular: they grow too fast, get links too fast, have too many targeted keywords, etc.
The sandbox exists because the algorithm that separates over-optimization from real value is complex and slow. However, in some cases the algorithm intentionally involves time (history) to estimate the real site value. This complex algorithm, or rather these algorithms, are applied only to sites that deviate from the average behavior.
Sandbox is not a bug. It is a feature.
OK ... let's get this sorted, because I for one am SICK of hearing people say there is NO sandbox. Let's say first that the term we've come to use isn't entirely accurate, but basically the result IS that a new site DOES get included in Google's index but does NOT get any worthwhile rank for even the less competitive terms. Now, if you can say YES to all the questions below, then you have avoided the sandbox.
1) NEW SITE WAS LAUNCHED
2) SITE WAS IN A COMMERCIAL AREA WHERE PLENTY OF MONEY CAN BE MADE, SECTORS SUCH AS PERSONAL OR COMMERCIAL FINANCE, CAR SALES, PRIVATE AND COMMERCIAL PROPERTY SALES ETC ETC...
3) SITE WAS RANKING HIGH ON GOOGLE FOR WORTHWHILE TERMS AND GENERATING LEADS & SALES VERY QUICKLY, PURELY FROM ORGANIC (FREE) RESULTS.
Please post here if you answered YES, YES and YES to these ... or you could send me your CV!
<<...and start my sites with less than a dozen targeted links>>
Where did those links come from? Recips? One ways? Sites you own?
|Where did those links come from? |
Yahoo Directory (I insist that my clients cough up the $300)
Targeted established industry sites that I contact for a straight link exchange or in some cases provide content for usually in the form of an article or white paper in return for a link.
My own personal site which includes a list of sites I have designed.
DMOZ (But this link is never in place for the launch) as they are slower than a fat lady in the ice cream aisle.
Nice post, Vadim.
Sparkys_Dad, once again you're making complete sense, are you sure you didn't just drift in here by accident, and are now lost and looking for your way out?
It doesn't seem to matter how often someone posts something like you did; it just won't register in some people's minds. Some people resist empirical evidence in favor of their favored, unfounded beliefs.
Every major sandbox thread has had a posting like you made, but I think some people just don't have very good memories, and with each new sandbox thread, somebody will inevitably post that your experiences don't exist because they've never experienced them personally. Gets kind of boring, but there it is.
"some sites and some circumstances means that the affected sites deviate from the average behavior of the similar sites."
Good points vadim, it's the 'average behavior' that is actually in question. The average for most webmasters is to get sandboxed. And as you can see from sparkys, it's not from a bombardment of backlinks, it's something else. Because it doesn't always happen, some people conclude that it never happens. Google knows it happens, they like it, and have kept it. Maybe they didn't do it on purpose, but they don't deny that it happens.
Following on from Webpixie's comments at msg 110.
The idea that a site is a possible candidate for sandboxing simply because a conscientious webmaster actually completed it prior to launch .... no-one in their right mind could possibly be using that as a spam signal.
Could they? If they are they should be ashamed of themselves.
Since when do you penalise the concept of doing a job well: taking pride in the completion and handover, consideration of the enjoyment/value for the viewer, etc.?
If true, this idea that you only launch a few pages to avoid the sandbox is a total nonsense, and IMO it just further confirms that the internet is increasingly becoming collateral cannon fodder in the war between Google and the spammers.
Books are not published with some chapters missing ("call back in 6 weeks for the missing bits")
New cars aren't delivered with 2 wheels missing ("what, you want all 4 wheels on delivery?")
So if a customer's expectation is that they will get the full product, why should a website be any different? IMO, if any sites should be kept from the viewing public, it's the incomplete ones, not the ones where webmasters/owners have put in the time and effort to deliver a finished product.
|The idea that a site is a possible candidate for sandboxing simply because a conscientious webmaster actually completed it prior to launch .... no-one in the their right mind could possibly be using that as a spam signal. |
Yesterday I submitted a new, personal and incomplete site to Google via their submission form and sitemap. It currently has only 14 of the planned 200+ pages (100% my own original content), 0(!) inbound links (although many are planned), no AdSense (although it is planned), and no affiliate links (although I will add them eventually). Presently it is something like a book that cuts off after chapter 1. My apologies to anyone that runs into it.
This is all to see if it is possible for me to avoid the dreaded but non-existent sandbox in a highly commercialized and very competitive niche.
2by4--thanks much. It's responses like yours that will keep me contributing as I had previously only lurked here.
Just thought I would add my 2p worth to this interesting debate.
Sandboxing exists. A definition is simply that it's a certain set of conditions that Google's algo doesn't like. They have admitted it; they just don't call it that.
Four years at number one on my chosen search terms, and I drove myself to tears after the Florida update (Nov 14th 2003? It sticks in my mind like someone stabbing me in the heart)... I had been Sandboxed.**
Three months later I gave up SEO. I had gone from number 1 on 5 search terms to nowhere to be seen.
To me it was TOO MUCH of a coincidence that Google was floating in the first quarter of 2004. I boycotted AdWords out of principle and looked elsewhere to generate sales. I suppose I had become bloated and lazy anyway with the growth of leads from Google. I still reckon to this day that Florida was nothing to do with quality of results and ALL to do with AdWords revenue. It's common sense for Google and we would all do the same; it was big bucks on the line and it would be crazy for them not to.
Anyway, mid summer 2004, still nowhere to be seen in Google, I caved in and signed up for AdWords.
Within weeks I was back, visible and floating around page 4 for all the keywords. This must say something about sandboxing.
Currently back in the top 3, and have been since Jan 2005. What's been the biggest change in that time? INBOUNDS.
I have never actively pursued inbounds and don't have many of what I would call genuine inbounds; at the time of Florida I would say I was lucky if I had 10.
Anyway, I am now showing 631, most of them spurious AdSense or AdWords ads on people's pages, and still only 10-15 genuine real website links.
My thoughts are that if you make Google some money they will reward you with better positions. Maybe never put you in the sandbox, maybe take you out.
The "quality" inbound argument also just doesn't wash with me; making them some money does.
**In other words, it's possible that the algo treats AdWords inbounds with slightly higher scoring.
The higher you score, of course, the less chance you have of being caught in the quicksand.
I would proffer that Google's algo has many sandboxes...
non-organic growth sandbox
not-giving-us-money sandbox, etc.
A picture tells a thousand words, and I like to think of Google's algos like a game of snakes and ladders, except we are all blindfolded and fumbling about trying to find the ladders and avoid the snakes. The snakes, by the way, live in sandpits, which are hard to get out of sometimes.
I am quite new to this site. Has anybody been compiling a list of the ladders, snakes and sandpits that work/don't work?
paul, some of your reasoning I have to agree with. Anyone who thinks that the boost in AdWords income pre-IPO was a pure coincidence is a naive little lamb as far as I'm concerned, but I do have a very good quality used bridge to offer for a very reasonable price; it's located close to downtown Manhattan, and I'll be accepting all serious offers over the next 3 weeks. For everyone else: of course this wasn't an accident. And of course GoogleGuy or Matt Cutts won't mention it; they have or had stock options that they could exercise very profitably after that IPO. But if people want to keep living in fantasy worlds, feel free. Basically you are suggesting that Google won't do what you would do, because they are saints.
However, despite this agreement, I've seen sites come out of the sandbox with no AdWords at all; it's just something that happens. BeeDee, if I recall, has also seen this, as have others. It's just like one day you aren't ranking except for allin searches, and the next day your traffic is up.
One possibility is simply Google tracking user behavior by click-throughs on links on google.com, and an AdWords link may count in that game. I hadn't thought of that, but there's no reason why not: the searcher clicks on your AdWords ad, stays on your site, and Google decides your site is legitimate and relevant after all, despite having no real backlinks.
Your issue is lack of quality backlinks, pure and simple. Maybe AdWords helped, maybe it didn't, but if you have no quality backlinks, Google doesn't consider your site to be quality. It's very simple.
Authority or hub status only comes from authority or hub links; TrustRank is the game. In competitive terms: get it and rank. Of course, now we'll have to sit back and watch all those guys whose sites don't have TrustRank tell us it doesn't exist because they don't have it....