The press needs to get a whiff of the sandbox (and it stinks) before Google will do anything about it. Someone with press contacts needs to come up with a few good examples of searches for new company names where Google comes back with no results while Yahoo/MSN come up with the official site. The press would have a field day with that.
Nice post neuron. We seem to be in the same camp regarding the sandbox being a capacity issue but I want to play devil's advocate to that point of view. You seem to have a site in the sandbox since February. I put out a site in February and it came out of the sandbox in June. Why did some sites come out of the sandbox (all at the same time) but other sites have been sandboxed since February? If it were strictly a capacity issue, why would some sites be treated differently? As I said previously, I did nothing special to that site, I just waited and it eventually came out of the sandbox.
I think there may be several things going on here at once and that is why things are particularly difficult to interpret.
I agree with Scarecrow's statement
|One is a capacity problem, and the other is their continuing efforts to fight spam. |
My present problems are consistent with most of the posts here. We changed our domain name (yeah, I know, dumb idea), and the pages at the new site have tanked in the SERPs, despite getting good PR transferred through 301 redirects. As a test, I moved one of the pages to a subdomain (same IP address), and it shot up to #2, where it remains (and where it used to be, before we changed domain names).
|Any legit site, (like my main site), can add new pages and get them ranked within a few days. You can see that from the posts above and I am working on another new page right now. |
This is how the sb works:
New sites do not rank well because neither their PR nor their anchor text takes effect immediately. You can still build a link factory but it will take ages until it starts showing in the SERPs. By that time chances are that you, the spammer (just kidding), have A) given up or B) been caught otherwise.
The drawback is that legit new sites are treated uniformly as potential spammers. They will have to earn their ranking by being patient. G's bet is that a legit site with good and unique content will have the patience to wait (and wade) through the sb.
I don't know if you remember the -aasdasd thingy. For me that made it perfectly clear that there was something fishy going on.
New pages are a completely different story. New pages on non-sb'ed sites rank well. This is because the sb is not an attribute of a page, it's associated with a site and it applies to all pages of a sb'ed site.
As a sidenote: I doubt that buying existing domains is a way of getting around the sb. I myself built a new site on an existing domain a couple of months ago. I removed every old page and put in a completely new set of pages. No old URL remained valid. And guess what? I went straight into the sb. Now my toolbar PR is 5 and I have a nice set of backlinks but I'm out-ranked by crappy, stone-aged sites with virtually no applicable content and amateurish SEO. I'm even outranked by a freakin' PR0 'Links' page that links to me. The only occurrence of the keywords on that page is in the anchor text of that particular link.
see message 31
see message 55
although that doesn't really count as having press contacts. If you want examples of sandboxed sites, they're not hard to find. If you are a member of, say, the chamber of commerce, they will have a list of new businesses (or at least ones that joined). They usually give a web address. Check whois for the domain's registration date. Then perhaps check the Wayback Machine. Web design companies may have lists of recent work. There are probably some folks around here that have lists of several or more. Perhaps the fine folks at a site like google watch or #*$! could put up a form for us to enter various information about a domain in order to aggregate some statistics. The actual domain could be optional.
Hanu -- I think there is a new filter or screening where large changes to old stable sites that are focused on increasing keyword density and SEO are penalized. I had the same thing happen recently and it's the only thing I can think of.
Several other things are going on right now IMHO -- The rules for cross-linking have definitely tightened up, link text is much MORE important, and links from similar-themed sites are MORE important. I could be wrong and I know others disagree.
My personal theory is that Google wants everything to be natural (based on comments by Matt Cutts) -- a large number of new links that are all the same suddenly appearing, or a large number of links pointing to a site that just happens to be almost exactly the same as those pointing to another site on the same C-Block -- these are clearly spammers OR attempts to game the system and get penalized.
Just some more food for thought. . . (eZeB, I just read your post after I initially wrote mine - I agree with you.)
I don't think that it's "new domains" that are getting sandboxed. I am leaning toward the theories on sandboxing of new links (and maybe new optimization efforts).
I have a 3-year-old site that never ranked well in Google for any terms (and it didn't deserve to - very few links and less than 10 pages). I basically didn't update the site for a couple of years. Then around 3 months ago I started adding pages (the site theme stayed the same) and gradually got around 70 PR4+ on-topic links (half reciprocals).
Now it's a 60-page, textbook-optimized (Brett's guide) site with decent backlinks (35% varying anchor text, 20% deeplinks, only 3 from other sites of mine with no crosslinking). The industry is competitive, but no one is competing for my main 2-KW phrase. It ranks #2 in Yahoo and top 10 in the new MSN. However, in Google it is affected by all the symptoms described in this thread. It does bounce around wildly in Google on a daily basis for the 2-KW phrase, ranging from ~#60 to ~#150, but doesn't show up for any other 3 words in the title (or even the full title). I have links from around 25 of the sites ahead of mine for this same 2-KW phrase. None of the other sites ahead of mine have even a single inbound link with this 2-KW phrase, so by normal (albeit old) reasoning, my site is definitely being held back by some kind of dark force ;o)
So. . . the age of the site doesn't appear to be a factor in itself, but rather new links and/or optimization techniques.
|The drawback is that legit new sites are treated uniformly as potential spammers. They will have to earn their ranking by being patient. G's bet is that a legit site with good and unique content will have the patience to wait (and wade) through the sb. |
With all due respect this just does not compute. I am getting hoarse repeating that there is absolutely no logic in keeping ALL new sites out of the SERPs. If this were so then there would have been a release date of say six months. Thousands, possibly millions, of people have had sites sandboxed for up to NINE months!
Do you honestly think that Google or anyone outside the cuckoo's nest would do this deliberately? What are the plans for release? One year? Two years? I don't think so.
Remember that if we consider that the Internet is less than ten years old and growing exponentially then we are talking about perhaps as much as 15% or more of the Internet being hidden by Google. Good corporate policy? I don't think so.
Let's just hope that the media gets involved. As I said earlier this is the only chance we have of learning anything about this. Whenever any theory as controversial as this has happened in the past GoogleGuy has come along and debunked it but not this time. Doesn't that tell you something?
|So. . . the age of the site doesn't appear to be a factor in itself, but rather new links and/or optimization techniques. |
I am willing to accept that as it sounds plausible and would largely cause the same symptoms. In fact, methinks that in a delirium I posted in another thread about how great an idea it would be to crank up the weight of anchor text and at the same time delay the effect of links.
Some people here react allergically to the term penalty. But if by penalty you mean that certain over-optimizations are ignored by the ranking algo, I am with you.
I keep asking this question when the term gets thrown around but, what is "overoptimization" and what is "spam" in this context? I am not being coy here, but these things really seem to be in the eye of the beholder. I think some of the techniques that people would generally agree are "spam" are highly effective with G in very competitive areas. We see it all the time with competition for terms with 2-8MM sites returned for the term.
So, if the so-called sandbox is an anti-spam filter, it is, um, not effective. Now, all that said, I can also think of a number of sites using what are called "spam" techniques that are actually good, useful sites and legit businesses. I would guess their owners don't care about the arcane world of the G algo - they just think it is *gasp* advertising. And don't tell me that G is about to catch up with them. Some of the more prominent examples have been in their place for 2-3 years or more.
|Remember that if we consider that the Internet is less than ten years old and growing exponentially then we are talking about perhaps as much as 15% or more of the Internet being hidden by Google. |
Making guesses like that can be dangerous. For one, the growth of the web is not from new sites only. It is from new pages on old sites, too. Secondly, the sb effect is only apparent in competitive areas. The sb is by no means hiding content. In my spare time, I run a purely hobbyist site with some obscure but useful articles. I receive quite nice traffic from Google although I have been changing domain names and URL schemes like there was no tomorrow.
|Whenever any theory as controversial as this has happened in the past GoogleGuy has come along and debunked it but not this time. Doesn't that tell you something? |
Uh, wait a minute! You've managed to maneuver yourself into a corner there. Yes, GG has debunked wrong theories in the past. The fact that he hasn't yet debunked the sb tells me that ... bingo!
this Girl loves attention and rumours
wait until bbc and cnn
Here is another question:
If one has a new page on an old site and promotes it heavily with new links what do people think would be the effect? Would it move up quickly or not at all? Or something else?
So I assume the ONLY solution is to just go out and buy 1 or 2 year old domains with pagerank?
I have a personal site registered in Dec. 2003, which probably got some pagerank by January/February 2004, and I just started actually doing any optimization for it about 4-5 weeks ago. Some pages are in the top 10 for allinanchor for competitive keywords, but are nowhere to be found for their searches. I'm willing to wait exactly one more PR update, before giving up and buying an older domain to transfer to.
And here comes the irony: I added 100 new internal pages to a company site about 2 weeks ago, for a domain that was registered in April 2004. This domain had not been showing any PR anywhere for the last 3 months because it was a dynamic site, until I recently changed the homepage to a static page. After that I added the 100 new pages and within DAYS the vast majority of the pages are top 10 for their phrases while being PR0.
As they say: "Don't quit your day job"!
domain that was registered in April 2004
should state: "domain that was registered in April 2003"
|If one has a new page on an old site and promotes it heavily with new links what do people think would be the effect |
In all likelihood it will exhibit similar behaviour to a new site, minus a month or two. But remember, if that new page happens to be a page on a CNN or Stanford kind of site, it might rank within a couple of weeks, for the old links are doing a BIG favour.
New pages on mature domains can rank well within hours. It doesn't have to be CNN or even remotely close.
|In all likelihood it will exhibit similar behaviour to a new site, minus a month or two. |
|New pages on mature domains can rank well within hours. It doesn't have to be CNN or even remotely close |
Yes. But if the term happens to be a competitive one, even a mature domain will take time.
|Making guesses like that can be dangerous. For one, the growth of the web is not from new sites only. It is from new pages on old sites, too. |
I think claiming that it is "dangerous" is a bit strong :) This is just a speculative figure but remember that I used the term "as much as 15%". So I will stand by this.
|Uh, wait a minute! You've managed to maneuver yourself into a corner there. Yes, GG has debunked wrong theories in the past. The fact that he hasn't yet debunked the sb tells me that ... bingo! |
"Manoeuvred myself into a corner"? I am afraid that I don't get what you mean here. This problem just happens to be one of the most significant things to happen with Google since day 1 and they haven't been able to comment. GoogleGuy has obviously been silenced on this subject. If it were just the effects of an attempted clean up he would be all over this forum like a rash, "There is no sandbox", "Have a look at our guidelines.", etc, etc.
Bring on the media ;)
I just think google is building a bigger better database that can hold more than 4 billion pages. But for now, to compete with msn they have built a supplemental results index with 4 billion pages so they can say they have 8 billion pages now.
Then they will go with that until they finish their new bigger database system with the new faster mozilla 5.0 crawler etc...
I think it's all in the works right now. Just about 3-4 weeks ago google spidered all my sites pretty deep. Even my "banned/blocked/pr0'ed/hijacked etc" websites got spidered just like they used to back in the good ol' days before all this sandbox stuff happened.
But I didn't see any of those new pages in this new 8 billion page index. As a matter of fact, nothing changed for my sites rankings or traffic from google.
So I expect to see a new update soon when they switch over to the new improved system.
What do you all think?
This has been going on since February. It doesn't take more than nine months to build a new index. Does it?
|I think claiming that it is "dangerous" is a bit strong |
Sorry for having used the word 'dangerous'. But let's not get distracted by rhetoric.
|This is just a speculative figure but remember that I used the term "as much as 15%". So I will stand by this. |
What do you say about my other points? You stand by your vague guess but you don't respond to my rational claims against it. You said the G sandbox hides "as much as 15%" of the web. I responded that your figure can't even be approximately right because A) the web's growth can't only be attributed to the addition of new sites and B) the sb only affects competitive areas. What's your opinion on that? And what's your opinion on the -sdasadad effect?
|GoogleGuy has obviously been silenced on this subject. |
Has he told you so? I'd stick to the facts here. And the facts are that he (or she) has been silent. Why the silence? We don't know. But I admit that the silence can be interpreted my way (the sb exists) or your way (the sb is a technical problem). We maneuvered you out of that corner. ;)
Anyway, I wish GG would say something ...
|What do you say about my other points? You stand by your vague guess but you don't respond to my rational claims against it. |
As you say, we should not get distracted by rhetoric but this was my response ;)
|Has he told you so? I'd stick to the facts here. And the facts are that he (or she) has been |
Well, amen to that :)
I didn't read the whole thread, but let somebody do a little test. Open, let's say, www.google.fi or www.google.bg, or Google in any language which is not so common, and search for your position on your main keywords for sites which you think are in the Sandbox. Where do you rank? My site is experiencing the following thing. On www.google.fi, for example (I also checked .es, .bg, etc.), I rank #2-#4 for my main keywords, but on www.google.com, google.co.uk, google.jp... I always rank #715-#715. My site is around 3-4 months old and I guess I am in the Sandbox too.
Please do that test and post the results.
My observation is that it is not so much websites per se that are "sandboxed", but keywords for that site which are "sandboxed".
One of my sites is about Widgets. 6 months back we added a few new pages about Midgets :), and there is no news of them in the SERPS; yet the page is indexed, pr 7 and features under site:
However, pages added with widget keywords came into the SERPS within 2 weeks.
So, I tried an experiment. I added the word Widgets to the existing title of the page. Voilà, the page was soon ranking well not only for the keyword string with Widgets in it, but also for the keyword string with Midgets. Somehow adding Widgets, under which the site was previously classified, got the page around the "sandbox".
This shows that Google has developed a machine-made directory of keywords and all sites are in one or more keyword categories.
It then follows, that Google is taking time to add a site to a keyword category. Once added, all the pages of the site will come under the primary results of the keyword and ranking is done using traditional factors. Pages on the site lacking the keyword category will not feature till the site is listed for the additional keyword category.
|This has been going on since February. It doesn't take more than nine months to build a new index. Does it? |
Google's current main index (and any others they may have public as supplements) is 32-bit, both in OS and hardware. If they switched to a 40-bit (5-byte) index (which would have a capacity of 256 times the current 4.2B size), then they would be best off running it on 64-bit hardware and software. This would be especially true if they intend to continue adding semantic indexing and ranking features, since the size of the index causes exponentially greater calculation cycles. Moving to 64-bit would greatly reduce that processing time.
I saw an estimate somewhere about what it would cost them to go fully to 64-bit and it came to about $10M USD, including the proprietary rewrite of the software they run, but I doubt the purchase would have happened prior to the IPO (at least not for the hardware). Also, if I was working on such a thing, I think I would put my efforts into the new 64-bit index and not put any more time than necessary maintaining an old one that was just going to get thrown away as soon as the new one went online.
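For anyone who wants to check the 256x figure above, here is a quick back-of-the-envelope sketch. It assumes every indexed page gets one fixed-width integer ID, which is only this thread's guess about how the index works and not anything Google has confirmed. The function name `id_capacity` is made up for illustration.

```python
# Rough check of the 32-bit vs 40-bit capacity claim.
# Assumption (not confirmed by Google): one fixed-width integer ID per page.

def id_capacity(bits: int) -> int:
    """Number of distinct IDs an unsigned integer of the given width can hold."""
    return 2 ** bits

cap_32 = id_capacity(32)   # 4-byte IDs
cap_40 = id_capacity(40)   # 5-byte IDs
ratio = cap_40 // cap_32   # how many times larger the 40-bit space is

print(f"32-bit IDs: {cap_32:,}")
print(f"40-bit IDs: {cap_40:,}")
print(f"ratio: {ratio}x")
```

The 32-bit ceiling works out to roughly 4.3 billion, which is close to the index size quoted in this thread, and the 40-bit space is 2^8 = 256 times larger. That only shows the arithmetic holds; it says nothing about whether a capacity limit is actually behind the sandbox.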
Did you say $10M USD? Google probably have more than that in the coffee fund :)
I have not read the whole thread here, so forgive me if I am repeating something already mentioned, but here is our take on this sandbox issue:
THERE IS NO SANDBOX!
Google introduced new algorithms in February and these algorithms are tough, tough anti-spam algorithms. They are based on lots of factors like:
How quickly the links were amassed
Quality of links
How quickly pages were increased
The bottom line is that Google has dictated that all sites in the future will not rank well unless they behave like a normal site would behave and unless they are well considered by the Internet population.
But here's the rub: Google does not have an archived history of the building up of links and pages for sites that were already in its index, so it has to start afresh just with new sites with this algorithm and give already established sites the score that they previously had (apart from of course the anti-spam algorithms it applied in February on existing sites - interlinking etc).
The result: existing sites carry on being rated well and new sites have a mountain to climb to rate well. They are not sandboxed, they are just having to beat google's algo from base zero, whereas existing sites are beating it from base 5, or 6 or whatever.
If this is the way they are going to work and make it impossible for new sites to rank, M$ will not have a problem becoming dominant. They better rethink. A lot of what is good or cool comes first from webmasters; if they start telling everyone M$ is the stuff, it will only be a matter of time.
Who said they were going to make it impossible? Just very difficult and seriously anti-spam. I think there is a lot of nonsense written about google being broken and people are going to leave in their droves etc etc. Every time there is a change in google's algorithms half the people (the ones who lost out) think google is on the verge of bankruptcy and the other half (the winners) think it is worth a trillion dollars. Get real guys and lose that bias. It is a very very good search engine. Criticise it for specific problems with its results, not just because you are no longer there.
All IMHO of course! :)
|The bottom line is that Google has dictated that all sites in the future will not rank well unless they behave like a normal site would behave and unless they are well considered by the Internet population. |
... wait a minute!
How come "normal" sites that have been introduced since February are also missing?