Forum Moderators: open
That is to say, should one be checking to see if the sites are out of the sandbox regularly or only when they know there is a major Google update? :)
Thanks
Mc
I think there may be several things going on here at once, and that is why things are particularly difficult to interpret.
I agree with Scarecrow's statement
One is a capacity problem, and the other is their continuing efforts to fight spam.
Any legit site (like my main site) can add new pages and get them ranked within a few days. You can see that from the posts above, and I am working on another new page right now.
This is how the sb works:
New sites do not rank well because neither their PR nor their anchor text takes effect immediately. You can still build a link factory, but it will take ages until it starts showing in the SERPs. By that time, chances are that you, the spammer (just kidding), have A) given up or B) been caught some other way.
The drawback is that legit new sites are treated uniformly as potential spammers. They will have to earn their ranking by being patient. G's bet is that a legit site with good and unique content will have the patience to wait (and wade) through the sb.
I don't know if you remember the -aasdasd thingy. For me that made it perfectly clear that there was something fishy going on.
New pages are a completely different story. New pages on non-sb'ed sites rank well. This is because the sb is not an attribute of a page; it's associated with a site, and it applies to all pages of a sb'ed site.
As a sidenote: I doubt that buying existing domains is a way of getting around the sb. I myself built a new site on an existing domain a couple of months ago. I removed every old page and put in a completely new set of pages. No old URL remained valid. And guess what? I went straight into the sb. Now my toolbar PR is 5 and I have a nice set of backlinks, but I'm out-ranked by crappy, stone-aged sites with virtually no applicable content and amateurish SEO. I'm even outranked by a freakin' PR0 'Links' page that links to me. The only occurrences of the keywords on that page are in the anchor text of that particular link.
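Purely to make the theory above concrete, here is a minimal sketch (the probation length, damping factor, and function name are all my own invention, nothing Google has confirmed) of a site-level probation flag that suppresses link and anchor-text weight until a site has aged:

```python
# Illustrative model of the sb theory above; the probation window and
# damping factor are guesses, not known Google values.

PROBATION_DAYS = 270  # assumed ~9-month probation

def effective_link_weight(raw_weight: float, site_age_days: int) -> float:
    """Link/anchor-text weight the ranking algo would actually use."""
    if site_age_days < PROBATION_DAYS:
        return raw_weight * 0.05  # sandboxed: incoming links barely count
    return raw_weight             # matured: links count in full

# The flag hangs off the SITE, not the page: a new page on an old site
# inherits the site's age and ranks within days, while every page on a
# new (or fully replaced) site waits out the probation period.
print(effective_link_weight(10.0, site_age_days=30))   # 0.5  -> sandboxed
print(effective_link_weight(10.0, site_age_days=400))  # 10.0 -> out of the sb
```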
Several other things are going on right now IMHO -- the rules for cross-linking have definitely tightened up, link text is much MORE important, and links from similarly themed sites are MORE important. I could be wrong, and I know others disagree.
My personal theory is that Google wants everything to look natural (based on comments by Matt Cutts) -- a large number of identical new links suddenly appearing, or a large number of links pointing to a site that just happen to be almost exactly the same as those pointing to another site on the same C-block -- these patterns look like spam or attempts to game the system, and get penalized.
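To make that "natural growth" idea concrete, a toy heuristic (function name and every threshold invented for illustration, not anything Google is known to run) could flag a sudden burst of links that all share the same anchor text:

```python
from collections import Counter

def looks_unnatural(new_links: list[tuple[str, str]], days: int) -> bool:
    """new_links: (anchor_text, linking_site) pairs gained over `days` days.
    All thresholds are guesses, purely illustrative."""
    if not new_links:
        return False
    links_per_day = len(new_links) / max(days, 1)
    top_anchor_count = Counter(a for a, _ in new_links).most_common(1)[0][1]
    same_anchor_ratio = top_anchor_count / len(new_links)
    # Many links per day, nearly all with identical anchor text, is not
    # the pattern a site acquiring links naturally would show.
    return links_per_day > 20 and same_anchor_ratio > 0.9
```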
I don't think that it's "new domains" that are getting sandboxed. I am leaning toward the theories on sandboxing of new links (and maybe new optimization efforts).
I have a 3-year-old site that never ranked well in Google for any terms (and it didn't deserve to - very few links and less than 10 pages). I basically didn't update the site for a couple of years. Then around 3 months ago I started adding pages (the site theme stayed the same) and gradually got around 70 PR4+ on-topic links (half reciprocals).
Now it's a 60-page, textbook-optimized (Brett's guide) site with decent backlinks (35% varying anchor text, 20% deeplinks, only 3 from other sites of mine with no crosslinking). The industry is competitive, but no one is competing for my main 2-KW phrase. It ranks #2 in Yahoo and top 10 in the new MSN. However, in Google it is affected by all the symptoms described in this thread. It does bounce around wildly in Google on a daily basis for the 2-KW phrase, ranging from ~#60 to ~#150, but doesn't show up for any of the other 3 words in the title (or even for the full title). I have links from around 25 of the sites ahead of mine for this same 2-KW phrase. None of the other sites ahead of mine have even a single inbound link with this 2-KW phrase, so by normal (albeit old) reasoning, my site is definitely being held back by some kind of dark force ;o)
So. . . the age of the site doesn't appear to be a factor in itself, but rather new links and/or optimization techniques.
The drawback is that legit new sites are treated uniformly as potential spammers. They will have to earn their ranking by being patient. G's bet is that a legit site with good and unique content will have the patience to wait (and wade) through the sb.
With all due respect, this just does not compute. I am getting hoarse from repeating that there is absolutely no logic in keeping ALL new sites out of the SERPs. If this were so, then there would be a release date of, say, six months. Thousands, possibly millions, of people have had sites sandboxed for up to NINE months!
Do you honestly think that Google or anyone outside the cuckoo's nest would do this deliberately? What are the plans for release? One year? Two years? I don't think so.
Remember that if we consider that the Internet is less than ten years old and growing exponentially then we are talking about perhaps as much as 15% or more of the Internet being hidden by Google. Good corporate policy? I don't think so.
Let's just hope that the media gets involved. As I said earlier this is the only chance we have of learning anything about this. Whenever any theory as controversial as this has happened in the past GoogleGuy has come along and debunked it but not this time. Doesn't that tell you something?
So. . . the age of the site doesn't appear to be a factor in itself, but rather new links and/or optimization techniques.
I am willing to accept that, as it sounds plausible and would largely cause the same symptoms. In fact, methinks that in a delirium I posted in another thread about how great an idea it would be to crank up the weight of anchor text and at the same time delay the effect of links.
eZeB,
Some people here react allergically to the term 'penalty'. But if by penalty you mean that certain over-optimizations are ignored by the ranking algo, I am with you.
So, if the so-called sandbox is an anti-spam filter, it is, um, not effective. Now, all that said, I can also think of a number of sites using what are called "spam" techniques that are actually good, useful sites and legit businesses. I would guess their owners don't care about the arcane world of the G algo - they just think it is *gasp* advertising. And don't tell me that G is about to catch up with them. Some of the more prominent examples have been in their place for 2-3 years or more.
Remember that if we consider that the Internet is less than ten years old and growing exponentially then we are talking about perhaps as much as 15% or more of the Internet being hidden by Google.
Making guesses like that can be dangerous. For one, the growth of the web is not from new sites only; it is from new pages on old sites, too. Secondly, the sb effect is only apparent in competitive areas. The sb is by no means hiding content. In my spare time, I run a purely hobbyist site with some obscure but useful articles. I receive quite nice traffic from Google, although I have been changing domain names and URL schemes like there was no tomorrow.
Whenever any theory as controversial as this has happened in the past GoogleGuy has come along and debunked it but not this time. Doesn't that tell you something?
Uh, wait a minute! You've managed to manoeuvre yourself into a corner there. Yes, GG has debunked wrong theories in the past. The fact that he hasn't yet debunked the sb tells me that ... bingo!
I have a personal site registered in Dec. 2003, which probably got some pagerank by January/February 2004, and I just started actually doing any optimization for it about 4-5 weeks ago. Some pages are in the top 10 for allinanchor for competitive keywords, but are nowhere to be found for their searches. I'm willing to wait exactly one more PR update, before giving up and buying an older domain to transfer to.
And here comes the irony: I added 100 new internal pages to a company site about 2 weeks ago, for a domain that was registered in April 2004. This domain had not been showing any PR anywhere for the last 3 months because it was a dynamic site, until I recently changed the homepage to a static page. After that I added the 100 new pages and within DAYS the vast majority of the pages are top 10 for their phrases while being PR0.
As they say: "Don't quit your day job"!
If one has a new page on an old site and promotes it heavily with new links, what do people think the effect would be?
In all likelihood it will exhibit behaviour similar to a new site, minus a month or two. But remember, if that new page happens to be a page on a CNN or Stanford kind of site, it might rank within a couple of weeks, because the old links are doing it a BIG favour.
Mc
Making guesses like that can be dangerous. For one, the growth of the web is not from new sites only. It is from new pages on old sites, too.
I think claiming that it is "dangerous" is a bit strong :) This is just a speculative figure but remember that I used the term "as much as 15%". So I will stand by this.
Uh, wait a minute! You've managed to manoeuvre yourself into a corner there. Yes, GG has debunked wrong theories in the past. The fact that he hasn't yet debunked the sb tells me that ... bingo!
"Manoeuvred myself into a corner"? I am afraid that I don't get what you mean here? This problem just happens to be one of the most significant things to happen with Google since day 1 and they haven't been able to comment. GoogleGuy has obviously been silenced on this subject. If it were just the effects of an attempted clean up he would be all over this forum like a rash, "There is no sandbox", "Have a look at our guidelines.", etc, etc.
Bring on the media ;)
Then they will go with that until they finish their new, bigger database system with the new, faster Mozilla 5.0 crawler, etc...
I think it's all in the works right now. Just about 3-4 weeks ago Google spidered all my sites pretty deep. Even my "banned/blocked/pr0'ed/hijacked etc" websites got spidered just like they used to back in the good ol' days before all this sandbox stuff happened.
But I didn't see any of those new pages in this new 8 billion page index. As a matter of fact, nothing changed for my sites' rankings or traffic from Google.
So I expect to see a new update soon when they switch over to the new improved system.
What do you all think?
I think claiming that it is "dangerous" is a bit strong
Sorry for having used the word 'dangerous'. But let's not get distracted by rhetoric.
This is just a speculative figure but remember that I used the term "as much as 15%". So I will stand by this.
What do you say about my other points? You stand by your vague guess, but you don't respond to my rational claims against it. You said the G sandbox hides "as much as 15%" of the web. I responded that your figure can't even be approximately right because A) the web's growth can't only be attributed to the addition of new sites and B) the sb only affects competitive areas. What's your opinion on that? And what's your opinion on the -sdasadad effect?
GoogleGuy has obviously been silenced on this subject.
Has he told you so? I'd stick to the facts here. And the facts are that he (or she) has been silent. Why the silence? We don't know. But I admit that the silence can be interpreted my way (the sb exists) or your way (the sb is a technical problem). We manoeuvred you out of that corner. ;)
Anyway, I wish GG would say something ...
What do you say about my other points? You stand by your vague guess, but you don't respond to my rational claims against it.
As you say, we should not get distracted by rhetoric but this was my response ;)
Has he told you so? I'd stick to the facts here. And the facts are that he (or she) has been silent.
Well, amen to that :)
Please do that test and post the results.
Thanks.
One of my sites is about Widgets. 6 months back we added a few new pages about Midgets :), and there is no news of them in the SERPs; yet the page is indexed, PR 7, and features under a site: search.
However, pages added with Widget keywords came into the SERPs within 2 weeks.
So, I tried an experiment. I added the word Widgets to the existing title of the page. Voilà: the page was soon ranking well not only for the keyword string with Widgets in it, but also for the keyword string with Midgets. Somehow adding Widgets, the keyword under which the site was previously classified, got the page around the "sandbox".
This shows that Google has developed a machine-made directory of keywords, and that every site sits in one or more keyword categories.
It then follows that Google takes time to add a site to a keyword category. Once added, all the pages of the site will come under the primary results of the keyword, and ranking is done using traditional factors. Pages on the site lacking the keyword category will not feature until the site is listed for the additional keyword category.
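If that category theory held, a toy model might look like the following (the category table, site name, and function are entirely hypothetical; this just restates the Widgets/Midgets experiment in code):

```python
# Hypothetical keyword-category gate; nothing here is a known Google
# mechanism, it only restates the anecdote above.
site_categories: dict[str, set[str]] = {
    "example-widgets.com": {"widgets"},  # invented site, already classified
}

def page_eligible(site: str, page_keywords: set[str]) -> bool:
    """A page enters the main SERPs once any of its keywords falls into a
    category the site has already been admitted to."""
    return bool(site_categories.get(site, set()) & page_keywords)

print(page_eligible("example-widgets.com", {"midgets"}))             # False: held back
print(page_eligible("example-widgets.com", {"midgets", "widgets"}))  # True: title tweak
```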
This has been going on since February. It doesn't take more than nine months to build a new index, does it?
Google's current main index (and any others they may have public as supplements) is 32-bit, both in OS and hardware. If they switched to a 40-bit (5-byte) index (which would have a capacity of 256 times the current 4.2B size), then they would be best off running it on 64-bit hardware and software. This would be especially true if they intend to continue adding semantic indexing and ranking features, since the larger the index grows, the more calculation cycles those features demand. Moving to 64-bit would greatly reduce that processing time.
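The capacity figure, at least, is straightforward arithmetic:

```python
# Quick check of the docID-space arithmetic above.
cap_32 = 2 ** 32         # 4,294,967,296 -- the ~4.2B ceiling cited
cap_40 = 2 ** 40         # 1,099,511,627,776
print(cap_40 // cap_32)  # 256, i.e. 2**8, as stated
```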
I saw an estimate somewhere of what it would cost them to go fully 64-bit, and it came to about $10M USD, including the proprietary rewrite of the software they run, but I doubt the purchase would have happened prior to the IPO (at least not for the hardware). Also, if I were working on such a thing, I think I would put my efforts into the new 64-bit index and not put any more time than necessary into maintaining an old one that was just going to get thrown away as soon as the new one went online.
THERE IS NO SANDBOX!
Google introduced new algorithms in February, and these algorithms are tough, tough anti-spam algorithms. They are based on lots of factors like:
How quickly the links were amassed
Quality of links
How quickly pages were increased
Etc etc
The bottom line is that Google has dictated that all sites in the future will not rank well unless they behave the way a normal site would and unless they are well regarded by the Internet population.
But here's the rub: Google does not have an archived history of the building up of links and pages for sites that were already in its index, so it has to start afresh with this algorithm just for new sites, and give already established sites the score that they previously had (apart, of course, from the anti-spam algorithms it applied to existing sites in February - interlinking etc).
The result: existing sites carry on being rated well, and new sites have a mountain to climb to rate well. They are not sandboxed; they are just having to beat Google's algo from base zero, whereas existing sites are beating it from base 5, or 6, or whatever.
All IMHO of course! :)
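For what it's worth, that "base zero vs. base 5" idea can be sketched in a few lines (all names, the cutoff-date encoding, and the weights are invented; this is one poster's theory, not a known algorithm):

```python
# Speculative model of the theory above: pre-February-2004 sites keep a
# grandfathered baseline because Google lacks link-history data for them;
# newer sites must earn their whole score under the new factors.

CUTOFF = 2004.1  # roughly February 2004, as a decimal year

def site_score(first_indexed: float, link_history_score: float) -> float:
    grandfathered_base = 5.0 if first_indexed < CUTOFF else 0.0
    return grandfathered_base + link_history_score

# Identical link behaviour, very different outcomes -- which would look
# exactly like a sandbox without any explicit sandbox flag existing.
print(site_score(first_indexed=2001.5, link_history_score=1.2))  # 6.2
print(site_score(first_indexed=2004.5, link_history_score=1.2))  # 1.2
```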