Forum Moderators: open
That is to say, should one be checking to see if the sites are out of the sandbox regularly or only when they know there is a major Google update? :)
Thanks
Mc
I still cannot understand how anyone would suggest that this is deliberate. Sure it stops new spam from appearing. Obviously it does because it stops new EVERYTHING from appearing.
I also cannot understand where the media are on this one. How can something as radical as this be going on at Google without media coverage? We may never find out the truth unless this starts generating adverse publicity.
I remember the thread about an Aussie journalist being saved by Google in Iraq, because the terrorists traced his roots to Australia. And someone said, "Thank God, that site was not Sandboxed!" Very apt and relevant.
Some of my observations on this so-called sandbox, from first-hand experience -
- Unique terms without any competition: rank well irrespective of whether the site is new or old.
- Phrases that combine 2 or more generic words, where no one searches for those generic words together (often company names that are a combo of generic words, but with result counts in the vicinity of 1m+): Old sites rank pronto; new sites take longer than expected to rank, but these are the terms new sites rank for first. The SERPs are largely filled with larger sites that mention these generic terms scattered across their pages.
- Moderately competitive terms: terms that require a couple of backlinks with anchor text to rank, such as Red Widgets City, where the daily search volume is perhaps 5 to 20: Older sites with decent link popularity rank pronto, but newer sites, even with double as many links as the older sites, will rank 100+.
- Competitive terms: The biggies. Even the older sites will require further relevant link popularity, and exhibit similar behaviour to new sites. They won't rank for a couple of months, but differ in that they rank at least within 6 months.
This gives reason to believe it is the links that are sandboxed, not the site as a whole. Since older sites already have old, non-sandboxed links, they rank relatively faster than new sites, whose links are all new anyway.
The topic of this thread (it was started by me btw) puts to rest any further doubts. If there were no sandbox (of links), then sites would climb from oblivion to at least the top 100, if not the top 10, within a few months, one by one, since Google does a rolling update. But that is not the case. People report that their sites come out of the sandbox during one major update. They go from oblivion to some visible presence for competitive terms, and from 100+ to the top 10, all at once. They don't traverse the middle path at all. If they did, that would be natural for a rolling update.
P.S: I am not talking about sites that have bought 1000s of links in one shot, but links achieved regularly.
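To make the "links are sandboxed, not the site" idea concrete, here is a minimal sketch of how such a delay would behave. It is purely illustrative: the function name, the 180-day hold period and the per-link weights are assumptions made up for this example, not anything Google has confirmed.

```python
from datetime import date

# Hypothetical hold period: links younger than this contribute nothing.
HOLD_DAYS = 180

def effective_link_score(links, today, hold_days=HOLD_DAYS):
    """Sum the weights of links that are old enough to count.

    links: list of (first_seen_date, weight) tuples.
    """
    return sum(
        weight
        for first_seen, weight in links
        if (today - first_seen).days >= hold_days
    )

today = date(2004, 12, 1)

# Old site: 5 links, all discovered well over 180 days ago.
old_site = [(date(2003, 6, 1), 1.0)] * 5

# New site: twice as many links, all discovered in the last 3 months.
new_site = [(date(2004, 9, 1), 1.0)] * 10

old_score = effective_link_score(old_site, today)
new_score = effective_link_score(new_site, today)
```

Under these assumptions the new site scores nothing despite having double the links, and its whole batch of links would start counting at roughly the same time, which fits the "oblivion to top 10 all at once" reports.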
Google assigns PageRank according to backlinks, and each backlink is weighted according to its own PageRank. So far, so good. But what if Google, with all the data it has collected over the years, decided to plot a "natural growth in PageRank" curve for a typical, non-spammy web page? And what if they determined that spammy sites tended to exceed this natural growth pattern?
Then they could assign real PageRank (as opposed to the toolbar PageRank) only up to the point where the backlinks do not exceed this growth rate. New pages would tend to exceed this threshold if they are link-optimized; i.e., they're growing backlinks at an "unnatural" rate. Old pages, which first appeared in Google's index many months ago, might not have this problem because the threshold is determined by calculating the average over time for that page.
It would not be that much trouble for Google to save real PageRank calculations for each page they index, and build up a history for that page. GoogleGuy, on another forum, has admitted that they have internal access to all the backlinks for every page at the Googleplex, even though they choose not to show them all with the "link:" command.
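As a rough sketch of what I mean by a growth threshold: credit a page only for as many backlinks as a "natural" growth rate would allow for its age. The function name, the rate of 10 links per month and the use of raw link counts instead of weighted PageRank are all simplifying assumptions for illustration; nobody outside Google knows whether anything like this exists.

```python
def credited_backlinks(raw_backlinks, age_months, natural_rate=10):
    """Cap the backlinks that count at natural_rate links per month of age.

    natural_rate is a hypothetical figure standing in for the
    "natural growth" curve; age_months is the time since the page
    first appeared in the index.
    """
    cap = natural_rate * age_months
    return min(raw_backlinks, cap)

# Old page: 100 backlinks built up over 12 months, under the cap of 120.
old_page = credited_backlinks(100, 12)

# New, link-optimized page: 200 backlinks in 2 months, capped at 20.
new_page = credited_backlinks(200, 2)
```

With these made-up numbers the old page keeps full credit for its 100 links, while the new page is credited for only 20 of its 200 until it ages.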
It's a no-brainer to do something like this if you're serious about spam. It would be a lot more elegant than what they did one year ago with the Florida filter. The problem with Florida was that they tried to do an instant fix by suddenly plopping in a real-time filter. It didn't work well because by then the entire PageRank infrastructure behind it was already corrupted by spammers.
What I don't understand is why blogs continue to break all the rules and rank so well. Maybe they are handled separately in some sort of "freshness" equation. This might be more acceptable if those blog pages would fade in rank a lot more quickly once they appear so prominently. But the blog advantage for a typical blog page seems to go on for months or more at Google.
Imagine it's the year 2003 and you're a search engine. Imagine you want to go public next year. Imagine your algo is mainly based on link popularity but your serps are swamped by link factory spam. What do you do? You delay the effect of links.
... but only for new sites?
Why allow the existing spammers to merrily continue generating spammy new pages?
Existing spammers were filtered by the C-class IP address penalty or were sorted out manually, I suppose. The typical spammer built a network of sites on generic domain names, cross-linked them like crazy, enjoyed the traffic as long as possible, got caught, was penalized, moved on and started from scratch somewhere else.
I just released a new site (my first in about 6 months). I pretty much expected it to do nothing for a few months based on the "sandbox" theory. Now, within 2 weeks, 55,000+ pages are fully indexed and ranking well in Google. The search terms are "deep", so that may explain the good rankings. The domain is new, registered about 4 months ago. The site is pulling significant traffic (for a new site).
I pretty much assumed that, with the latest index update announcement, they were letting just about anything in to get the numbers up.