Forum Moderators: Robert Charlton & goodroi
If your site is less than a year old you are likely sandboxed.
I can't believe most sites under a year's age are in some sort of penalty box. Google would be useless. So, I want to know:
1. Are all sites sandboxed, or do certain traits (like affiliate links, low content) trigger it?
2. How long does it last?
3. How variable is the duration?
4. How do you know your site is being sandboxed?
5. Does the effect taper off or is it a binary thing?
6. What gets you out of the sandbox? Is it merely time or do good links or whatever speed it up?
Thanks.
It might be that the backlinks themselves are sandboxed for a few months. About 3-4 months back I was adding backlinks regularly, until I stopped because I was getting nowhere.
One of my sites is sandboxed and has been at 400-440 in the SERPs for months. In the last 20 days I've seen it moving up every day by 5-10 places, and it's at 280 today. Using the -asdf filter, I see the site at #1.
Can someone please clarify the difference between the following:
1. A site that is at 200 in the SERPs; when the filter (-asasa) is applied, the site goes to #1.
2. A site that is at 200 in the SERPs; when the filter (-asasa) is applied, the site stays at 200.
Would option 2 mean that site is not sandboxed?
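As an aside, the comparison being described can be scripted. This is a minimal, purely illustrative Python sketch: the nonsense exclusion terms and their count are this thread's own folklore, not anything documented by Google. It just builds the plain and the exclusion-padded query URLs side by side so the two sets of results can be compared by hand:

```python
from urllib.parse import urlencode

# Hypothetical diagnostic from this thread: build two Google query URLs,
# one plain and one padded with nonsense exclusion terms (e.g. "-asdf").
# A big rank jump under the padded query is read by posters here as a
# symptom of a late-stage filter ("sandbox") being bypassed.

def google_query_url(phrase: str, exclusions: int = 0, junk: str = "asdf") -> str:
    """Return a Google search URL, optionally with N nonsense '-junk' terms."""
    q = phrase
    if exclusions:
        q += " " + " ".join(f"-{junk}" for _ in range(exclusions))
    return "http://www.google.com/search?" + urlencode({"q": q})

plain = google_query_url("indoor widgets")
filtered = google_query_url("indoor widgets", exclusions=15)
print(plain)
print(filtered)
```

Comparing where a given site ranks under each URL is then a manual step; nothing here should be taken as a documented Google interface.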
[cs.toronto.edu...]
i think the "sandbox" speculation doesn't make so much sense really ...
[edited by: zirak at 2:29 pm (utc) on Jan. 28, 2005]
OK well then how do you explain my site, which ranks in the top 10 on both Yahoo and MSN for a popular search phrase that returns 4 million plus results, but doesn't rank in the first 1000 on Google?
Now THAT makes no sense...
OK well then how do you explain my site, which ranks in the top 10 on both Yahoo and MSN for a popular search phrase that returns 4 million plus results, but doesn't rank in the first 1000 on Google?
They all use their own methods to rank sites - there's not really any reason why the top 10 for general search phrases should be the same across G, Y!, MSN, etc.
Maybe for your company name you'd hope they'd all get it right :)
"The key difference consists in the fact that we are only considering "expert" sources - pages that have been created with the specific purpose of directing people towards resources. In response to a query, we first compute a list of the most relevant experts on the query topic. Then, we identify relevant links within the selected set of experts, and follow them to identify target web pages. The targets are then ranked according to the number and relevance of non-affiliated experts that point to them. Thus, the score of a target page reflects the collective opinion of the best independent experts on the query topic."
For "broad queries" the algo kicks in acting the same way as the well-known "pagerank", but now involving factors such as "expert page selection" and "host affiliation". Take a close look at the document; it's pretty straightforward to me.
I think it's not a very "democratic" algo, but it definitely makes sense ...
[edited by: zirak at 2:31 pm (utc) on Jan. 28, 2005]
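To make the quoted scoring rule concrete, here is a toy sketch of the "non-affiliated experts" count the paper describes. Everything in it is a simplifying assumption: expert relevance is ignored (each expert counts equally), and "affiliation" is reduced to sharing a hostname, which is far cruder than the paper's actual host-affiliation rules. The data is made up.

```python
from urllib.parse import urlparse
from collections import defaultdict

# Toy sketch of Hilltop-style target scoring: a target's score is the
# number of distinct NON-affiliated expert hosts pointing at it.
# "Affiliated" is crudely approximated here as "same hostname".

def host(url: str) -> str:
    return urlparse(url).netloc

def score_targets(expert_links: dict[str, list[str]]) -> dict[str, int]:
    """expert_links maps an expert page URL to the target URLs it links to."""
    hosts_pointing = defaultdict(set)
    for expert, targets in expert_links.items():
        for target in targets:
            if host(expert) != host(target):      # affiliated links don't count
                hosts_pointing[target].add(host(expert))
    return {t: len(hs) for t, hs in hosts_pointing.items()}

experts = {
    "http://expert-a.example/links.html": ["http://widgets.example/", "http://gadgets.example/"],
    "http://expert-b.example/best.html":  ["http://widgets.example/"],
    "http://widgets.example/own-list.html": ["http://widgets.example/"],  # affiliated: ignored
}
print(score_targets(experts))
```

Under this sketch, a site with piles of links from ordinary (non-expert) pages scores nothing, which is the scenario discussed below for sites that rank on Y!/MSN but not G.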
but if a site is ranking well on MSN and Yahoo there is no logical reason why it should not feature in the top 1000 in Google.
Very true - but if Hilltop were in play and the site had bucketloads of links from non-expert pages, that might be enough to get it up the Y! and MSN SERPs whilst not seeing much result in G. Then there's (possible) link aging, different implementations of semantic indexing for on-page stuff, etc. I agree it's probably not logical, but it might be why.
One of my pages ranks #1 for "indoor widgets" out of over 340,000 results in G. My homepage, however, which is optimized for "widgets," is not currently within the top 1000 results out of 4,000,000 found by G. It HAS been within the top 30 on occasion, for up to a week at a time, but then it disappears again. This site is nearly two years old, but we've been adding inbound links and do have some dupe content issues. This page DOES come up #1 when I add -asdf x15 to the search query. Is it sandboxed, or is there another penalty in place do you think?
i think the "sandbox" speculation doesn't make so much sense really
Well, I won't disagree with what you're suggesting, i.e. that the Sandbox is just a result of the "Hilltop" effect, because I don't know what causes the sandbox. However, if that's true then you have two radically different sets of rules going on at the same time. I have sites stuck in the sandbox, and I have sites not in the sandbox. All that are stuck were launched after March 2004.
I can safely say some of the sites not in the sandbox would not qualify as "authoritative" under the paper you pointed out to us, and they rank extremely well. Don't get me wrong, they're nice sites, with good content, but I wouldn't define them as authoritative.
So, if Hilltop is the answer, then right now, if you search a keyword, the results you see are based upon an algorithm that combines:
All sites created prior to March 2004 without a “hilltop effect”
All sites created after March 2004 with a “hilltop effect”
To create one set of results that you see?
How do you mix, or complete the algorithmic process, when everyone is ranked on different rules?
(please; no “there’s two indexes”, the 1,000 results displayed is what you get)
I have 2 sites that seem to be stuck in this wretched thing and they're recrawled very frequently by GBot.
This is such a confusing situation...
There's a store whose owners I know well. They have a website, and have done everything "right": store name in the title, <h1> tag, keyword density, etc.
If I search for "store name city state" I can't find it in Google. And the name of the store is pretty unique.
What's the point?
For some reason they seem to apply the sandbox not during crawling or in the main algo. Maybe it's too difficult to do. As we see with this x13 nonsense, they seem to apply this filter afterwards. I would assume that's because it's easier for them, or they don't want to touch the algo at all because that's too dangerous/difficult.
A site that old and one that is jumping around like that might be experiencing other types of penalties.
It's really tough to know with 302 hijacks and duplicate content issues, because your site could be trading positions at or around #30 with some other site for that keyword. Hard to say without more info, but you might want to dig around and track which sites fluctuate in your SERPs, looking for patterns of appearance and disappearance. Also, do some checks with Copyscape for your content and go after 'em.
I suggest all sandbox websites get together and publicise the need to add a load of -asdf to every search result to get the REALLY good sites :)
There may be a solution for this. Anyone out here using Firefox? Sure!
We have this nice search box top right in Firefox. It should not be too difficult to set up a customized search engine like 'Google Unfiltered' with this 13x -asdf. As a lot of IT professionals use Firefox, this would spread the message to the right people too.
[mozilla.org...]
Simply clone the Google.com search and change the query to put the 13x -asdf before the real query?
[mycroft.mozdev.org...]
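The actual Mycroft/Sherlock plugin format is a small plain-text file and is not reproduced here; as a sketch, the only logic such a 'Google Unfiltered' plugin would need is the query rewrite itself - prepending the thread's 13 nonsense exclusion terms to whatever the user types. The term "-asdf" and the count 13 are this thread's convention, not anything official:

```python
# Sketch of the query rewrite a hypothetical "Google Unfiltered" search
# plugin would perform: prepend 13 nonsense exclusion terms before the
# user's real query, per the trick discussed in this thread.

def unfiltered_query(user_query: str, n: int = 13, junk: str = "-asdf") -> str:
    return " ".join([junk] * n + [user_query])

print(unfiltered_query("widget store toronto"))
```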
Answer: So G can claim to have 8 billion pages in its index!
As to why they're being indexed FREQUENTLY:
* those pages can still rank for less competitive terms
* lots of inbound links to the site
* looking for dupe content and other reasons to push your site down further...:-)
This hasn't changed at all; I see it week in and week out. I know very large, high-PageRank sites like WebmasterWorld get daily heavy spidering - the kind the sites I'm looking at always used to get.
I also know it's not due to last-modified issues. The larger sites I build are all run through PHP, so they don't return Last-Modified headers unless I've set that up, which I only sometimes do. In other words, Googlebot has no way to know whether the pages are new or not, but it still doesn't spider the site. It's the same if I do send a Last-Modified header, by the way.
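For readers unfamiliar with the mechanism being described: a dynamically generated page with no Last-Modified header gives a crawler's If-Modified-Since request nothing to compare against, so the server must return the full page every time; with the header, it can answer 304 Not Modified. A minimal sketch of that server-side decision, using only the Python standard library for HTTP-date handling (the function name and shape are illustrative, not any real framework's API):

```python
from email.utils import parsedate_to_datetime, format_datetime
from datetime import datetime, timezone
from typing import Optional

# Sketch of the conditional-GET handshake: without a Last-Modified value
# the server has no basis for 304 and must send 200 on every crawl.

def respond(last_modified: Optional[datetime], if_modified_since: Optional[str]) -> int:
    """Return the status code a well-behaved server would send."""
    if last_modified is None or if_modified_since is None:
        return 200                      # no freshness info: full response every time
    since = parsedate_to_datetime(if_modified_since)
    return 304 if last_modified <= since else 200

page_mtime = datetime(2005, 1, 10, tzinfo=timezone.utc)
crawler_header = format_datetime(datetime(2005, 1, 20, tzinfo=timezone.utc))

print(respond(None, crawler_header))        # no Last-Modified -> 200
print(respond(page_mtime, crawler_header))  # unchanged since last crawl -> 304
```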
This is one of the things that makes me suspect MSN may in fact start to do better than Google: they do appear to have as a goal Google's initial project - indexing all of the web, indexing it fast, and giving up-to-date results from all the web, not just from the domains more than a year old.
I don't think MSN is that close to that goal - I still can't use it for real searches - but that's typical for MS: put out anything to have market presence, then make it work sometime in the coming releases. I don't think Google is very hard to beat right now, I'm sorry to say. The solidity of results that made me a loyal user has been gone for over a year, and it's only because Yahoo just isn't able to completely index the web, all the web, all the pages, and MSN is still too raw, that Google is able to hold onto its market share.
Every time I see Google presenting a new clever idea while failing to stabilize its core business, I can't help thinking of AltaVista, which made the exact same mistake.
Remember, if any of you hold Google stock: the smart investors get out while the suckers have pushed up share values; when they drop, the suckers are left holding junk. Even if Google shares get pushed up further by more suckers buying into the name and the dreams, most serious investors should be cashing out now.
I had been thinking of posting along the lines of 2by4's comment about Google's core business. I've seen news about online books, and now something about videos - big deal, when the index results are kind of stale.
I wonder if the attention-grabbing novelties stem from changing attitudes after the IPO, and whether there are people within Google trying to push for sorting out the index, maintaining the lead, and getting ready for the time when there are a googol and more pages on the web.
(Also, are there hints the sandbox may fizzle, or suddenly stop, around time MSN hits prime time?)
Compare that to a domain I bought nine months ago which is still not in Google, despite having at least 100 backlinks and links from very high PR pages. I think it's all chance.
Site 1 (my nature photography site):
* About 3 yrs ago – Created site, but never optimized (only 2 or 3 backlinks).
* 12/03 – Redesigned with some optimization (ranked very highly in G for many good phrases).
* 02/04 – Site included in ODP (DMOZ)
* 02/04 - Applied for inclusion in Yahoo Asia directory (eventually declined because site was business-related).
* 03-04/04 – Began acquiring inbound links from photography/art-related directories and a few web sites (anchor text would be something like “key phrase by business name”).
* 04/04 – Changed web design (visual & structural). Also moderately used keywords in file and folder naming (e.g. key-phrase.html).
* 04/04 – Site dropped from Yahoo organic results.
* 04/04 – Site not ranking for any key phrases (including business name) in Google SERPs.
* 05-06/04 – Decided to wait it out.
* 06/04 – De-optimized (primarily dumping keyword naming of files/folders).
* 07/04 – Wrote Google (received standard response indicating no penalties).
* 08-10/04 – Worked on building textual content for many pages that featured pictures but few words.
* 10/04 – Yahoo re-includes site in their organic index.
* 10/04 - Wrote Google (received standard response indicating no penalties).
* 11/04 – Site doing well in Yahoo & MSN SERPs (not in top 1000 in Google except when searching business name)
* 12/04 – Many key phrases doing better in Google SERPs, but few better than top 500. Still not in top 1000 for phrases that are most relevant to site content and are found with most frequency in the site.
* 01/05 – Same as 12/04
Site #2 (fishing charter web site for a friend):
* 04/04 – Created site (included a few obligatory links to my photo site). Hosted by same company as my site.
* 04/04 – Acquired one reciprocal backlink.
* 04-10/04 – Never did well in Google SERPs – wasn’t picked-up by Y!
* 10/04 – Removed all pages from web, leaving only a couple of placeholder pages.
* 10/04 – Placeholder pages indexed by Yahoo and MSN (also Google).
* 12/04 – Published new version of site.
* 12/04 – Almost all pages indexed by Y,G & MSN.
* 01/05 – Good SERPs in Y & MSN, but even business name does not appear in G top 1000. Domain name & business name identical, which BTW is similar to a popular industry key phrase.
As you can see, I made a few mistakes along the way. But this site is very clean and is not competing against the hot key phrases. I have tested using all of the tricks listed in this thread (e.g. -asdf), and I really feel both sites are buried in the sandbox. Neither site is spammy, nor hosts ads, banners or pop-ups.
I tend to agree with siteseo’s statement in Msg #97 p10: “large influx of inbound links & massive changes to a website.” I believe a combination of these factors, with the possible inclusion of Google-bombish backlinks and whatever else, may trigger the sandbox. All of the above scream SEO.
In my mind, SEO is simply organization, and that’s the approach I took when rebuilding. If the page is about dogs, file names, titles, references, etcetera should indicate “dogs” – not cats or something really stupid like “page_1.” Although abused, this should only facilitate relevancy in the SERPs and one should be commended, not punished when site organization is honestly implemented.
Regarding duration of penalty: it's been 10 months, and I have yet to see the light. I don't believe it's a glitch or a conspiracy either. However, it's worse than any penalty or ban, and I believe Google is accepting this as collateral damage in their fight against spam. I would normally commend G for taking action against spam, but this is all very wrong.
After years of learning and hard work, I’ve finally built a clean, well-organized site, and Google put the squash on it. It is absolutely sickening, and I am already recommending that friends and family use Y and MSN, because if my site is out, so are many other good ones. I used to feel warm and fuzzy when thinking of Google, but it now feels more like nausea.
Obviously I share your frustration.
For the last ten months Google has visited our site every day and taken 1000-2000 hits, then done nothing to index what it's found. Frankly it's just pointless; we may just as well block Googlebot for all the good it does.
Meanwhile we have spent over £30,000 so far on AdWords. Due to lack of support we are now investing in marketing with MSN and Yahoo instead. We won't be spending a single penny with Google until we start seeing something in return.
No doubt Google won't care anyway; after all, we are a minnow, hardly going to make any difference to Google's vast income, but it's the principle of it. Why should we, or any website for that matter, support Google if they are not interested in even indexing your site?
Let them get on with it, I say. It won't be long before the general public catches on to the fact that new, modern sites are not indexed and only the same old stale results can be found in a search; ultimately they will pay the price.
I'm looking in from time to time, but I'm not holding my breath with Google. Once you're in the Google bin, I think you could be in it for years by the looks of things!
Good luck with your other websites anyway; at least MSN and Yahoo are doing a fine job of indexing sites. They may not take as many hits a visit as Googlebot, but at least they do something with what they take!
Kind Regards
RichTC
Meanwhile we have spent over £30,000 so far on AdWords. Due to lack of support we are now investing in marketing with MSN and Yahoo instead. We won't be spending a single penny with Google until we start seeing something in return.
If you're making money with AdWords, why would you throw that money down the toilet just to send a message to the Google search team (who probably couldn't care less whether you're advertising with AdWords)?
To use an old expression, that's like cutting off your nose to spite your face.