What does exist are filters.
What opposes those filters are good techniques and "trust" - one good member recently described it to me as "Trustrank".
An understanding of what these main filters are for, how Google applies them and the observed behaviour of Google in releasing them would be a good way for owners to better manage and refine their organic search techniques.
Maybe our good friends in the community could each select a topic, or several, in which they have solid experience and authority, and support it with a format that can be easily referenced. The most recent one has been largely contributed to by g1smd. Allow me to paraphrase [ and please correct me ] an example of how I think this would flow:
Duplicate Content Filter - incorrect linking
Applied: when internal links point to "/index.htm" or "/default.htm" when they should all point to "/"
Effect: pages unlikely to be indexed, badly suppressed results, PR applied to wrong or duplicate pages.
Time to restore: 2-3 months from when the fix is applied
Evidence: WebmasterWorld webmaster reports
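For anyone wanting to check their own templates for this, here is a minimal sketch of the link fix described above. It collapses a trailing index filename in an href down to its directory. The `INDEX_NAMES` list and the function name `canonical_href` are my own assumptions for illustration, not anything Google or g1smd has published, so adjust them to whatever default documents your server uses.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed list of default-document names; change to match your own server.
INDEX_NAMES = ("index.htm", "index.html", "default.htm", "default.html")


def canonical_href(href: str) -> str:
    """Return href with a trailing index filename collapsed to its directory."""
    parts = urlsplit(href)
    head, sep, tail = parts.path.rpartition("/")
    if tail.lower() in INDEX_NAMES:
        # "/widgets/index.htm" -> "/widgets/"; a bare "index.htm" -> "./"
        new_path = head + "/" if sep else "./"
        parts = parts._replace(path=new_path)
    return urlunsplit(parts)


if __name__ == "__main__":
    for link in ("/index.htm", "/widgets/default.htm", "/red-widgets.htm"):
        print(link, "->", canonical_href(link))
```

This only rewrites the links themselves. Whether you also redirect the old "/index.htm" URL to "/" on the server is a separate decision that the sketch does not cover.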
Duplicate Content Filter - Meta Data
Applied: when meta descriptions and titles are too similar
Effect: results show as supplemental and are generally suppressed
Time to restore: a matter of days, depending on the next few crawls
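A rough way to audit your own pages for this is to compare the title and description of every page against every other page. The sketch below uses Python's difflib for that. Nobody outside Google knows what "too similar" means to them, so the 0.9 threshold is purely an arbitrary assumption, as are the function name and the input format (a dict of URL to title and description).

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_meta_pairs(pages, threshold=0.9):
    """Return (url_a, url_b, ratio) for page pairs with near-identical metadata.

    pages: dict mapping url -> (title, description).
    threshold: assumed cut-off for "too similar"; not a known Google value.
    """
    def norm(text):
        # Lowercase and collapse whitespace so trivial differences are ignored.
        return " ".join(text.lower().split())

    flagged = []
    for (url_a, meta_a), (url_b, meta_b) in combinations(sorted(pages.items()), 2):
        text_a = norm(meta_a[0] + " " + meta_a[1])
        text_b = norm(meta_b[0] + " " + meta_b[1])
        ratio = SequenceMatcher(None, text_a, text_b).ratio()
        if ratio >= threshold:
            flagged.append((url_a, url_b, round(ratio, 2)))
    return flagged
```

It compares every pair, so on a site with thousands of pages it will be slow. For a quick first pass you could just group pages by identical title and only run this on what is left.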
How many other filters have you observed, what are their effects, what have you done to fix the problem, and what time to restore have you seen?
Let's say a commercial site about widgets has sectors for red, white and blue widgets etc.
In addition to "widgets" being a highly sought-after keyword, i.e. one with loads of AdWords, it might find that "Red Widget", "White Widget" and "Blue Widget" are also high on AdWords. It may even find that "Blue Pink special widgets" also attracts loads of pay-per-click interest.
In this situation I find that it's obviously going to take a good period to rank for the prime keywords, but the semi-prime ones are still going to take ages to rank and need lots of aged backlinks.
Even if the site is an authority on widgets and has been for a few years, when it adds a new sector page about "blue widgets" (i.e. a semi-prime keyword, related but not covered before), some sort of filter kicks in to ensure that the new sector pages won't rush into the top ten until some aging process has taken place.
If you look at the SERPs in high-AdWords areas you will find that the top ten sites listed will have had that page in Google for some time, i.e. 18 months plus.
Also, since the new Google infrastructure rolled out you don't see new sites spring up with high PageRank, like PR7 for example, even if they are well connected. I would say that for a site to get PR6 upwards now it needs aged backlinks on its side.
Don't think so, it's only more relevant in commercial areas IMO. The affiliate adverts were sector specific, all different, but did not affect the rest of the content on the pages involved.
When the % of pages with affiliate adverts was reduced the positions were recovering - I can only go by what I noticed.
It's been obvious for about a year that just because these 302s don't show up in the inurl: command in Google anymore doesn't mean the 302s have disappeared. Google just removed them from view.
I've been watching (and reporting) these sub domain spammers for the last month or so. I usually find them when searching for the domain plus a major keyword that site should be ranking for--nothing but scrapers ranking where the affected site should be.
I've checked the links in these sub domain scrapers and haven't found a link with a 302 on it yet.
Can someone explain how these spammers are 302ing sites and we can't see it?
Also, please note that assigning a high priority to all of the URLs on your site will not help you. Since the priority is relative, it is only used to select between URLs on your site; the priority of your pages will not be compared to the priority of pages on other sites.
I think we can discount this as being a factor influencing accelerated crawling, results improvement or otherwise, for all or any pages that may otherwise be stuck in limbo or a filter.
[edited by: Whitey at 7:36 am (utc) on Oct. 12, 2006]
The problem is in the 1500+ content pages.
Some pages are also PR4 (the home page is PR5), but the pages don't appear in Google results (even if I search for the exact title).
Everything seems normal except the Description and Keywords meta tags, which are always the same for these 1500+ content pages.
Why is practically my entire site (except the home page) not well indexed in Google?
I was wondering if this might better help the process of identifying "filters"
We have been in limbo for a long time and just noticed a few days ago some "weak and uncompetitive KW's" being restored to top positions, which relate to body content only.
Meta Titles and internal and backlinks have not come into play yet. So the rest appears to be still heavily filtered.
I'm not sure if this is a normal indexing/results pattern ( since our duplicate content fix of 8 or so weeks ago ), or a partial release from one or more filters.
This is where a page with a keyword phrase would ordinarily be in position "x" and now appears approximately 30 places below. Typically, a page phrase which was thought to be well supported and previously ranking at, say, No. 1 has now dropped to 31.
Several folks also claim to have identified the symptom like this:
Evidence of a -30 [ positions ] penalty
For evidence towards a -30 penalty -
Try an "almost unique" search from your site on Google that returns fewer than 30 results. For any such term, your site should always be at the BOTTOM of the SERP, or thereabouts.
and many folks speak of factors using "Trust" to break the filters:
One suggestion is:
Factors that most often cause this filter to be applied
duplicate content caused by bad content management system
big links campaign
errors in html
keyword anchor spam
Usually, you need at least 3 of these going on big time. I HAVE heard of people recovering, and it's by fixing these issues.
If your site is one big affiliate site, and has a massive duplicate anchor site map.
eg red widgets
greeny blue widgets
bluey green widgets
widgety widgida widgets
You've tripped a filter
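To see how heavily one word is repeated across the anchors on a page, something like the sketch below could be run over the home page or site map source. It is only an illustration using Python's standard html.parser. The class and function names are made up for this example, and how many repeats are "too many" is anyone's guess.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorTextCollector(HTMLParser):
    """Collects the visible text of every <a>...</a> on a page."""

    def __init__(self):
        super().__init__()
        self._in_anchor = False
        self._buffer = []
        self.anchor_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_anchor:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            # Join the pieces and collapse runs of whitespace.
            self.anchor_texts.append(" ".join("".join(self._buffer).split()))
            self._in_anchor = False


def anchor_word_counts(html):
    """Count how many anchors each word appears in (once per anchor)."""
    collector = AnchorTextCollector()
    collector.feed(html)
    collector.close()
    counts = Counter()
    for text in collector.anchor_texts:
        counts.update(set(text.lower().split()))
    return counts
```

Calling `anchor_word_counts(page_source).most_common(5)` gives the five most repeated anchor words, which is usually enough to spot a site map where every link carries the same keyword.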
Nippi then goes on to share his experience of a total recovery from the filters :
How to Restore a site from the -30 filter
It's difficult to quantify "did nothing" as you cannot actually "do nothing".
50 sites may have added links to you, which tipped you back over the not trusted/trusted limit.
Then in the "did little" category:
a slight content change may have fixed things.
a slight update may have removed a hidden link orphaned in code you were not aware of.
Yes, it's been going on for years. I had a site tank for 3 months (+30) 2 years ago; the issues then were:
a. added 300 recip links.
b. home page sitemap with 400 links in it, all with one word the same.
c. hiddenish text.
fixed all, added 100 more links, removed the ones where no link back, site recovered.
I am of the considered opinion, though I have no proof, that it is a breach of trust issue with Google. You turn the SEO up too far, and Google thinks you are trying to game it. Some of the things done may not even have been intentional, and likely the recent problems have come from Google turning the trust filter up a notch.
I don't think it's a case of having done one particular thing to trip the filter, it's an overall "you've tried to trick Google" filter.
I don't believe it's a manual penalty, though it may well be, as in all cases my penalties followed large link additions and email requests (within 6 weeks anyway), and one of these could have been sent to Google.
I addressed issues such as
accidental hidden links eg <a href=home.php></a>(in all cases i had these)
anchor text spam.(in all cases i had these, repetition of main keyword with different combinations, eg blue widgets.)
junk links(in all cases I had these, not off topic, just clearly not naturally added as added so many in one go.)
thin affiliates.(only in 1 case, most recent, other times, no affiliates.)
duplicate content(in all cases i had this problem, caused by faulty paging class in my cms)
I fixed the problem, and on the first two occasions, sites recovered in 3 months. 3rd occasion, I am still waiting at 80 days but keep finding more problems i did not previously see, so have not had a clean 3 month period to go by.
I suggest to anyone having problems.
1. validate your home page.
2. view source of home page, and copy code into dreamweaver. Save file, then search for ></a> to find accidental dead links.
3. Run gsitecrawler, it's a great duplicate content checker.
4. get rid of your junk(y) links.
5. get rid of your thin affiliates.
6. if one word appears in more than 20 anchors on your home page, get rid of the repeats.
7. have a look for anything else that you think might be tripping the "google does not trust this site" filter.
8. Don't panic and email abuse to google. it won't help.
9. Don't think your site won't recover, it will.
10. Don't do nothing and wait for it to recover. it might, but better to clean up your act.
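If you don't have Dreamweaver to hand for step 2, the search for ></a> can be approximated with a few lines of Python. This is only a sketch using the standard html.parser module. The names are invented for the example, and it treats a link as "empty" when it has no visible text and no child element such as an image, which is my own assumption about what counts as an accidental dead link.

```python
from html.parser import HTMLParser


class EmptyAnchorFinder(HTMLParser):
    """Records the href of every <a> that has no text and no child elements."""

    def __init__(self):
        super().__init__()
        self._open = False
        self._has_content = False
        self._href = None
        self.empty_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._open = True
            self._has_content = False
            self._href = dict(attrs).get("href")
        elif self._open:
            # Any child element (for example an <img>) counts as content.
            self._has_content = True

    def handle_data(self, data):
        if self._open and data.strip():
            self._has_content = True

    def handle_endtag(self, tag):
        if tag == "a" and self._open:
            if not self._has_content:
                self.empty_hrefs.append(self._href)
            self._open = False


def find_empty_anchors(html):
    """Return the href values of anchors that render nothing clickable."""
    finder = EmptyAnchorFinder()
    finder.feed(html)
    finder.close()
    return finder.empty_hrefs
```

Paste the view-source output of the home page into a string and pass it to `find_empty_anchors`. Anything it returns is worth checking by hand before you delete it.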
Some good questions Webmasters in this situation or similar situations
can ask themselves:
- Is my site providing unique and compelling content?
- Would most consumers find my site to be more useful than others in
- Am I abiding by all of Google's Webmaster Guidelines?
If the answer to all of those questions is yes, then it's wise to
doublecheck, then be patient / continue to develop your site. And
optionally file a reinclusion request, if you've fixed issues
associated with previous guideline violations.
If the answer to any of those questions is no, then... well, you have
your work cut out for you :)
One thing stands out to me - this is not one filter. It is a series of scores / penalties that need to be satisfied to release a site through results thresholds.
The old "Sandbox" filter is just the same.
Filters can be applied manually or via an algorithm
Much has been said about this by Matt Cutts in reference to "flagging sites" for manual observation. Typical triggers for this might be:
-Large site launches
-Sudden traffic increases
Site criteria that suddenly change, i.e.
-Duplicate content recognition
- plus others
How this gets removed is subject to the above conceptual criteria and guidelines by a Google operator. The chances are that this is a subjective assessment within some written guidelines [ IMO ]
Tedster's comments here:
Adding Value to a Site - Thin and Fat Affiliates [webmasterworld.com]
To my memory, this term began flying around with intensity around June 2005, at the time that the training guidelines document Google was giving to human evaluators was leaked.
and talked about here:
Google Training Document [webmasterworld.com]
The bottom line is [ and to state the obvious ] that sites are subjected to filters triggered:
-Manually [ when flagged by an algorithm or possibly a SPAM report ]
Those filters are reversed:
-Manually [ via a reinclusion request ], with a compelling representation of why the site should be released.
Statements to Google like:
- We have done everything
- Everyone says my site's fine
will likely not work
If you want to have your reinclusion request stand a better chance, it's important to restrict your communication and build it on facts [ IMO ]
We did x , y , z - be specific
Those facts must relate to the Google guidelines and Adam's comments at the top of this post - IMO
I think this is fairly simple.
The problem is in webmasters' lack of detailed knowledge of those "fixes" and the difficulty of communicating with Google's quality assurance operators at the "Reinclusion Desk" in a manner which is simple and brief. IMO