This 195 message thread spans 7 pages.
|Google's 950 Penalty - Part 9|
< Continued from: [webmasterworld.com...] >
< related threads: -950 Quick Summary [webmasterworld.com] -- -950 Part One [webmasterworld.com] >
|That's because we are shooting in the dark |
We really aren't shooting in the dark. We are shooting at dusk. We can see a fuzzy image of what is out there. Sometimes when we shoot we hit the target and other times we can't no matter how hard we try.
But we waste our time when we start shooting at theories like
- Google is doing this so more people will pay for AdWords
- Google only hits commercial sites
- If you have Google Analytics you will be sorry
- Only sites doing something illegal are hit by -950
- It's because you have AdSense on your site
- Scraper sites are doing this to us
It goes on and on.
Is it because the phrase based theories are not an easy answer? It does take a lot of work to figure out why you might have been 950ed and sometimes you just can't find the answer. But I still believe that most 950ed pages have been caught in an imperfect phrase based filter.
[edited by: tedster at 9:14 pm (utc) on Feb. 27, 2008]
So all pages have to be like Wikipedia, but not too much like Wikipedia unless you are one particular company that gets hardlinked?
"Prior to modifiying the text, it contained what one might consider to be a sufficient amount of synomymn and appropriate usage vocabulary..."
That certainly can get you in trouble. It seems though some people are getting the wrong idea. You shouldn't be thinking in terms of "deoptimizing", but rather thinking in terms of "deoptimizing for multiple related words and phrases".
Things like using "Webmasterworld.com" and "Webmaster World" on the same page especially with both in anchor text are dangerous.
Completely natural is getting penalized along with those pretending to be natural.
|Aside from scraping and cloaking to regain the ranking of a site, there must be a legitimate way that naturally written content can come back. |
The boilerplate man I am watching seems to get first 950ed and then pops back.
25 keywords in Keyword tag
Definition tag: Wikipedia snippet
SAME boilerplate bla
Added to that: eBay pop-ups and non-AdSense text links.
If the 950 filter is phrase based, why first penalise him and then pop him out again? The Wikipedia similarity seems to save him from the 950. Everything on that page besides the short WP snippets and the minuscule forum inserts cries spam, SEO, and non-user-friendly.
Unless the filter is such that everybody gets 950ed and then they get reranked based on whatever is en vogue today in the Wikimountain View.
But many sites don't get 950ed so, I dunno. My English site is increasing in traffic; the German one has dropped again. Besides language and PR, the main difference is that I don't work on it.
I’ve read and taken part in this thread with interest, but it’s now becoming a wild guessing game.
Let’s just take this down to the absolute basics and start with the obvious facts:-
1. Google doesn’t like its serps gamed in any way
2. Google is in business to make money
3. Google still uses Page Rank (as part of its formula) to determine value
4. Google doesn’t like paid links that could game its search
5. Google has introduced everflux
6. Google now includes some kind of Trust Rank Formula
7. We know for a fact that numerous different aspects can trip the 950 penalty
The point of restating these basics is that IF you, say, reduce a page’s keyword density or change its title tag, H1 tags, etc., it could look to Google as if the page is being gamed, i.e. tweaked to rank better in the serps rather than being a page about the subject that exists only to benefit the end user and should rank high naturally.
Tweaking the page may well make the page rank better (in normal circumstances) by being better optimised or de-optimised, BUT as a result of doing so the page loses trust rank, which now kicks in and trips the 950.
If a site tweaks loads of pages, could the whole site then lose its Trust Rank?
Recently I tested a few things on a few different sites and noticed that where a page was “Perfected” it was either falling into the 950 for a period of time or was ranking 30+ rather than 1-10 where the page should be, i.e. out of one filter and caught by another?
Also, we do know that some sites that have done nothing have come out of the 950+ for some keywords, which again makes me think that they have earned more "trust" by not being tweaked and that this in turn resulted in them coming back. This has to be relevant to sites that are on the "Edge" of the 950 filter trip?
So to conclude, I think that a lot of this is down to the introduction of “Trust Rank”. Google recently devalued a number of sites’ “page rank” levels by 1, and this in turn could have resulted in a site’s automatic loss of some of its “trust rank”, esp. if the site was, say, on the edge of being a PR4/5, PR5/6, PR6/7, etc. That, coupled with some other known SEO factors, was enough to trip the 950 filter when Page Rank levels were adjusted (probably back in November).
In other words, the sites lost some “trust” following the change in the “Page Rank” formula, and a webmaster making an immediate change to the site’s pages for SEO reasons ensured that its “trust rank” level remained low in the short term, keeping it back in the serps once all the other algo factors were slowly phased in.
None of us has a perfect answer to this 950 Penalty, but “Trust” must play a part in it from what I’m seeing.
No, loss of trust is one thing not involved in 950.
Two common pages hit by 950 are powerful, genuine pages on domains, and totally fake/hacked pages on very trusted domains.
Lack of trust will just get you a lack of scoring, not a penalty. High scores cause problems, meaning essentially that appearance of significant trust is something that raises a 950 red flag.
For all the non-believers out there, here's an example just how psychotic this penalty is.
I have one directory containing 200 pages.
Of those 200 pages, 60 rank #1 and 140 rank #950.
Read that again: 100% of that directory's pages rank either #1 or #950 --nothing in between.
There is no difference in the site linkage of those pages.
There is no difference in the inbound linkage of those pages.
There is no difference in the outbound linkage of those pages.
There is no difference in the PR of those pages.
There is no difference in the low-level on-page stuff <h1>, <b>, etc...
There are no recips.
There are no "links" pages.
There are no bought links.
There are no paid reviews.
There are no blog-spam links.
There are no guestbook-spam links.
The site has been online since 400 baud modems were hot.
Yes. The re-ranking filter is definitely either too tight, or is being based on incorrect data, which causes otherwise legit sites to get slapped.
I agree with you on the trust issue; sites don't seem to be running into this problem until they are really trusted...which is the scariest part. I still don't think I'm describing the de-optimization correctly; the prior form of the text was written by a copywriter (as all our sites are) and looked very natural (it was about KW1, but made use of other phrases one might expect to see on such a page). The de-optimization simply removed most references to KW1 and those expected phrases, in an attempt to see if it altogether removed the re-ranking [my theory there was that if the text was largely the same, but the focus was no longer to rank on a certain phrase, the site might go from 950 to say around 200-300...it didn't budge at all though]. The next test for that particular site is again content based, but to a more interesting level. We're pulling through 1000 sites for that query and doing a 1st order co-occurrence, making sure to score which sites used which phrases, and will play with the content to remove any phrase NOT mentioned, as well as include some that might not have (such as negative points of view).
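For anyone who wants to run a similar first-order co-occurrence pass, here is a minimal Python sketch. It only counts which candidate phrases appear together in the same document and which documents use which phrases. The function names (`cooccurrence`, `unused_by_others`), the plain substring matching, and the sample data are all assumptions made for illustration; this is not the tooling described in the post above and certainly not how Google scores phrases.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs, phrases):
    """Count phrase usage across documents.

    docs    -- mapping of a document name to its text (hypothetical input)
    phrases -- candidate phrases to look for (hypothetical input)

    Returns (doc_freq, pair_freq, usage):
      doc_freq[p]       number of documents containing phrase p
      pair_freq[(p, q)] number of documents containing both p and q
      usage[name]       sorted list of phrases found in that document
    """
    doc_freq = Counter()
    pair_freq = Counter()
    usage = {}
    for name, text in docs.items():
        lowered = text.lower()
        # Naive case-insensitive substring match; no stemming or tokenizing.
        present = sorted(p for p in phrases if p.lower() in lowered)
        usage[name] = present
        doc_freq.update(present)
        pair_freq.update(combinations(present, 2))
    return doc_freq, pair_freq, usage


def unused_by_others(own_text, doc_freq, phrases):
    """Phrases that occur in own_text but in none of the counted documents."""
    lowered = own_text.lower()
    return [p for p in phrases if p.lower() in lowered and doc_freq[p] == 0]


if __name__ == "__main__":
    sample_docs = {
        "site-a": "Blue widgets and widget repair",
        "site-b": "Widget repair tips",
    }
    sample_phrases = ["blue widgets", "widget repair", "purple widgets"]
    freq, pairs, used = cooccurrence(sample_docs, sample_phrases)
    print(freq, pairs, used)
    print(unused_by_others("Purple widgets for sale", freq, sample_phrases))
```

In practice the 1000 result pages would be fetched and stripped to text first, and the phrase list would have to come from somewhere (n-gram extraction, for instance); both steps are left out here.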
One thing that I thought of this morning that would throw a wrench into the re-ranking viability is the notion of intent. Informational sites often look very different from sales sites, with a wide spectrum in between; the same goes for pro and con sites of certain phrases. How can an algorithm expect a site promoting Widgets to display content on "Why you shouldn't buy a Widget"? Just a thought.
Trust rank in relation to how big / desirable the keywords are?
In your case JK, is it not possible that the directory pages that are 950 are for more desirable keywords? I.e. your pages need a higher level of “trust” to now rank for these keywords?
I make this comment because there has to be some reason why the algo would trip certain pages into the 950. I'm convinced, despite some of the sceptics here, that this filter is all about increasing revenues by making more sites purchase AdWords.
The fact that your directory features for some areas and not others has to be relevant to the keywords you are trying to target.
|In your case JK, is it not possible that the directory pages that are 950 are for more desirable keywords? I.e. your pages need a higher level of “trust” to now rank for these keywords? |
This was fact with my domain which was 950ed.
My site came back with all keywords as if nothing had ever been wrong with it (it has held its former positions for ten days now).
I removed a deeplink from a forum (now the link is pointing to the domain instead, and the anchor text is similar to the one used by DMOZ), I added some new pages...
That's all. I can only guess whether these changes were responsible for the return of my site, since it came back twice before without changes. I'm not even sure that it won't disappear again, since this has happened too.
Re-ranking may be the mechanism that sends a URL to 950, but it seems to me that the same mechanism could be used for a variety of purported "sins".
We have 9 threads worth of discussion here with minimal progress. That may be due to thinking that this is "a penalty" when it's really "a way to penalize" -- meaning there are many -950 penalties, and what fixes one situation can't touch another situation because the trigger is some other factor.
|sites don't seem to be running into this problem until they are really trusted...which is the scariest part. |
I think this aspect has been forgotten in this discussion. Is it still proving true? Early on it was basically pages that had ranked on the first page or two, sometimes for years. Then they suddenly plunged to -950.
I'm wondering if some other pages are getting into the mix now. Do some of you have new sites? Or sites/pages that never ranked that well that are now 950ed?
Our sites went -1 PR across the board recently, but our serps went up 10-20 places and for the majority of keywords we are in the top 10. So TBPR has nothing really to do with the serps, and your theory linking the -1 PR drop to trust rank is not valid.
|is it not possible that the directory pages that are 950 are for more desirable keywords? |
Nope. That's the first thing I thought of, but the pages that rank #1 are actually some of the most desirable. (thank goodness)
|I'm convinced, despite some of the sceptics here, that this filter is all about increasing revenues by making more sites purchase AdWords |
|there has to be some reason why the algo would trip certain pages into the 950 |
I'm really beginning to wonder. I've had a sneaking suspicion all along that this -950 "penalty" is not a penalty at all but rather an unintended consequence of other unrelated processes within the scoring process. There is no motive for Google to do what appears to be occurring. If a page is crap, it should be flushed completely, not placed at 950. IOW, for what reason would Google instruct their algo to intentionally place seemingly random pages from a well-respected site precisely at number 950 over and over and over, while leaving other pages from the same site at #1?
Over-optimization? Then how does one account for pages that are still at #1 which have the exact same structure?
Thin pages? Then how does one account for unique 4,000-word authoritative articles that are also hit?
And if it is a "penalty," why do some pages recover with no changes to them? Where did the penalty go?
I don't buy any of the theories put forth so far, except maybe the Phrased-Based Re-ranking one.
|...That may be due to thinking that this is "a penalty" when it's really "a way to penalize" -- meaning there are many -950 penalties, and what fixes one situation can't touch another situation because the trigger is some other factor. |
This makes sense. What helped me understand "a penalty" versus "a way to penalize", as tedster puts it, is this patent:
DOCUMENT SCORING BASED ON TRAFFIC ASSOCIATED WITH A DOCUMENT [appft1.uspto.gov]
Read the "Description" section of the patent - I am convinced (at least at the moment) that G's main search algo operates on similar principles.
For example take a look at:
| In addition, or alternatively, search engine 125 may monitor the ranks of documents over time to detect sudden spikes in the ranks of the documents. A spike may indicate either a topical phenomenon (e.g., a hot topic) or an attempt to spam search engine 125 by, for example, trading or purchasing links. Search engine 125 may take measures to prevent spam attempts by, for example, employing hysteresis to allow a rank to grow at a certain rate. In another implementation, the rank for a given document may be allowed a certain maximum threshold of growth over a predefined window of time. As a further measure to differentiate a document related to a topical phenomenon from a spam document, search engine 125 may consider mentions of the document in news articles, discussion groups, etc. on the theory that spam documents will not be mentioned, for example, in the news. Any or a combination of these techniques may be used to curtail spamming attempts. |
Look at the bolded sentences - they can certainly explain some of the behaviors that are observed. There are other descriptions of how ranking/penalties may work. Read the patent, and if you believe that it contains a bunch of useful nuggets, let the info sink in, and read it again.
It might even provide an explanation of what a "one year sandbox" is/was - depending on your interpretation :)
Or I just might be completely off-base...
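To make the "maximum threshold of growth over a predefined window of time" idea from the quoted passage concrete, here is a minimal Python sketch of a growth cap. Everything in it is an assumption for illustration: the function names, the 10% default, and the notion of a single numeric score where higher is better. The patent text quoted above does not give a formula, and nothing here is known to match what any search engine does.

```python
def capped_score(previous, raw, max_growth=0.10):
    """Limit how far a score may rise in one update window.

    previous   -- score published in the last window (hypothetical)
    raw        -- newly computed, uncapped score (hypothetical)
    max_growth -- allowed fractional rise per window (made-up default)

    Drops pass through unchanged; only sudden spikes are held back.
    """
    ceiling = previous * (1.0 + max_growth)
    return min(raw, ceiling)


def publish_series(raw_scores, max_growth=0.10):
    """Apply the cap window by window to a sequence of raw scores."""
    published = [raw_scores[0]]
    for raw in raw_scores[1:]:
        published.append(capped_score(published[-1], raw, max_growth))
    return published


if __name__ == "__main__":
    # A page whose raw score jumps fivefold overnight, e.g. after a burst
    # of new links, only climbs gradually in the published series.
    print(publish_series([1.0, 5.0, 5.0, 5.0], max_growth=0.5))
```

Under a rule like this, a page with a sudden jump in raw score would lag behind for several windows, which is one way to read the "sandbox" remark above.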
The whole thing sounds more and more like chaos, the mathematical one. Increase the dynamics of a system, ergo everflux, Big Daddy, and the system runs into chaos, if it's discrete.
Or simpler the more you perturb the system the more unpredictable it becomes.
When the system was updated every three months it did its reranking and that was it. If they do it daily now, there seem to be unstable areas.
I mentioned the danger of this months ago in one of the Big Daddy threads.
Wouldn't a value in a formula that becomes very small suddenly flip to 0, i.e. 0.00000000000000000000000000001 and then 0 or NaN? That would have such a drastic effect. But maybe it's too simple a bug to fix for the "Megabrains" at Google. Well, it depends how big the whole algo construct is now. If any of the programmers was lazy and swallowed that exception, it might never be reported, and across billions of pages and thousands of computers it's maybe hard to trace.
Pure guessing, of course, but it seems such a random, binary switch either, hop or top.
If your algo works fine in 99.9% of all cases would you actually bother to fix it? Took them ages to fix the Googlebombing.
So I add to the already confusing situation the possibility of a bug or a side effect of a system whose dynamic was increased.
Happy to hear any counter arguments of course.
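The "very small value flips to 0" scenario guessed at above is ordinary floating-point underflow, and it is easy to reproduce. The Python sketch below is purely an illustration of that numeric effect with made-up factors; the function names are hypothetical and nothing here says Google's scoring actually multiplies factors this way.

```python
import math


def product_score(factors):
    """Multiply factors directly; can underflow to exactly 0.0."""
    score = 1.0
    for f in factors:
        score *= f
    return score


def log_score(factors):
    """Sum logarithms instead; stays finite for the same inputs."""
    return sum(math.log(f) for f in factors)


if __name__ == "__main__":
    # Twelve factors of 1e-30 have a true product of 1e-360, which is
    # smaller than the smallest positive double (about 4.9e-324).
    factors = [1e-30] * 12
    print(product_score(factors))   # silently becomes 0.0, no exception
    print(log_score(factors))       # a finite negative number
    # NaN can appear just as silently once a zero meets an infinity.
    print(0.0 * math.inf)
```

Note that the underflow raises no error at all, so there is not even an exception for a lazy programmer to swallow: a score that should be tiny but positive simply becomes zero.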
Yes, that could be true. We performed very well for 3 months, so we must have gotten loads of backlinks. People wanted to advertise with us, users stayed for longer and longer looking at more pages, and we installed new hardware so the server was faster and the users got a better, faster service. A success penalty. Yet Mr superspammer springs back; maybe he didn't get enough backlinks. But why first penalise him and then retract that statement? This site should trip about every filter in the book.
"IOW, for what reason would Google instruct their algo to intentionally place seemingly random pages from a well-respected site precisely at number 950 over and over and over, while leaving other pages from the same site at #1."
Actually that is what helps make perfect "sense" of what they are trying to do... except we know the 950 is wrongly applied randomly in some cases. On the other hand, Google is (even if not related to 950) 100% certain to be TRYING to discern the hacked .edu spam from the legitimate pages on the hacked .edu domain.
Suppose you knew 1% of all pages on the most trusted sites on the Internet are hacker spam. Naturally Google will try and detect those 1% of the pages and penalize them, while leaving the #1 and top ten rankings of the domain alone.
So if you look at this one example, which not coincidentally (hacked spam) has been the #1 pollution problem of the results for quite some time (less so now)... then the philosophy of the penalty makes sense.
Of course the problem is they have done a very poor job identifying the 1%, with a lot getting through and a lot of the 99% being misidentified as being part of the 1%.
The hacked spam problem is significantly less now, and it seems the number of things wrongly penalized to 950 is also down, so it seems they are getting better, but still making a lot of mistakes both ways.
It's hard to believe that a search engine with Google's expertise could mistake a page with an outbound link to "Sam's Barber Shop" for a hacked .edu page with casino links, but they are obviously making some sort of mistake, and I have nothing better to offer in the way of an explanation, so I'll keep an open mind. It just seems like a long-shot to me.
BTW, are those hacked edu pages actually "hacked" or is someone paying to have those pages up?
|I'm wondering if some other pages are getting into the mix now. Do some of you have new sites? |
I have a six-month-old site about a 1950s doowop group I shall call "The Widgets" - reasonably niche, very authoritative (I have access to their archives), no advertising, optimised in the traditional way, not many backlinks yet.
A search on "The Widgets doowop" or "The Widgets [member]" - or other third terms - gets a top ten result.
A search on "The Widgets" gives no results - not even a 950, no pages whatsoever.
Wikipedia is top, and now that I have corrected all the mistakes it may satisfy those who like to plumb the shallows.
There are many factors at play and I am no expert so I will not speculate on the reasons.
But I shall keep reading these threads assiduously...
Lol I found a totally legit commons.wikimedia.org page 950ed ... :) Then a famous German newspaper and so on.
Even supplemental results from a domain are 950. They really got it right.
My company has, through hard work, built up a site that ranks excellently on Google. I have recently built a company website and linked from the index page (only) of the successful site (via the copyright link) to my company site. For some reason Google has stuck my new site at the bottom of the SERPS for any of our search terms. Is it so wrong to publish a PR5 link from the copyright on the homepage of the website that the company owns to its own website? Is this considered spam?
The search for the company name puts us on the last page, how crazy is that for a clean site?
incywincy, if you are right about the cause of your -950 then you have a great chance to test - remove the link and see if your rankings come back. If they do, that would be a valuable bit of information for all of us.
Interesting, I was just wondering whether the location of the link was a factor in this...
Two separate tests I did showed that in one instance a site was EOSed for a phrase that was in the title, h1, alt, etc., but was not EOSed for a synonym phrase that was not included in those on-page elements. In the second test, the reverse occurred, with a site getting EOSed on a phrase that didn't occur in the elements, but was fine for the phrase that did occur in the elements. Everything else about these sites was held as constant as possible, for control purposes.
I'm now conducting a test on location of links -- it has very disturbing implications for people that want to employ some SEO sniping. Footer links are not looking good right now, from the preliminary results.
Has anyone been able to recover from this yet and actually feel it was from on-page changes?
I am wondering if any of these sites offer multiple products and/or services. Given that many MFAs also have large (usually extreme) navigation, maybe the threshold is off or too tightly tuned, causing good quality content to be considered spam due to large navigation.
The navigation could also be seen as unnatural if there are 50 links to 50 products/services pages that are on every page of the site, or every page of each sub-section links to all the sub-section pages. Architecture like this could appear too circular. Combine that with an on-page term filter and it could easily trip up like the MFAs.
Any thoughts on this to those who have been able to read this on-going thread in its entirety?
What if it's not necessarily footer links, but links that are not prominently displayed and therefore never clicked?
If a site has too many backlinks that never get clicked, would G be able to factor that in and see it as unnatural, or a sign of "too much SEO"?
|Has anyone been able to recover from this yet and actually feel it was from on-page changes? |
Yes, in some cases, but mine is an informational hobby site, not commercial. A lot of non-commercial sites have been hit with this. And my architecture was not massive links on each page. But I did have a section that had about 17 links from each page to the other pages in that section, and I reduced that drastically.
I still think it's phrase based. Even though other things seem to be involved, look at the phrase based possibilities too. This is why so many hobby and other information sites are losing pages and can't find anything they could have done wrong. They just don't happen to have the right combination of phrases on the page. (That's a simplification of it, but it gives you an idea.)
My site bounced back today. It's coming back slowly.
I rebuilt my site 2-3 weeks ago. The site had about 140 pages.
I had a static site; now I have switched to WordPress.
I had around 100 links per page; now there are around 40-50 links.
I use user-friendly titles now.
I removed all HTML errors.
I blocked Googlebot for 3 days and allowed it again the day before yesterday.
The site: command for my site on Google doesn't give proper results yet.
But I am back for some keywords.
Hopefully site will regain all its traffic.
Has anyone taken drastic action, such as erasing the whole website, putting up a robots.txt that prohibits Googlebot from indexing the site (/), or filling in the removal form in Webmaster Tools?
I'm thinking of taking one of these actions to see what happens.
Is it a bad idea to think like this?
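Before trying the robots.txt route, it may be worth checking offline what a given rule set actually blocks. The sketch below uses Python's standard `urllib.robotparser` for that. The rules, the `example.com` URLs, and the helper name `can_crawl` are assumptions for illustration, and the standard-library parser does plain prefix matching, so it will not necessarily interpret every rule the way Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: block Googlebot from the entire site,
# leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def can_crawl(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "SomeOtherBot"):
        allowed = can_crawl(ROBOTS_TXT, agent, "http://example.com/page.html")
        print(agent, "allowed" if allowed else "blocked")
```

Keep in mind that a Disallow rule only stops crawling; whether and how fast already-indexed URLs drop out of the results is a separate question that this check says nothing about.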
|Unless the filter is such that everybody gets 950ed and then they get reranked based on whatever is en vogue today in the Wikimountain View. |
For some pages, I see the 950 effect happening at a few datacenters (for certain phrases). Then, two days later, back to normal on all datacenters. Maybe some testing that causes the effect to appear, then disappear?
Re: Disallowing GoogleBot.
Is it still the case that if you do this, a competitor can tell Google to formally 'dump' your site altogether for 6 months, as happened to WebmasterWorld? Wouldn't it be better to, say, disallow all save the index page, perhaps?