This 226 message thread spans 8 pages.
Google's 950 Penalty - Part 8
< continued from [webmasterworld.com...] >
< related threads: -950 Quick Summary [webmasterworld.com] -- -950 Part One [webmasterworld.com] >
I don't think this is related to reciprocal linking. If it is overdone to the point that Google sees a big red flag, some other penalty might kick in, but not this 950 thing.
Phrase based seems a lot more likely. And there may be something about the words or phrases used in internal linking involved as well. Or that could just be a part of the phrase based thing.
[edited by: tedster at 9:15 pm (utc) on Feb. 27, 2008]
|theme is nothing to worry about since Google has virtually no conception of it. |
You're right steveb, from what I can see. Not one of the big three search engines seems to get theming very well. Google shows a hint, a mere whiff, but nothing more.
The phrase-based patents may be Google's attempt, at least in part, to measure this area more closely. But if those patents are in play, then they seem mostly to be causing false positives and collateral damage, at least from where we sit.
Dunno .. I can really only see menus and an increased amount of unsolicited IBLs for my 950.
I downsized my dynamic menu to 0 and kicked 2 internal menu links out. Can't do anything about the domainspammers.
Whatever my Googlecrime is, it seems to be judged worse than Mr Spamomatic's pages, which are more than half boilerplate template with the rest duplicate content en masse.
|... which means that the improved rankings didn't come from a new cache or crawl of the page... but from something internal with google. |
I hope this means that Google is trying to refine the filter because they realize that it's catching a lot more than spammers right now. I think they are trying different adjustments of the phrase based filter to deal with spammers because they know they are catching non spam pages in error. At least I really, really hope so.
|Can you explain duplicate content? |
It may not be duplicate content but similar content on your site that for some reason Google has decided is more important. I've had this happen. If I look at the regular results when I search using a well matched search phrase, Google shows a page from my site that I consider less related to the search phrase. But if I go to the supplemental results, Google shows several of my pages in the cluster in the top 10, and the page that I expected to rank at the top is in that cluster.
If it's truly 950ed it will be down around 950 when you look in the supplemental results.
|Twelve hours after Google informed me they removed the stolen content my site was back. |
I always use Yahoo backlinks to see who has linked to me. There is always a lot of spammy stuff there but the trick is to figure out which ones are causing the problem. People keep saying inbound links can't hurt you but with these scraper sites I'm not so sure. The only other time I've had ranking problems was during the Bourbon update and it turned out that the site was hijacked by a scraper site. It appeared it was not on purpose but just their sloppy linking. I asked them to remove my links and they did. My site came back. So I'm not leaving out the possibility that scrapers may be causing some of the problems. Typically high ranking sites get scraped more and a lot of the 950ed pages are from old reliable sites that always ranked well in the past.
|theme is nothing to worry about since Google has virtually no conception of it. |
I disagree because it could well be that a lot of our problems are related to botched theming.
950 thread is getting way too long.
Just today my 2nd site got hit by -950 penalty.
Does it have to do with private whois data?
Too many links on every page? Cross linking?
I think there will be a lot of factors, not just one, behind the 950 penalty.
Can someone just list all the possible reasons for this -950 penalty?
Has anyone gotten their site to recover yet?
[edited by: Thaparian at 4:24 am (utc) on May 5, 2007]
I have seen a consistent link between successfully getting sites to remove copied content and the copied pages returning from -950 to their original rankings...
Clearly this doesn't affect everyone, and it seems to indicate, as has been noted many times, that the -950 effect is the final result of a number of filters....
As the main standouts seem to be links and duplication, it would seem offpage factors are the real areas to look at....
soapystar, did the copied content that you had removed include a link to your site -- at least on the version of the page that Google cached? Scrapers have been known to cloak a backlink, so I'm wondering if the link may be the problem more than the copying.
Nope.. no link. I know everyone talks about scrapers, but I'm talking simply about websites using your content to bolster their own, often very legitimate sites. There are no links back. And it doesn't have to be much. Just paragraphs can affect you. It is possible, I think, that this comes about after a tipping point is reached where a given amount of your text appears on a given number of other sites. Once you reach a tipping point of duplicated content, you are open to small copies of a page bringing that page down.
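The tipping-point idea above can be expressed as a small model: flag a page once enough other pages each share enough of its text. The shingling approach and both thresholds here are assumptions for illustration only; nobody outside Google knows the real mechanism.

```python
def shingles(text, size=8):
    """Break text into overlapping runs of `size` words ("shingles"),
    a standard way to measure copied passages between documents."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(max(1, len(words) - size + 1))}

def past_tipping_point(page_text, other_pages, copy_fraction=0.5, site_count=3):
    """Hypothetical tipping point: at least `site_count` other pages each
    duplicate more than `copy_fraction` of this page's shingles."""
    own = shingles(page_text)
    copying = sum(
        1 for other in other_pages
        if len(own & shingles(other)) / len(own) > copy_fraction
    )
    return copying >= site_count
```

Under this toy model, a few sites lifting whole paragraphs trips the flag, while one small quote does not, which matches the "given amount of your text on a given number of other sites" wording.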
|...It is possible i think that this comes about after a tipping point is reached where a given amount of your text appears on a given number of other sites. Once you reach a tipping point of duplicated content you are open to small copies of a page bringing that page down. |
This is what I think I'm seeing on one of the pages I described earlier.
Google has been much better than other engines, I feel, in detecting these kinds of duping issues and fixing them... and often I'll see a page which has been copied drop out for a few days and then come back.
But if other aspects of the page are on the edge, then for an extremely competitive phrase where the inbounds just aren't super strong, this ongoing duping might be enough to send it to minus-950 land. The page will continue to rank on other phrases.
PS to the above... One thought I've had is simply to rewrite parts of the page, removing some of the fluff that's been copied and perhaps even improve the page.
I'm thinking this could be risky to do, though, for two reasons...
a) the page might lose "seniority." Google at that point could see the other pages as more original than ours. This is conjecture. I don't know if Google looks at dupes this way, assigning whole or partial seniority to content. I'm wondering what others have experienced.
b) the page is still ranking on other competitive phrases, and, with (a) in mind, I don't want to risk those rankings.
I often update my sites with refreshed product data, new pictures, application reports and so on...
Never had a problem with that, though my site is being scraped all the time, and at a certain point I gave up trying to catch up with these vampires; otherwise it's a 24-hour, 7-days-a-week job.
Can't really imagine this is the point - we're talking about pages on the web, whose currency should be one of the major advantages compared with other media. So you have to put them up once and then leave them unchanged until the end of time, or else you become a victim of scrapers combined with an overtuned Google algo that sends you to 950?
What about dynamic sites like blogs or forums, so they're a number one target to be pushed to 950 if scraped? Is there any evidence for that?
It's getting worse. I observe a steady growth of well-known international sites in our niche: more and more pages of these sites appear in the 950 range (filter=0). If this continues they are going to kill a whole on-line industry in favor of one single site, which at the moment takes positions #1 to #30. One thing they all have in common is a unique phrase, a name! This is playing with fire!
I am pretty sure my 950 is caused by the botched Googlebomb fix. I have around 500 off-topic IBLs with the same anchor text inbound now. That's basically like a Googlebomb. And all Google seems to have done to fix their Googlebomb issue is throw sites that have these similar IBLs from lame sites to the end of the index. That's at best a Friday night workaround...
This domainspamnetwork uses an old DMOZ clone as standin content . Medium rated sites like mine would be affected... Megasites probably not.
It would make a great deal of sense given how it is hitting old, established sites...usually when the anchor text isn't used on the page in question, and when the inbound isn't perfectly on theme.
I've spent more time looking at the end of serps phenomenon than I have for any previous penalty, filter, or algo glitch.
If you are indeed correct, how does one get the googlebombed page to work again? Matching the on-page text to inbound anchors and getting those inbounds from trusted, on-theme sources (well, that seems to be more about how to avoid it -- getting sites back is proving to be tricky). Also, if you are correct, I don't think we'll see much of a fix in the near future, given there has been absolutely no whisper from Google on the collateral damage this has caused. Who wants to eat crow after publicly announcing an end to googlebombing?
All in all, I think we are close to solutions for getting our sites out, regardless of why they are thrown into the EOS. It is just more involved and time consuming than I would have hoped, relying far more on webmaster responsibilities than on proper treatment of the site by Google.
|I am pretty sure my 950 is caused by the botched Googlebomb fix.... |
It does seem there is some relationship between the Google bomb fix and the disappearance of rankings for synonyms that I mention above. The early 950 reports, though, preceded the Google bomb fix by a month or so, but that doesn't mean there isn't a relationship. Somewhere, I remember reading that the Google bomb fix came as a result of something else that Google was trying, but I can't find that reference.
It may also be that the particulars of the -950 have been changing over time. If the -950 were a "bookkeeping" location for a range of problem pages (say, pages on trusted sites that were hitting a combination of filters), then the timing of the Google Bomb fix might not be crucial to the first -950 appearances. They certainly seem related.
With regard to my pages that used to rank on synonyms and have now been 950ed... I see that... in order to be included in the top 1000... the pages that are 950ed apparently need to contain all the words searched.
The pages I'm seeing that used to rank for synonyms in inbounds, but which don't contain those synonyms, have disappeared entirely. The only 950s I see contain one or two instances of all terms in the search phrase.
I hadn't checked this, though, prior to the Google Bomb fix. There wasn't that much talk of 950s then.
I did look for -950 results on the miserable failure terms when the 'fix' was put in. I also looked at it later when the word "failure" appeared in the White House bio for a short stretch of time. When "failure" was in the bio, the bio again ranked #1 for "failure," but it didn't rank at all for [miserable failure], a less competitive term. Then "failure" was removed and the page stopped ranking for either variant of the bomb.
During this whole episode, I checked but didn't see any White House 950s for these terms.
So, it makes some sense to think that similar types of filtering might be involved. For the most part, both the 950s and the above cited Google bombs apparently require an onpage match to keep a page in the first 1000 at all.
But, as Danny Sullivan recently reported [searchengineland.com], the Colbert Report's home page is currently #1 for an apparent new Google bomb, and none of the terms appear on the page...
greatest living american [google.com].
If you check the Google cache for the Colbert page, you get this message...
|These terms only appear in links pointing to this page: greatest living american |
The same thing happens with the click here [google.com] search that brings up the Adobe Acrobat Reader download page, and several others.
The intriguing question of course is why do some Google Bombs rank and some don't? It's likely, I think, that this could have bearing on what we're seeing with inbound anchors and onpage text.
While this is all conjecture, it is a reasonable line of inquiry.
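The observation in the last few posts — that a 950ed page now seems to stay in the top 1000 only when every search term appears in its own on-page text, whereas the older bomb behavior let anchor-only matches rank — can be sketched as a pair of toy eligibility checks. The function names and logic are illustrative assumptions, not Google's implementation.

```python
def eligible_post_fix(query, page_text):
    """Toy model of the observed post-fix behavior: the page stays in the
    visible results only if every query term occurs on the page itself."""
    page_words = set(page_text.lower().split())
    return all(term in page_words for term in query.lower().split())

def eligible_pre_fix(query, page_text, anchor_texts):
    """Toy model of the older behavior: inbound anchor text counts as if
    it were on the page, which is what made Googlebombs possible."""
    words = set(page_text.lower().split())
    for anchor in anchor_texts:
        words |= set(anchor.lower().split())
    return all(term in words for term in query.lower().split())
```

In this model, the White House bio with "failure" removed is ineligible for [miserable failure] no matter how many matching anchors point at it, which fits the reports above; the Colbert case would then be an exception the model doesn't cover.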
The 950 issue predates the googlebomb thing by a year. That obviously isn't key, although of course it could have added to the problems.
Steveb - I've seen you post many times about experiencing the 950 penalty early 2006, but I haven't noticed anyone else being hit with it that early. It seems to have become more common very late 2006 - like December.
Was anyone else here hit with a 950 type penalty before December 06?
|The 950 issue predates the googlebomb thing by a year. |
|Was anyone else here hit with a 950 type penalty before December 06? |
If you go back to the very first thread in this discussion [webmasterworld.com] you will see some comments such as "We experience this last April, May and Jun".
Our discussion here raised awareness of the issue around the web - was picked up by various blogs, etc. Up to that time, many people may have thought their url just disappeared. So did the rise in reported -950 troubles happen because more people started checking the end of results? Or did Google make a change, tweaking an existing algo component to a new level and throwing many more sites to the end of results -- that's the question, isn't it?
Google announced the googlebomb tweak [googlewebmastercentral.blogspot.com] on January 25, 2007. Surely they were working on the tweak during the weeks preceding the public announcement for QA purposes, so the timing is suggestive but not conclusive.
The exact quote on the blog now is this:
|So a few of us who work here got together and came up with an algorithm that minimizes the impact of many Googlebombs. |
That sounds like they introduced something new, but the language is a bit vague. I know there was also a comment somewhere from Matt Cutts at the time about smart people tweaking existing algo components to achieve a new purpose, or something like that. - Will keep looking for that.
"Was anyone else here hit with a 950 type penalty before December 06?"
We had a ton of threads when it first appeared on a widespread level, September 22, 2005.
After that, data refreshes were about monthly, with damage and recovery in October and November, and then a large recovery in December 2006 with a parallel large amount of penalization.
The thing is, at the time there wasn't a lot of note-comparing about how all these pages were being stuck at 950, but the basic phenomenon was there... EITHER a few pages ranking absurdly on domains with otherwise excellent rankings, OR 95% of a site's pages penalized while bizarrely a few pages would seem entirely unaffected (still ranking #1).
The threads are all there from 2005, plus posts every month after that as some more people were hit and others recovered.
Clearly the problem is much worse now, both in terms of raw numbers of collateral damage and because data refreshes take place four or five times a week now.
A couple threads from back then:
"Recent dates where I have seen this same activity are 9/22/05, 10/15/05 (seemed like an adjustment to 9/22), 12/27/05, and now 3/8/06..."
"One of my websites lost practically all Google traffic on 22nd september and there are no signs of it getting back. The filter is the strongest I've ever seen. It ranks last for all searches..."
So are we saying that if your site is widgets.com but known as "Widget Blue", and your site attracts loads of inbounds that say "Widget Blue", that's a googlebomb? So now your site gets hit with a 950+ penalty?
Whatever the case, this is such a crock from Google - did this need to be introduced? Or is this just a move on Google's behalf to take out loads of good sites to boost AdWords revenue?
This 950 business has become a farce now, with far too many sites in the index affected.
|So are we saying that if your site is widgets.com but known as "Widget Blue", and your site attracts loads of inbounds that say "Widget Blue", that's a googlebomb? So now your site gets hit with a 950+ penalty? |
Possibly, if your site isn't about "widget blue." (Why else would your site attract large numbers of inbound links with identical--and irrelevant--anchor text?)
If your site is called "Widget Blue," is about that topic, and has loads of inbounds with the anchor text "widget blue" (as one might expect), the explanation for any loss of rankings--or any penalty--is likely to be more complicated. (We wouldn't be in Part 8 of this thread, with no end in sight, if the explanation were simple and obvious.)
What I don't understand is why my page that is at -950, when listed with the same title and description inside a certain directory (PageRank 7), will show up on Google's first page for the same search - but as part of the directory page.
This is incomprehensible, since the directory page has no anchor text, no relevancy, no keywords and no theme for that page.
|So did the rise in reported -950 troubles happen because more people started checking the end of results? |
I never dreamed of looking down around -950 when I found pages were missing. If it hadn't been for the threads on this topic I'd probably still just think they had completely disappeared from Google.
I did have occasional missing pages earlier. I may have had more than I realized as once I knew some were 950ed I looked to see if others were.
|Was anyone else here hit with a 950 type penalty before December 06? |
Another example of it was the October Massacre [webmasterworld.com] on 21 Oct 2006.
I said _my 950_, I didn't say it's the only reason and others might have a different setup. On the other hand Google does test things before they officially announce it, I assume.
This spamdomainnetwork has loads of text like "search engine optimisation", "SEO" and so on in the on-page text, combined with links to me in the top spot from that old DMOZ page. It looks as if they downgraded that combination... too. Especially since I have never done SEO to any extent, this sucks.
The up and down started for me in January, now it's only down ...
Whatever you want to call it, it seems that loads of crappy/unfitting/bogus IBL's let you sink like a stone with no return now.
I noticed something. Looking for my site for a search term (where I used to rank #1 six days ago), I went all the way to the last page and couldn't find it. Then, reaching the last entry, I clicked to see the results with the omitted entries. Lo and behold, I was at #1! I thought I made a mistake and repeated the search again on a different DC - found the same, but on the 3rd and subsequent DCs, I couldn't replicate it again.
Can anyone tell me what that means or has anyone experienced the same?
|it seems that loads of crappy/unfitting/bogus IBL's let you sink like a stone with no return now. |
I thought it might be caused by scraper sites at first, but now I'm not so sure. It's too spotty.
Is there really evidence that unsolicited inbound links can hurt us?
I would agree with you annej, but I think there is also some ratio system in play where, for example, one good quality IBL might be worth x number of poor IBLs, thereby creating some kind of equilibrium and reducing the negative effect of poor inbounds.
[edited by: LineOfSight at 3:10 pm (utc) on May 7, 2007]
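The ratio idea above can be written down as a toy score. The offset ratio of ten poor IBLs per good one is an invented number, purely to illustrate the shape of the hypothesis, not a known Google value.

```python
# Hypothetical ratio system: one good-quality inbound link offsets some
# fixed number of poor ones. The offset (10) is an assumption.
GOOD_OFFSETS_POOR = 10

def net_link_score(good_ibls, poor_ibls):
    """Positive: good links keep the page in equilibrium.
    Negative: poor inbounds dominate and, in this model, penalty risk rises."""
    return good_ibls * GOOD_OFFSETS_POOR - poor_ibls
```

Under this model a handful of trusted, on-theme links would absorb a steady trickle of scraper links, but a flood of them would eventually tip the score negative, which is consistent with the "sink like a stone" reports above.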
Whatever the case - we now have far too many good sites out of the index.
That cannot be good for search quality, can it?
Profits at Google may well be up, but it's been at the cost of quality for the end user. Do a search on some popular search terms on Google, then on Yahoo, and look at how many authority sites are missing from the Google index.
IMO Google is killing its search quality with its own desire to increase AdWords income; this is what the 950 club is all about.
|I would agree with you annej but I think there is also some ratio system in play where for example one good quality IBL might be worth x number of poor IBLs, thereby creating some kind of equilibrium and reducing the negative effect of poor inbounds. |
That would make a lot of sense, although other factors may be at work, too. (E.g., if you're obviously part of an elaborate and unnatural linking scheme, "one good quality IBL" may not be able to save you from being punished for your sins).
As for Rich's suggestion that a -950 penalty is based on a plot to sell more AdWords, I'm skeptical, because there are plenty of "authority sites"--including e-commerce and other commercial sites--that rank just fine for searches with tens or even hundreds of millions of results.
|creating some kind of equilibrium and reducing the negative effect of poor inbounds |
Does that mean that the strength of good inbounds is diminished if you have been linked by a lot of scraper sites? One problem with doing well in the search results is that you get a lot of scraper sites linking to the page. There really isn't anything we can do about that.
As to Google's purpose, I think these filters are their attempt to get rid of spam sites. It's right there in the phrase based patent: to get rid of spam. The problem is that it's impossible to do this by algo, rather than by humans checking, without non-spam sites getting caught in the filter.
I just hope that Google is watching this and making adjustments to improve the filter. I think as they get a better grasp on phrases that do not indicate a spam site it will help a lot.
I do not think this is some evil scheme to get people to buy more AdWords. If it were, why are they hitting information-only sites which don't sell a product at all?
In other words it's a botched filter, not an evil filter. We already have evidence that they are working on it. That's why a lot of pages are coming back with no changes.
Understanding the intent of Google engineers, and
looking at the data of the Keyword Suggestion tool of AdWords...
... will provide more hints to people at this point than the last few hundred posts of this thread.
You can't possibly solve all cases, they're just too numerous and people will want solutions specific to them.
The problem is:
Your anchor text is seen relevant for a competitive phrase, but you don't have the support for it.
- because it uses words Google identifies and monitors as a competitive phrase, but competitive for something else ( as in, it doesn't recognize the phrase in the meaning you use it )... OR... you used too broad a relevance... OR... it means something else ( as well )... OR... you knew exactly what it was competitive for, and that you didn't have the inbounds to make you relevant, but you thought it was OK because it's on topic ( a human editor would understand, but the semantic analysis of the link-a-holic bot doesn't make much of it )
- because it uses the phrase in context, but in a context that's not understood, or the phrase is more competitive for something else. ( to matt: Tokio Hotel is now identified as a theme of its own. SERPs will not force accommodation sites, neither try to correct the syntax. Not even in other languages. )
You're irrelevant because:
- You don't have enough inbounds to support the phrase
- Don't have enough derivations in your incoming anchors
- Don't have variation in your anchors
It's a dictionary to identify competitive phrases. And not a dictionary to identify several billion non-competitive ones. Annej had the name of the war on the site. From our perspective, Google had yet to identify that word combination as a legit phrase for the website. From the real perspective, however... Google didn't identify it as a whole phrase, so the analysis fell back to see if this was to mean something about a different combination of these words, and just about a step back it found something. The phrase was on their list of monitored, competitive phrases. The system works, just not for webmasters.
For this dictionary is built by - guess who - ...AdWords users. That's the reason, the method, and the market where Google has to monitor these things THIS closely. They don't have sets of phrases on themes they don't need to monitor. Collateral is the semantically relevant phrase with a different meaning, or the semantically unrelated, yet very much on topic phrase, that's also competitive because of words Google is sensitive to.
If there's a phrase people bid like hell on AdWords, they won't let you rank for it just because you include it on your site, yet you don't have the inbounds to support it, only a navigation that you invented. But... since this is based on AdWords data, it is for monitored phrases ONLY!
They are NOT building an AI to understand themes, they are building a filter to identify word combos that are a query people bid on, and need to be monitored. If they let people rank for combinations with a 3rd, 4th word ( which WE identify as a legit topic, but they don't ) ... one could bypass AdWords.
Check your problematic phrases in the Keyword Suggestion tool at AdWords. There's an external one, so you don't have to be a paying customer.
Those are the relations they know of, and using this data you may rank / be penalized for something related / unrelated, yet semantically irrelevant / relevant.
- What are its synonyms?
- Is it actually identified in whole or in parts as something people bid hard money on?
- Is it identified at all?
End of story
Meaning end of post 151 in part 8. Move on to post 152. Funny. Does anyone remember their OWN posts anymore? This should be read in context of the previous ones too.
[edited by: Miamacs at 5:05 pm (utc) on May 7, 2007]
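The dictionary hypothesis in the post above boils down to a lookup-and-threshold check. Here is a minimal sketch with an invented dictionary and invented support thresholds; nothing in it is Google's actual data or code.

```python
# Hypothetical "monitored competitive phrase" dictionary: phrase ->
# inbound support assumed necessary to rank for it. Entries are invented.
MONITORED_PHRASES = {
    "blue widgets": 50,    # heavily bid on AdWords (example)
    "widget hotel": 200,
}

def penalty_risk(anchor_phrase, supporting_inbounds):
    """True if the phrase is on the monitored list but the page lacks the
    inbound support to justify ranking for it. Unmonitored phrases are
    never examined - the model simply has no dictionary entry for them."""
    required = MONITORED_PHRASES.get(anchor_phrase.lower())
    if required is None:
        return False
    return supporting_inbounds < required
```

This matches the thread's claim about collateral damage: an on-topic page whose anchor text happens to collide with a monitored, bid-on query trips the filter, while billions of non-competitive phrases pass through untouched.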