WebmasterWorld
Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 195 message thread spans 7 pages.
Google's 950 Penalty - Part 9
annej




msg:3336309
 9:13 pm on May 10, 2007 (gmt 0)

< Continued from: [webmasterworld.com...] >
< related threads: -950 Quick Summary [webmasterworld.com] -- -950 Part One [webmasterworld.com] >

That's because we are shooting in the dark

We really aren't shooting in the dark. We are shooting at dusk. We can see a fuzzy image of what is out there. Sometimes when we shoot we hit the target and other times we can't no matter how hard we try.

But we waste our time when we start shooting at theories like

- Google is doing this so more people will pay for AdWords

- Google only hits commercial sites

- If you have Google Analytics you will be sorry

- Only sites doing something illegal are hit by -950

- It's because you have AdSense on your site

- Scraper sites are doing this to us

It goes on and on.

Is it because the phrase based theories are not an easy answer? It does take a lot of work to figure out why you might have been 950ed and sometimes you just can't find the answer. But I still believe that most 950ed pages have been caught in an imperfect phrase based filter.

[edited by: tedster at 9:14 pm (utc) on Feb. 27, 2008]

 

trinorthlighting




msg:3336330
 9:28 pm on May 10, 2007 (gmt 0)

Certain phrases/combinations can send you into a spam or safe search penalty. That is why. Think about it. Google has filtered adult sites (SafeSearch) by phrases or words, right?

Now, take that over to the search side. Let's say Google wants to fight "free ring tone" spam this week. What are they going to do in the algo to fight this?

Google would look at anchors and text on pages that contain derivatives of Ring Tones and Ring Tone keywords, correct? They would play around with the algo, monitor it for those phrases, test it and see how it does. If they like the results and it seems to be a good answer for getting some of the "free ring tone" spam out of the index, they implement it.

Now, since Google does not want spammers gaming them, they do not talk about it, and they make it a very sneaky and stealthy penalty, downgrading pages that just do not fit the calculations, right? Thus -950. A very stealthy bugger, so people will not really see it. It's a slow and long fall to -950 and it does not happen overnight. Maybe -10 today, -30 next week, etc. And people will not be able to figure it out to game Google because it's slow. Once you reach that last page, the next step is supplemental. It's the best and slowest penalty Google has yet implemented. Want proof? Pick a keyword, monitor it daily and see what happens. -950 urls go supplemental and new ones replace them as new pages come into the index. Look at the -950 results and 80% of them are spam. So it's obvious it's a spam fighting penalty.

So back to the ring tones: you have a site that talks about telephone ringers, and here comes the spam algo, and for some reason it does not like your page. (Collateral damage.) This is because you do not have enough trust rank for the keyword (per Miamacs' post), since your site is new and has not been around that long.

Google groups keywords together, and I am sure there are quite a few algos running in the background scoring pages depending on keywords. Different algos, different results.

Now, why do I keep saying links? Here are the reasons:

1. I have not yet seen a site that has not shown mass link exchanges, paid links, "recip links", or that has not been hacked. (Hacked is the exception.)

2. Trust rank, trust rank flows through anchor text. Again a part of links.

3. Lack of good and relevant links. (Collateral Damage Pages) show a lack of authoritative and trusting links.

All in all, look at the majority of results that are -950. Most have been in link exchanges or involved with paid links. Is there collateral damage? I have yet to see it, but I am sure there is. How much collateral damage? Since I have yet to find a case, I would say it's probably around 1%.

Now, the .edu sites we see are either hacked or not real .edu sites. A lot of spammers have look-alike .edu spam sites. Do some research and you will see that.

[edited by: tedster at 9:33 pm (utc) on May 10, 2007]

tedster




msg:3336345
 9:45 pm on May 10, 2007 (gmt 0)

The "spam detection by phrase based indexing [appft1.uspto.gov]" patent does not require any "this week's special target words" or any importing of bad phrases from another department of Google, either. That patented method detects spam phrases all on its own, through another method that the patent describes.

I'm with annej on this one - it still looks like the best fit for all the evidence I see, including sites that have squeaky clean linking histories. That patent uses many factors to label a url spam, and links are just one part of it.

It's a slow and long fall to -950 and it does not happen overnight. Maybe -10 today, -30 next week, etc.

I think you're seeing different parts of the entire Google algo kick in. The -950 can most definitely act on a dime.

tedster




msg:3336361
 10:05 pm on May 10, 2007 (gmt 0)

Certain phrases/combinations can send you into a spam or safe search penalty.

Yes, that is the point. Some urls can rank #1 for one search but be 950'd for another phrase. But those words alone don't automatically invoke the penalty just because they're on the page. Robert_Charlton mentioned earlier that he has such a page and is not willing to play with it, lest he lose the rankings that are still driving traffic.

The -950 only shows up if phrases are used in a certain way, on-page and/or off-page, and with certain related phrases. After all, there is still a first page for any phrase you search on that has any results at all, so the mere presence of a word or phrase is not an automatic penalty.

I like the area that trinorthlighting brought up earlier -- words and phrases that have multiple meanings. That's an intriguing possibility for why there are false positives, at least in some cases. I'm going to study that one a bit more.

Marcia




msg:3336478
 12:29 am on May 11, 2007 (gmt 0)

Polysemic words: they're part of keyword co-occurrence, which is what this whole thing is based on. That's why phrases work much better than single terms.

That does not necessarily have anything whatsoever to do with spam, and one word tacked onto a phrase can trigger stats showing a lack of information gain for the phrase. Nothing less, nothing more. And the system isn't based on a pre-fixed lexicon; it generates the taxonomy based on co-occurrence figures.
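
Marcia's point about a self-generated taxonomy can be illustrated with a toy sketch (entirely my own illustration, not the actual patent code): a phrase k counts as "related" to phrase j when it co-occurs with j well above k's background rate, so no pre-fixed lexicon is ever needed.

```python
from collections import Counter

def related_phrases(docs, min_gain=1.5):
    """Find (j, k, gain) triples where phrase k co-occurs with phrase j
    at min_gain times k's background document frequency. Each doc is a
    list of phrases; the 1.5 cutoff is an invented illustrative number."""
    n_docs = len(docs)
    doc_sets = [set(d) for d in docs]
    freq = Counter(p for d in doc_sets for p in d)
    out = []
    for j in freq:
        with_j = [d for d in doc_sets if j in d]
        for k in freq:
            if k == j:
                continue
            actual = sum(k in d for d in with_j) / len(with_j)
            expected = freq[k] / n_docs  # k's background rate
            if actual / expected >= min_gain:
                out.append((j, k, actual / expected))
    return out

# Tiny corpus: "swimming" predicts "pool" strongly, "billiards" does not.
docs = [["pool", "swimming", "water"], ["pool", "swimming"],
        ["pool", "billiards"], ["water"], ["billiards"]]
pairs = {(j, k) for j, k, _ in related_phrases(docs)}
```

On this corpus only the pool/swimming pairing clears the gain cutoff, which mirrors Marcia's point: relatedness is computed from the corpus itself, not looked up in a lexicon.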

trakkerguy




msg:3336509
 1:24 am on May 11, 2007 (gmt 0)

Thus -950. A very stealthy bugger, so people will not really see it. It's a slow and long fall to -950 and it does not happen overnight. Maybe -10 today, -30 next week, etc. And people will not be able to figure it out to game Google because it's slow.

Really? Is that what you've seen? I've only been watching as 4 got hit, but they all went straight from first page to -950. I thought most that posted here also described a single drop and not a slow fall...

steveb




msg:3336530
 2:15 am on May 11, 2007 (gmt 0)

"It's a slow and long fall to -950 and it does not happen overnight."

What are you talking about?
950 is basically always "overnight". There is no in-between.

And more importantly there is no "fall". That's the point. Pages normally either rank in the top 20 or at 950. There is no fall or in-between.

JoeSinkwitz




msg:3336566
 3:29 am on May 11, 2007 (gmt 0)

The part about the above posted patent that bothers me most is that it only takes one false positive to really gunk up the intended purpose:

1. Sites are getting re-ranked, with the appropriate spam getting moved to EOS.
2. One fairly normal site with slightly too high usage of related phrases or singular phrase gets sent to EOS {and added to the list of spam documents}
3. Re-ranking a little later in the future nabs a couple more sites since they are of relatively close proximity of usage to #2 (even though they were under the threshold before). Send to EOS {and added to the list of spam documents}
4-99. More and more re-rankings remove more and more very relevant sites that don't appear to be doing anything remotely shady, due to the appropriate usage of text that is now "significantly exceeding" the expected usage by a standard deviation or predetermined multiple. Send to EOS {and added to the list of spam documents}
100. What is left is a bunch of unrelated hubs that barely mention the phrase in question and aren't really relevant. They are close to possibly being relevant, but not exact. Sound familiar?
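
The cascade in steps 1 through 100 can be simulated in a few lines. This is purely my own toy model (the "mean plus k standard deviations" cutoff is a guess at what "significantly exceeding the expected usage" might mean): each pass removes the outliers, which shrinks the surviving mean, which turns yesterday's normal pages into tomorrow's outliers.

```python
import statistics

def rerank_cascade(phrase_counts, sigmas=1.0, rounds=10):
    """Repeatedly drop docs whose related-phrase count 'significantly
    exceeds' the surviving population's mean. Returns (survivors, removed)."""
    remaining = list(phrase_counts)
    removed = []
    for _ in range(rounds):
        if len(remaining) < 2:
            break
        cut = statistics.mean(remaining) + sigmas * statistics.stdev(remaining)
        hit = [c for c in remaining if c > cut]
        if not hit:
            break
        removed.extend(hit)
        remaining = [c for c in remaining if c <= cut]
    return remaining, removed

# Nine ordinary pages plus one spammy outlier: the first pass removes only
# the outlier, but later passes sweep up pages that were safe before.
survivors, removed = rerank_cascade([10, 11, 12, 13, 14, 15, 16, 17, 18, 40])
```

With these numbers the cascade eats seven of the ten pages before stabilizing, exactly the collateral-damage pattern Joe describes.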

Aside from de-optimizing, what else can be done? That all depends on your niche and your resources. If you can flood a couple hundred results with something that technically appeals to the re-ranking variable, all the while increasing the allowable usage of text for the rest of the localset, the pages you really want to rank might get either pulled out of the spam document list, or left alone to begin with. Alternatively, if you have been on page #2 forever, de-optimize your site a bit, then flood the index with de-optimized domains so that the top guys get hit for being too optimized. If you have a dark streak, that is.

I hate the idea of de-optimizing a well-written document that serves the user, for the sake of dumbing things down with alternative vague descriptions, serving a re-ranking variable that probably isn't ready for production due to the obvious flaw listed above.

Cygnus

[edited by: JoeSinkwitz at 3:32 am (utc) on May 11, 2007]

tedster




msg:3336585
 4:02 am on May 11, 2007 (gmt 0)

I agree Cygnus. The patent seems to say this approach would set very high thresholds in measuring abuse. That's one reason I am NOT saying that this patent definitely is what's in play. But it does describe a mechanism that could generate a lot of the signs we see. That is, if this patent is in play, then the thresholds are not being set very high -- or maybe they must be hair trigger because spammers are getting too subtle these days?

[edited by: tedster at 4:31 am (utc) on May 11, 2007]

JoeSinkwitz




msg:3336588
 4:18 am on May 11, 2007 (gmt 0)

I agree that this is probably part of it, but like you say...not likely all of it, or at the very least, not in its full incarnation (otherwise we probably wouldn't be complaining).

According to the patent, a normal non-spam doc has something like 8-20 related phrases, with spam docs being above 100; but examining EOS sites, the 8-20 phrase rule is far closer to what is going on, so that figure is being thrown out of whack.
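
Cygnus's 8-to-20 versus 100-plus figures amount to a simple banding rule. Here is a sketch (the function shape and the "borderline" band are my own invention; the patent weighs many more signals than a raw count):

```python
def classify_by_related_phrases(count, expected_hi=20, spam_floor=100):
    """Band a document by its related-phrase count, per the rough numbers
    Cygnus quotes from the patent: ~8-20 looks normal, 100+ looks like
    keyword stuffing, and anything between is ambiguous."""
    if count >= spam_floor:
        return "spam"
    if count <= expected_hi:
        return "normal"
    return "borderline"
```

The trouble Cygnus points out is visible here: if real EOS'd sites sit inside the 8-20 band, the thresholds actually in production must be far tighter than these.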

A funny way to think about things is that if Google messes up on the first time through the ranking, re-ranking will only make it worse, since relevant sites will be thrown out. A ton of super-authority non-relevant sites (like newspapers and edus) can drastically alter the expected phrase scores.

Thanks for posting the patent again though...it was helpful to re-read. I'm going to drastically de-optimize a few sites, since it'll probably bring them back, and then start re-adding the text users want to see piece by piece [assuming some of the longer-lasting EOS issues ever leave the spam table...some I've seen go back and forth, some just once a month for a few days, and some have come out since Jan].

Cygnus

mattg3




msg:3336750
 9:15 am on May 11, 2007 (gmt 0)

or maybe they must be hair trigger because spammers are getting too subtle these days?

The spammy site that jumped back has
Keywords: TWENTYFIVE
SITEWIDE boilerplate template
scraped snippet
SITEWIDE boilerplate template
insert from a very small forum
minimal menu
subdomainspam

No AdSense, but affiliate popups and some AdSense clone that could possibly be misinterpreted as unique text. But in no way does it offset the repetitive stuff.

The three-month-old hint that sitewide boilerplate templates are a no-no must have been reversed, or there is a bug in that algo.

mattg3




msg:3336758
 9:37 am on May 11, 2007 (gmt 0)

3. Lack of good and relevant links. (Collateral Damage Pages) show a lack of authoritative and trusting links.

I don't know how many you need, but I can assure you that I am linked on all major sites in my sector, slowly grown since 1996, all within theme: non-profit sites, the association of my profession, the major forum on our topic, and so on.

BUT... I now have a domain spam network linking to me. I didn't ask for it, but they now also mention SEO and so on on their sites.

How is Google going to test whether I bought these links? They can't. It's a ridiculous assumption to suggest they could, unless I had been stupid enough to order 3000 IBLs over Gmail.

Google doesn't seem to get the refinement of "pool" vs "swimming pool"; how can it possibly detect a complex human action?

But what they can do is: if the percentage of IBLs with the same keyword and low PR is greater than x ... RANK -950, and accept the collateral.

Given that the 950 is near instant, it isn't a gradient.

It's an "if z > x, **** off to the end of the SERPs".

I find many PDFs at the EOS. They look completely harmless: obviously no popups, no ads and so on, no boilerplate templates. Just plain text.
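
mattg3's "if z > x" rule above can be written out explicitly. This is a sketch only, with every field name and threshold invented by me; nobody outside Google knows the real test:

```python
def looks_like_link_spam(inbound_links, keyword, max_share=0.6, low_pr=3):
    """mattg3's hypothesized trigger: demote the page when too large a
    share of its inbound links carry the same keyword anchor AND come
    from low-PR pages. inbound_links is a list of (anchor, pagerank)."""
    if not inbound_links:
        return False
    suspect = sum(1 for anchor, pr in inbound_links
                  if keyword in anchor.lower() and pr < low_pr)
    return suspect / len(inbound_links) > max_share

# A profile dominated by low-PR keyword anchors trips the rule; a small
# natural profile does not.
spammy = [("blue widgets", 1)] * 7 + [("homepage", 5)] * 3
natural = [("blue widgets", 5), ("great resource", 2), ("my links", 4)]
```

A binary test like this would also match the near-instant drop mattg3 notes: either z exceeds x or it doesn't, with no gradient in between.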

Biggus_D




msg:3337102
 3:34 pm on May 11, 2007 (gmt 0)

Let's say that this thing is phrase based.

I can't edit thousands of articles (and I guess that I'm not the only one) but we can try to do something with the new ones, while we hope that Google improves those algos.

I've read the patent, invented by Anna Lynn Patterson, but I'm not good enough to get it.

Please, can anyone offer some tips about what to avoid?

Something like:

- Do not put the widget name more than 2 times

- Do not use words like...

tedster




msg:3337157
 4:32 pm on May 11, 2007 (gmt 0)

That patent does not reduce to a simple set of "rules". And there's not enough detail in it to get precise numbers or advice -- and there are too many areas being measured, on-page and off-page. Plus, we aren't even sure that this patent really IS the culprit. There's just a circumstantial case for it.

Thus annej's comment "we are shooting at dusk".

Crush




msg:3337399
 8:16 pm on May 11, 2007 (gmt 0)

Matt Cutts giving snippets away

[seroundtable.com...]

Adam's already doing a great job on that thread, but it is frustrating that I don't have a chance to do everything I'd like to do. If I've only got limited time, I could spend that discussing something on a forum, or try to write on a new topic (malware, Stephen Colbert, robots.txt crawl-delay and why we don't support it).

"annej, regarding the -950 thing, I'd watch this video I made: [video.google.com...]
Starting around 1:42 into the video is where I talk about this."

[edited by: tedster at 5:49 pm (utc) on May 12, 2007]
[edit reason] added quote box [/edit]

tedster




msg:3337434
 8:53 pm on May 11, 2007 (gmt 0)

Thanks, Crush - very interesting. So this algo element has been in place for a year and a half (at the time the video was made - that lines up with reports here) and it's designed to penalize for "over-optimization". I watched this video before but I missed the connection to the -950 that Matt just highlighted.

johnhh




msg:3337579
 11:59 pm on May 11, 2007 (gmt 0)

This is interesting, as it could be a phrase-based "over-optimization" penalty that brings problems when writing good content.

For example, previously we had what I thought was a major -950 problem, but in fact the pages were often appearing in top 10 SERP places, just not for the theme of the site.

For example if the theme of a page was "blue widgets technical specification sheets" we were finding that the pages were OK for the SUBJECT "blue widgets" - in fact very good positions - but not for the targeted THEME "blue widgets technical specification" within the SUBJECT "blue widgets".

As it is pretty difficult to write an article on "technical specification" without mentioning these words, individually or together, a number of times, it would appear that the mere mention of these words or related words (perhaps instructions/construction/etc.) a few times produces a potential -950 page.

Specific examples are easier to explain but UTC's etc

As we were doing site redesigns at the same time, we also noticed that even redesigning made no difference to "strong" pages - once at positions 1 to 8, always at positions 1 to 8 - so probably these pages are supported by quality IBLs. Hence my comment above about "potential -950 pages".

Pages that were borderline (SERP positions 10-20ish) dropped out, often to -950. Attempting to reinforce the "theme" on these pages by use of Title/Descriptions didn't seem to make any difference.

We also tried writing vague Descriptions/Titles etc. for these pages, avoiding the phrase "technical specification" - no joy there as well.

The above is the result of 3 months or so of work on this, on and off, but may well explain annej's experience of what happens if you remove the page content.

To make matters slightly worse the whole site is about "widgets technical specifications" not the blue/pink/yellow widgets themselves.

And I'm terribly sorry Mr Google but if you want me to rewrite hundreds of pages of content.......

Disclaimer - all sites are different so I report the above as is - no warranty!

[edited for clarity]

Marcia




msg:3337594
 12:11 am on May 12, 2007 (gmt 0)

Certain phrases/combinations can send you into a spam or safe search penalty. That is why. Think about it. Google has filtered adult sites (SafeSearch) by phrases or words, right?

Wrong. No, Google has not filtered adult sites. WE choose to have them filtered out by setting our own preferences.

Adult sites do not equal spam sites, those are two different issues completely. There is no safe search penalty. Spam penalties, yes. Adult/safe search penalty, no.

Now, take that over to the search side.

No, let's not. It's wrong in the first place, so let's not take it anyplace.

The results of a person's own reasoning may or may not be the way things actually are. It all depends on the basis of their conclusions, whether or not the facts they're basing their assumptions on are correct or incorrect.

annej




msg:3337615
 1:06 am on May 12, 2007 (gmt 0)

I watched this video before but I missed the connection to the -950 that Matt just highlighted.

I did watch the video and it was interesting, but I'm still having trouble seeing the connection to the -950 thing. Most of the pages I lost had been up and ranking well for years. Are they saying there is suddenly over-optimization on old pages? Plus I don't see any difference between the pages that were lost and the pages that are still doing well (which is most of my pages).

The way I optimize is to write titles related to what the page is about. I have found that good page titles help ranking more than anything.

Since my articles range from 500 to 1000 words I do have to repeat the name of the hobby/topic several times, simply because a lot of what I say would be confusing if I didn't. I've even tried to reduce the word in some of the 950ed pages, but it really hurt the clarity of the article and didn't seem to help in terms of the penalty.

The only SEO I've done of late is to try to figure out what Google thinks is wrong with the 950ed pages. As I pointed out in an earlier message, I haven't lost much income because of this, but it bugs the heck out of me to see a page with well documented information that is not found anywhere else on the Internet sit so low in Google that anyone researching the topic would never find it.

I can't figure out if the folks at Google just don't see that there are some problems with their filter/algo or if they just don't want to talk about it. I surely hope they realize there is a problem there.

[edited by: tedster at 5:47 pm (utc) on May 12, 2007]

johnhh




msg:3337628
 1:34 am on May 12, 2007 (gmt 0)

Are they saying there is suddenly over optimization on old pages?

Yes, I think so - or it was borderline and has now gone over the new border, and/or is not supported by strong IBLs.

I do have to repeat the name of the hobby/topic several times...

It may not be the actual word but a related word or a related word in combination with the actual topic/hobby word.

Although this element of the algo may have been in place a long time it may be that it took some time for the effect to show through.

annej




msg:3337646
 1:57 am on May 12, 2007 (gmt 0)

not supported by strong IBL

I think this may be one of the shifts that have occurred. I do have strong IBLs to my home page and most people link to it as they like the whole site. But now there may be an expectation that there be inbound links to every individual page. (excluding scrapers that is)

It may not be the actual word but a related word or a related word in combination with the actual topic/hobby word.

I think you are right but it can be next to impossible to figure out what that other word/phrase might be. I'm talking about articles from 500 to 1000 words and it could be anything in the article.

webastronaut




msg:3337756
 3:26 am on May 12, 2007 (gmt 0)

I've gone way beyond the keywords I follow to see what's up in 950 hell, and I see DMOZ, books.google.com pages, news.google.com pages, and many big sites there for keywords that return 50 million plus results. I add a word to make it a 3-keyword search and these same pages make it to page one...?

I have a site that has 400 million results for 1 keyword and I'm the authority at number 1 with extended links (right now anyway); then when I add one extra word to the keyword, the same page sinks to 950 hell?

This site, as well as Google, is confusing the hell out of me.

Marcia




msg:3337786
 4:31 am on May 12, 2007 (gmt 0)

Just went through a MAJOR technology news PR 9 site that has been around for years and found a huge loophole for spam.

How is that working and how is it related? Is it the news site that has the 950 penalty, or are there other sites involved that have the 950 penalty?

And what's right, reciprocal links or Adwords phrases?

steveb




msg:3337898
 7:22 am on May 12, 2007 (gmt 0)

"Are they saying there is suddenly over optimization on old pages?"

What's likely is your page is mistakenly having the penalty applied.

It's like the good title comment above. Properly titling a page can lead to "too much".

Google seems to want to penalize pages that APPEAR to have "too much", but in the process it is penalizing many pages that legitimately have "a very large but appropriate amount".

mattg3




msg:3337952
 9:35 am on May 12, 2007 (gmt 0)


Totally confirms to me that I am caught up in this unsolicited linking to me, with SEO on the pages...

I never did link buying...

johnhh




msg:3337978
 10:35 am on May 12, 2007 (gmt 0)

penalizing many pages that legitmately have "a very large but appropriate amount".

I agree with that.

The problem is - as I have found - that you can have two pages with the same design, same menu structure, and very similar metatags, but different articles. One will be OK, one will be at -950.

The problem is as annej says
it can be next to impossible to figure out what that other word/phrase might be.

and this is compounded when you add in a potential IBL factor - so you may correctly change all the words in an article only to see no improvement, as for that page it is not a "phrase" problem but the lack of a couple of IBLs.

annej




msg:3338106
 3:27 pm on May 12, 2007 (gmt 0)

OK, I'm going to put together some of the things we have discussed and I think I can make some sense of it.

Steve addressed the problem of "too much" in titles and content. Basically seen as over optimization but probably borderline optimization.

Matt brings the concern that unsolicited linking (I assume from spam/scraper sites) is causing the problem.

John points out that "you can have two pages with same design - same menu structure - very similar metatags - but different articles . One will be OK - one will be in -950" He also mentioned the lack of IBLs.

Here is what I am thinking:

All of this is true, but it only triggers the penalty/filter if there are certain words/phrases on the page! These phrases raise a red flag of some sort, and then a higher standard is set for any page with flagged phrases. So while much of a site may be unaffected, certain pages are suddenly 950ed.

If the page gets a trusted link or two, that can pull it out; or if a few key words/phrases are removed from the page, that can solve it. In theory just one word or phrase could cure the problem, but only if you guess the right word/phrase. I have found in my cases that cutting back on internal linking on the affected pages can sometimes help.

No easy answers but a few possibilities to try.
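
annej's synthesis, flagged phrases invoking a stricter bar that trusted links or de-optimization can clear, condenses into a toy model. All names and numbers below are invented for illustration; this is a reading of the thread, not Google's algorithm:

```python
def rank_band(page_phrases, flagged_phrases, optimization_score,
              trusted_links, normal_cut=0.9, strict_cut=0.6):
    """Pages containing a flagged phrase are judged against a stricter
    over-optimization cutoff; a trusted inbound link or two relaxes the
    bar again, matching annej's 'pull it out' observation."""
    flagged = any(p in flagged_phrases for p in page_phrases)
    cut = strict_cut if flagged else normal_cut
    cut += 0.15 * min(trusted_links, 2)  # at most two links count here
    return "ranked" if optimization_score <= cut else "950ed"
```

Under this model a page that ranked for years can drop overnight the moment one of its phrases joins the flagged set, and removing one phrase or gaining one trusted link can bring it back, which fits both the sudden drops steveb reports and the recoveries annej describes.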

crobb305




msg:3338107
 3:37 pm on May 12, 2007 (gmt 0)

I am running way behind here, and trying to catch up on these -950 threads. As a starting place, I have a question for you guys:

Can this penalty occur on some phrases and not on others (e.g. one phrase -950, another #1; both equally "competitive")? Or, does this penalty characterize the whole page, and all phrases derived from it?

Also, does Gbot activity change?

annej




msg:3338125
 4:09 pm on May 12, 2007 (gmt 0)

Can this penalty occur on some phrases and not on others

Yes, a page can be 950ed on one search phrase and not on another.

crobb305




msg:3338136
 4:29 pm on May 12, 2007 (gmt 0)

I was reading the speculation about how scraper sites might HELP cause this penalty. Scraper sites are bad enough, but what do you do when a well-known whois database (with lots of PageRank) is providing spiderable copies of our index pages in association with the whois data for our domains? You could block the spider, but the damage may be done, especially if the whois page is already outranking the original for specific snippets of text.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved