|Google's 950 Penalty - Part 8|
< continued from [webmasterworld.com...] >
< related threads: -950 Quick Summary [webmasterworld.com] -- -950 Part One [webmasterworld.com] >
I don't think this is related to reciprocal linking. If it were overdone to the point that Google sees a big red flag, some other penalty might kick in, but not this 950 thing.
Phrase-based seems a lot more likely. And there may be something about the words or phrases used in internal linking involved as well. Or that could just be part of the phrase-based thing.
[edited by: tedster at 9:15 pm (utc) on Feb. 27, 2008]
Great, now I discovered a cache of my sites (someone who ran Nutch and didn't disallow the cache in robots.txt...) with loads of files duplicated by PHPSESSID and so on :\
The cache date is 24th November 2005, about the time we sank the first time and stayed there for a year.
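If the duplicates really come from PHPSESSID appearing in URLs, one common fix, assuming Apache with mod_php (the two settings below are real PHP ini directives; the suggestion itself is mine, not from this thread), is to force cookie-only sessions:

```apache
# .htaccess: keep PHPSESSID out of URLs so every cookieless
# crawler visit doesn't create a fresh duplicate URL per session
php_flag session.use_only_cookies on
php_flag session.use_trans_sid off
```

Links crawled without cookies then all resolve to the same clean URL instead of one URL per session ID.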
“and btw thanks for the hint ... linked by bad neighbourhood. I discovered in the webmastertoolkit...”
OK, I just went through our links again using webmastertoolkit and discovered a bunch of keyword/phrase sites linking to a directory and page that has been removed since 2002. The sites are mostly for some online casino garbage. The removed page jogged my memory about having seen some traffic showing up as 404 in my stats for the page. I went and checked my .htaccess and discovered to my chagrin that I had placed a 301 redirect from the removed page to our index page. (I know, how stupid can I be right…dumb, dumb, dumb.) Funny thing is that our 950 seemed to kick in about two weeks after I uploaded the new .htaccess.
1. Could this be playing a role in 950?
2. Did I stupidly give weight to these unsolicited inbounds by redirecting them to our main keyword page?
3. Would it do any good now to remove the redirect and let the inbounds just 404?
4. Any other suggestions?
Going to go find the nearest wall now and bang my head REAL HARD!
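On question 3, one option (the path below is hypothetical, substitute the real removed page): rather than letting the old URL 301 to the index page, return a 410 Gone so crawlers drop the URL and whatever those casino links pass stops being funneled to your main page. A sketch in .htaccess, assuming Apache mod_alias:

```apache
# Remove the old rule that funneled the unwanted inbounds to the homepage:
# Redirect 301 /old-directory/removed-page.html /

# Either leave nothing (the URL will plain 404), or state explicitly
# that the page is permanently gone:
Redirect gone /old-directory/removed-page.html
```

Whether a 404 or a 410 matters to the penalty is anyone's guess; this only stops the redirect from passing anything on.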
This is how stupid the 950+ penalty is:-
One of our sites has a very detailed and highly researched section on “Blue Widgets”. The section is referenced by some .gov sites, has some great links to it and is often found listed on various bookmarking sites. I like to think it’s a great piece of researched information, probably the best on the net.
The section is detailed textual content with an index on each page for the 14 pages related to the subject matter, sections relating to “Blue widgets” like:-
How to yada yada a blue widget link
Yada yada link
Using a widget link
In fact half of the 14 links to information contain either the keyword “blue” or “widget”, but in every case they are relevant, and without them the links wouldn’t make sense to the end user. Each page has a title tag stating the page name, with/without the keywords as applicable, and the pages have the standard heading tags, again the same.
Currently the page ranks in both MSN and Yahoo as the authority on “Blue Widgets”, yet in Google it’s in the 950+ bin. Meanwhile in Google some two-bit sites have mashed together a one-page item about blue widgets with some affiliate links on it, and they rank top of the serps in place of my authority section.
The result is that for the page to rank well in Google, I now find that I need to do SEO for Google, i.e. de-optimise the layout, reduce/remove some of the pages and the links to them, etc., in order to rank.
Prior to the introduction of the 950, Google would put up the most relevant page of your site; now it looks at it and says, if you have many pages about a subject it must be spam, so let’s 950+ it.
The problem now is do you reduce the quality of your subject matter and work on seo for google or do you leave it and concentrate on doing what’s good for your visitors?
– You can have the best website on the planet, but if no one finds it in Google you are wasting your time – so Google is now actively saying to webmasters “spend time on giving us what we want rather than your visitors if you want to be listed”.
Frankly it’s a Joke
|Prior to the introduction of the 950, Google would put up the most relevant page of your site; now it looks at it and says, if you have many pages about a subject it must be spam, so let’s 950+ it. |
I've had the opposite experience, so I can't help wondering how you can be so certain of the reason for your drop in rankings (or your -950 penalty, if that's what it is). Is it possible that something else might be afoot?
Pure HTML pages are usually safer than modern pages. Google seems unable to keep up with technology. Maybe they even rate pure HTML differently, since the owner is more likely less technology-savvy and not a threat.
Add to that HTML errors and FrontPage and you have the mom-and-pop scenario.
Note the “less LIKELY”... exceptions are part of the rule.
But it's an interesting question how many 950'd sites have pure FrontPage-et-al. HTML pages, 1998 style. I like to hand code, sometimes 1995 vi style, although not on the affected site, which is mostly straightforward LAMP. The HTML homepage still ranks #1.
matt: In the Stone Temple interview you can read that they don't penalize sites for different HTML/coding.
|In the Stone Temple interview you can read that they don't penalize sites for different HTML/coding. |
Yes on the validating point, but not on the technology point; I didn't read that there. For example .html vs .php vs .jsp.
A guess from me, without data, would be that something coming from an MS server is likely less spammy, in a malicious sense ;), than something from an Apache server.
My general impression of the serps is that .html seem to do rather well. But I am no SEO ..
Matt, my sites are all hand coded HTML and CSS and I have still lost some pages to the 950 regions. So I'm not sure how modern the site is matters.
Rich, you describe exactly my frustration. It doesn't seem to matter how well researched and written the page is; it can still be struck. That's why I think it's related to this phrase-based thing, and sometimes it just doesn't make sense at all what can be wrong. I have taken off all but the bare minimum of internal links on a few pages in order to get them back. It seems it was something about the related internal links, but I can't tell exactly what, so I'm just not doing as much internal linking. I hate to drop internal links that would have been helpful to the visitor, but like you say, if the page can't be found in Google a lot of people searching for a topic will never find it. Instead they may well find one of those annoying MFAs with a minimum of real information and some automated links.
Interestingly my site (950'd) contained no internal links within the content. The only internal links were in the navigation - can't really do without those, but I've made sure all the destination pages are completely on topic, i.e. a blue widget link points to a page that talks about blue widgets only. I've not mentioned any other widgets on that page (red, green or yellow).
It seems hard to find a common reason for all these 950'd sites...
What does Google's site: operator tell you? Is your index page the first one listed there?
In my case a different page was #1 in the site: listing and was shown as the page with the highest PR in the Webmaster Tools console...
And that was caused by a sudden growth of inbound links to that particular page. Now that the problem is fixed (the link is pointing to the domain, not to that particular page anymore), my index page is listed at #1 again in the site: listing and things are starting to get slightly better.
I can't really believe stripping internal links to the bare minimum is what Google wants to find... but who knows?
Just a suspicion, has anyone seen similar things?
Okay, then it's time for another post repeating myself. First of all, there is no -950 penalty in the sense of a single factor causing this; we have concluded this before. New filters include the scraper/spam detection, which looks for "too many themes" or "money phrases". Another thing: while TrustRank is still being passed for irrelevant keywords in the anchor, and/or for another theme than the source / anchor / target is relevant to, it gets overruled by another relevancy check. Practically speaking, it's not generic anymore. The third thing is that if a site ( a trusted site ) tries for a money / competitive / popular phrase it is not trusted for ( doesn't have trust on this theme, either to the page, or to the homepage, or the homepage doesn't pass it on with a relevant anchor ), it won't be able to rank anymore.
- If your page has 100 .gov, .edu and other trusted links to it, and YOUR page has the keyphrase in the title and its content, and is -950, then either the ANCHOR TEXT does NOT have it, not a single instance, or the pages that link to you are not relevant in the eyes of Google, i.e. are not linked to with these words, not even a derivation, perhaps not even on theme, not a single instance.
- If your page is NOT linked from the outside, it relies on navigation for relevancy. Let's say the .gov sites link to the homepage, with a generic anchor that's relevant to it. If the internal links pointing to the subsection that's having problems aren't relevant to this phrase ( the anchor text isn't a match, the homepage isn't relevant to the subpage, or should be but the theme isn't recognized ) they will NOT BE ABLE to rank for competitive keyphrases.
Example: a page is linked to from a .gov site with "My-city Information". The .gov page it's linked from is not itself linked to with either of these words. Not a single reference shows that it would be relevant for them, except on-page factors. Three months ago, this would have been enough for it to be able to pass on the parameters to the page it links out to. Now it ain't.
Example: the homepage is linked to with "My-city Information", to which both the source and the target are relevant. The homepage then links to a subpage with "Abracadabra", which is a competitive phrase, but Google doesn't "get" the relation. It sees the anchor as irrelevant to "My-city Information", and thinks to itself that a high-TrustRank page is trying for something off-topic. As far as the algo is concerned the source page could be about "Buy Widgets", for it's just as "irrelevant". The page won't rank.
- TrustRank and relevancy are tied together. This is a new thing.
Being relevant for a competitive phrase has the criteria of:
- Having off-page factors supporting the page
- Either internal or inbound links
- Page having the phrase in its content
- If you want to rank well, then have it in the title too
IF your page has a competitive phrase,
But doesn't have the support of its incoming anchor text,
or Theme isn't recognized by the AI,
or Words don't appear on page,
or Not a single link refers to the page with the exact keyphrase,
or The natural linking pattern is outmatched parameter-wise by a 3rd party attack,
or The page has another 300 competitive phrases that it doesn't have support for,
or It has 300 off-topic outbound links which you forgot that are TEXT and analyzed all the same for relevance...
Your page will either end up -950 for the given phrases, or drop out. Have the problematic phrase(s) in your navigation, and your entire site will end up -950.
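The checklist above can be sketched as a toy filter. Every name and rule here is invented purely to illustrate the theory in this post; nothing is confirmed about Google's actual algorithm:

```python
# Toy model of the relevancy check described above. All names and
# rules are invented for illustration, not taken from Google.

def can_rank(phrase, page_text, inbound_anchors, source_page_themes):
    """True if, under this toy model, a page could rank for `phrase`.

    inbound_anchors:     anchor texts of links pointing at the page
    source_page_themes:  what each linking page itself is about
    """
    if phrase not in page_text:
        return False  # words don't appear on the page
    if not any(phrase in anchor for anchor in inbound_anchors):
        return False  # not a single anchor supports the phrase
    # at least one linking page must itself be on-theme for the phrase
    return any(phrase in theme for theme in source_page_themes)

# On-page text, a matching anchor, and an on-theme source: rankable.
can_rank("blue widgets",
         page_text="everything about blue widgets",
         inbound_anchors=["blue widgets", "click here"],
         source_page_themes=["blue widgets directory"])  # True

# The same page linked only with "click here" from an off-topic
# page fails the check, however trusted the linking site is.
can_rank("blue widgets",
         page_text="everything about blue widgets",
         inbound_anchors=["click here"],
         source_page_themes=["genetic engineering on mice"])  # False
```

The point of the sketch is only that each condition in the list is a hard gate: fail any one and, per the theory, the phrase is lost.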
Have too many colliding themes, a lot of targeted phrases that you don't have the relevancy for, and you'll get another penalty.
Until Google figures out an AI that can tell when "US" means "we" and when it means "United States" ( depending on the context ), you'd better think twice before using it in the website navigation ( not kidding ).
Don't use too generic keyphrases.
Don't use too-specific ( unrecognized ) phrases that include a word which has a high threshold for TrustRank. ( i.e. if you have a blog about New York, don't use the phrase Posts from New York ).
Stop considering TrustRank as if it were the generic offset value it was back in 2005-06. This is a hack ( it's not coded into the algo yet ) and that's why a runtime "reranking" or "penalty" or "filter" or "insert term here" has to do the job. It grabs the results that the so-far algo produced, and based on the judgement of this add-on program, it will get rid of pages which are either not linked to, are linked to but not with a relevant anchor text, are linked to properly but the pages they're linked from are irrelevant ( or not linked to properly ), or are "outlinked" by an attack that washed away their natural linking pattern and made them more relevant for another theme that's not even on the page. ( unlikely to happen with a PROPERLY set up link profile )
Not sure if this was to battle bought links, but it must do a pretty good job at it nonetheless. Although it wipes out a lot of legit sites in the process.
And makes proper SEO even more necessary.
Well, anyway, sorry annej that I haven't replied earlier. Yes, having the phrase on the homepage ( if it's relevant to the homepage ) is better, even if it's but a link to the proper subsection. Making the source more relevant to the target will pass more parameters with the anchor too.
Heh, you know what I think. They randomly pick some sites and take them out to create some confusion. Rule through fear.
There seem to be too many different reasons, and people keep guessing with crazy theories. This being one of them :)
<tin foil hat off>
[edited by: Crush at 1:28 pm (utc) on May 3, 2007]
It makes sense not to rank your travel site just because it's linked with "click here" from harvard.edu. Or with "Travel My-city" from deep within the same domain, but from a page about genetic engineering on mice.
Or not to allow your 30,000-IBL electronics retail site to be the no. 1 result for whatever it wants, just by adding a new but irrelevant item into its navigation saying "NBA tickets". As I've said before, TrustRank became thematic in its effect.
Look up some SERPs, check the sites -950'd, look at their linking profile, the pages they're linked from, the anchors they're linked with, and their internal navigation.
There wasn't a single point in the above post that I didn't test. Have you tested it?
[edited by: Miamacs at 1:45 pm (utc) on May 3, 2007]
|Heh, you know what I think. They randomly pick some sites and take them out to create some confusion. Rule through fear. |
There seems to be too many different reasons and people keep guessing with crazy theories. This being one of them
I like that conspiracy theory ;-)
Should be a simple thing to add to the algo... allow a random selection of sites to be downgraded now and then simply to throw people off balance. Imagine the consequences - nobody would have any real claim to knowing SEO!
Miamacs, your observations line up with mine all the way. An excellent summary, and thanks for sharing such detailed results from your tests.
Now if only I could get searchers to use "USA" and not "US" ;)
Miamacs, thank you very much for your summary...
One thing I wonder about: if you put your phrases and keywords into anchors and titles and the content as well, couldn't that end up being seen as some kind of keyword stuffing by Google, which would probably cause another penalty?
Sites could easily end up "over-optimized". Let's take a look at how the average non-SEO webmaster launches a new site...
Business sites often introduce themselves with some general marketing blah like "We provide the right solutions for your problems"; some pages deeper, maybe, they let us know that exactly this blue widget would be the right solution.
I observe many competitors who seem to violate all your posted points and still stick like glue to their top positions without having seen any 950 penalty.
In the webmaster guidelines published by Google you can read that you should build up your site as if there were no search engines, let's say in a "natural" way. Respecting all your points would mean that only those sites will rank well that respect these restrictive rules; otherwise they're 950'd. This is something I can't see when taking a look at the well-ranking competitors mentioned previously, and that keeps me still in doubt about what to do next.
To me things are not that clear, especially since sites came back without having anything changed...
[edited by: Pancho at 3:52 pm (utc) on May 3, 2007]
Has anyone noted a change in the SERPs yesterday and today?
My -950 site is now at position 814 for various primary keywords and has fewer supplemental results than last week.
Is the filter giving us a little bit of air?
Yes, I can see the same...
I never had any page go supplemental...
Not only does the anchor text which links to your site need to be relevant to your theme, but additionally the theme of the linking page needs to fit your theme *exactly*. It's important that the linking page contains your theme in the title.
Well, I got 1200 IBLs from 9 years of organic growth, from pages about widgets to my widget site.
Besides the 500 domain spammers that have something in it from 1997... which was right then but not now. :\
I read through your long post Miamacs and I fail to see how a site can recover from receiving too many links (from external sites) with relevant anchor text but an irrelevant theme for the linking page. For example, if I receive a bunch of links with "blue widgets" as the anchor text but from an off-topic site, it gives my site a -950 penalty.
My site is all about "blue widgets" and I am linked to from authority sites for "blue widgets", yet these hundreds of off-topic spam links pull my site down. Short of emailing the off-topic sites that link to me and asking them nicely to please remove the link to my site, there isn't anything I can do.
Excellent summary Miamacs - it really helps...
|how a site can recover from receiving too many links (from external sites) with relevant anchor text but an irrelevant theme for the linking page |
There's some anecdotal evidence that a few more links from on-theme pages can push you back under the penalty threshold. In other words, it's more of a ratio than an absolute value.
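If that's right, the ratio view can be sketched in a few lines. This is a toy calculation only; the threshold and every name in it are invented here, not taken from anything Google has said:

```python
# Sketch of the "ratio, not absolute value" idea. The numbers and
# names are made up for illustration; the real threshold (if one
# exists at all) is unknown outside Google.

def on_theme_ratio(inbound_links, theme):
    """Fraction of inbound links whose source page mentions the theme.

    inbound_links: list of (anchor_text, source_page_theme) pairs.
    """
    if not inbound_links:
        return 0.0
    on_theme = sum(1 for _anchor, source in inbound_links if theme in source)
    return on_theme / len(inbound_links)

links = [
    ("blue widgets", "blue widget reviews"),   # on-theme
    ("blue widgets", "online casino bonus"),   # off-theme spam
    ("blue widgets", "poker rooms"),           # off-theme spam
]
print(on_theme_ratio(links, "blue widget"))    # ≈ 0.33

# Under the theory you recover not by removing the spam links
# (you usually can't) but by earning enough on-theme links to
# push the ratio back over whatever the threshold is:
links.append(("blue widgets", "blue widget tutorials"))
print(on_theme_ratio(links, "blue widget"))    # 0.5
```

That would explain why adding a few good on-theme links can lift a page back out, without anyone ever touching the bad ones.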
errorsamac, I suggest a trusted high PR on-theme page link.
Just for clarification for broader sites, a quick example as a question: If you have a site about authors and then you have 1000s of subpages for each author, then what's thematically correct? Do you really need 1000s of specific and deep IBLs from 1000s of author fan sites to not be pushed over the penalty edge? So even though your entire site is dedicated to various authors some individual author pages are deemed irrelevant because there's no IBLs to them, only to other different authors?
In the previous incarnation of G this wouldn't have happened as thematically your site was about "authors" and you had enough TrustRank to justify the quality of the pages without IBLs. Is it the case now, or am I misreading the situation?
Here's one I would like to get opinions about. Not my site, but one that I find very interesting. Actually in the result set for this keyword there are a couple of interesting 950 examples... including some Fortune 500 companies suffering. I'll do my best to not violate TOS, but I'll be getting close with this post... so sorry in advance if I cross the line.
Do a search for "car loans". (no quotes) At about result 15 or so, you will find the .com for the keyword. Also note the Capital One homepage, and another domain for Capital One.
Now, just as an aside, if you check the 950 results for that query, you will find the main page about car loans that Capital One links to through its navigation. Obviously this internal page is suffering from the 950.
But.. now do a search for "auto loans" (no quotes) and go to the 950 area. You will see that .com domain for the original search term "car loans" there. Now, I have enough experience in the auto field to know that google makes no real distinction at all between the words "car" and "auto". I have always ranked about the same for any query using either term in substitution for each other... at least in the insurance niche.
So... why would the .com be suffering from the 950 for the query "auto loans", but be a top 20 result for "car loans"? I know that being the .com for the query helps it for that search term, but I can't explain the 950 on the synonym.
Just thought these two would be good examples to throw out for discussion.
For the record, the Google Forum Charter [webmasterworld.com] prohibits "The mention of particular search terms" and "Any information that leads people to find a specific site, yours or anyone else's."
In the interest of this particular thread I am going to allow a one time exception to this policy and discuss the above information. It may help some light bulbs go on for people who are looking closely at their own market but would benefit by seeing a bigger picture. It may also help the doubting Thomases who still don't think that something quite odd is going on.
But please, no other search terms in this thread -- thanks.
I'm wondering, just how tight do people think this theming has to be? Is it good enough that pages about widgeting link to your widgeting pages, or does it have to be "red wiggly widgeting" links from other "red wiggly widgeting" pages to count for anything?
Sorry to disappoint, but theming has nothing to do with it. I have dozens of sites with links from non-themed pages, literally hundreds of thousands of links. Only one site I have has the -30.
You all give Google too much credit, IMO. Google is a machine that indexes pages and recognises content. How can it possibly theme? It would need to know what every page means semantically. If I have a site on "cool links" and I link to everything on different topics because the sites are "cool", there is no theme there, or at least only one that a human could recognise.
The Google engineers must love reading this wild speculation. If you messed with blackhat you would understand why theming does not exist.
[edited by: Crush at 5:38 am (utc) on May 4, 2007]
Re: auto vs. car
"auto" is ambiguous; "car" isn't.
The algo doesn't seem to get that an auto loan is unlikely to be an automatic loan.
It all comes down to the fact that AI isn't yet that advanced in general, and especially not when applied to billions of pages.
We have a site about widgetery; people link to us because it's the main site about widgetery AND all its subthemes. Yet only the term widgetery remains on page 1; all the subthemes are 950'ed.
This algorithm doesn't seem to get the concept of subthemes, then...
"Do you really need 1000s of specific and deep IBLs from 1000s of author fan sites to not be pushed over the penalty edge?"
Of course not. There is nothing about what you describe concerning authors that would put you at any particular risk from the 950.