Forum Moderators: Robert Charlton & goodroi
I was just wondering if we could get an update from the people who said they recovered from the 950 Penalty. Are your sites still doing OK? Any new pages replace the old ones at the bottom of the SERPs?
And, do you still believe that the main cause for the penalty was overuse of the keyword/keyphrase on the page, or were there other issues that you feel contributed to it?
Thanks.
(My pages are still buried at the bottom, so the penalty is very much alive and well, and I suspect, lots of new sites are being added every day, they just don't know it yet. I'm seeing new sites listed at the bottom for some of the terms I watch, and they appear to be clean sites with decent PR.)
[edited by: tedster at 9:20 pm (utc) on Feb. 27, 2008]
As tedster said, Google is obviously sending us a message, but my crystal ball is not working, because the message isn't getting through.
For everything I suspect is a problem, I can find another page that disputes it:
Overoptimized --> Pages with no optimization
Stagnant URLs --> URLs that are updated regularly
Page construction --> Pages with similar construction aren't penalized
Popular search terms --> My site is a niche site, wouldn't apply
Commercial --> Personal sites also hit
Affiliate links --> Pages without affiliate links also hit
AdSense --> Pages with and without AdSense, sites with and without are hit
Directory problem --> Some pages in directory still rank OK
If Google wants us to fix our pages, they're going to have to provide more information. So far, Matt Cutts is the only one at Google who has addressed this at all, and his response was that minor but different problems are hitting our sites, and that it's not related to the Googlebombing fix.
So, I'm going on the assumption that this isn't necessarily an on page issue, but a site wide issue, where multiple minor issues spread out sporadically over the entire site are somehow impacting random pages and/or directories. So, I'm going through my site looking for things that might seem suspicious to a search engine. I'm finding things and tending to them, but none of them are major.
I'm just wondering if they're all adding up cumulatively, and penalizing some of our pages while others rank OK is Google's way of telling us to clean up the nooks of our sites.
Who knows? At least it's keeping me busy, I sure don't want to add anything new until this is straightened out. (Don't want to add to the problem if the new page has something on it that Google doesn't like.)
I have tried everything in the book, trust me. Nothing makes ANY difference. I've changed titles and descriptions on individual pages, and run hundreds (not exaggerating) of other A/B-type tests between pages in a directory. Fresh content and fresh links make no difference. Nothing makes a difference. When the directory is not penalized, ALL pages are in the top 10; under penalty they're pushed past 90, but not 950.
I doubt that it's a spidering thing, because for some people the same url will rank well for a different keyword. This keyword-specific behavior and the drastic nature of going to the end of the results are two of the most characteristic signs. The drastic quality makes me feel this might be some kind of "message" -- but what on earth is it saying, especially when some sites report that some of their urls go in and out of the "penalty"?
As I think about it, I don't think it is the spidering problem either. Too many pages seem to be still in, but only partially. Like they will come up on the first results page for some search terms and not others.
I don't think it is related to the whole site or even the whole subdirectory. Now that I am looking for these things I'm finding more pages that are the only ones missing in their directory. But the ones I first discovered were all in one directory.
If Google is trying to send a message, they are being way too subtle. I still think they are trying to eliminate spammy MFA sites and we are somehow getting caught in the net.
As I think about it, I don't think it is the spidering problem either. Too many pages seem to be still in, but only partially. Like they will come up on the first results page for some search terms and not others.
This would be true IF there were the old one index/one tiered system.
I believe that non-competitive keywords can compete quite well from the Secondary Index, while for more competitive keywords, the page MUST be present in Google's Primary Index in order to place well in the results.
As I have stated previously, I think site:search results and such are drawn from the Secondary Index.
Caryl
Nick
I've changed the anchor text on my contents pages to be different from either the page title or the H1 article title. It's hard to know if this is why some pages are coming back, as some still have not.
The one experiment that I was hoping would help was taking the phrase I thought might be the problem completely off of a missing page. I did that days ago, and I just checked and found the page is still languishing down at 990. The page was spidered a few days ago and the cache date is two days old, so the problem must be something else.
One of the interesting factors I note is that the url continues to be shown -- but at the very end. That is, they do not end up beyond 1,000. This is partly why I brought up the similarity to Local Rank. It seems like a second calculation is run, re-ordering the original urls for a given keyword but not discarding any.
I wasn't saying that Local Rank is "the answer", but that this end-of-results phenomenon has at least two strong parallels with LR - 1) keyword specific effects and 2) a re-ordering of the "natural" SERP. It looks to me like some secondary and keyword specific process is being folded in, one that is not exactly a filter .
Back in Part Two [webmasterworld.com] of this discussion, in message #:3233775, MHes shared some very detailed observations from a partially successful fix. In my opinion, these comments are worth a very close study by those who are struggling with this end-of-results "penalty".
We have had some success in getting pages back in, but you fix one problem and create another. It is a multi-layered issue, probably with different solutions for each site. We have managed to get some pages to rank top for some search terms whilst the site is going in and out. Getting the whole site out requires a series of changes, and time for all of them to get processed.

Most of the problems seem to be executed at run time, although offline there must be flags involved as well. The cache of our pages will show changes, but older data is being used in the ranking. The dates of the cache are often misleading; we have done a few experiments with noindex and nofollow, but despite new cache dates on pages, the changes are not in synchronisation. This suggests a degree of offline analysis.
Areas that we have had to tackle in order to get a page to move from 950 to 20th in a predictable way (despite untouched pages fluctuating between 950 and 400+) are:
1) Local rank issues. Very clear effect was seen and cured.
2) Anchor text issues. Combined with local rank it had a devastating effect.
3) Trustworthiness of link. Purely an on-page factor that affects how the link/anchor text is treated, plus (possibly) the words around the anchor text. There is a lot of 'ignoring' going on if on-page factors offend.
4) Page clustering and 'similarity'. Ruthless ignoring of pages deemed similar and potential spamming of a phrase.
Two pages unrelated to each other but in combination covering a search term in a comprehensive way will do well. This is done mainly at run time, and the &filter=0 parameter helps identify 'clusters'. Some degree of offline analysis is evidently done, but the search phrase affects the application of the filter, although adding &filter=0 will not change the effect. The offline analysis has to be done before changes will be seen in the way the clustering has been affected... this leads on to a different ranking behaviour for the page.
I think Matt was telling the truth when he said at Christmas that no new algos were present. This 950 episode is all about a combination of existing algos being tweaked to work together in a more dramatic way.
[edited by: tedster at 7:23 am (utc) on Feb. 5, 2007]
What about the way that local rank gets combined with initial rank? Perhaps this very last step is weighted more heavily for urls that diverge too dramatically from some norm, even with a threshold of some kind that says "if you're this far off the norm or further, we'll throw out your initial rank altogether."
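The re-order-then-cut-off idea above can be sketched in code. This is purely a hypothetical model of the theory being discussed in this thread, not Google's actual algorithm; the function names, the scores, and the 0.5 divergence threshold are all invented for illustration.

```python
# Hypothetical sketch: a secondary score re-orders the initial SERP.
# Pages that diverge too far from their initial standing are sent to
# the end of the results, but nothing is ever discarded -- which would
# match the observation that penalized urls stay inside the top 1,000.

def rerank(results, secondary_score, divergence_threshold=0.5):
    """Re-order an initial SERP using a secondary, keyword-specific score.

    results: list of (url, initial_score) tuples, best first,
             with scores normalized to [0, 1].
    secondary_score: function url -> float in [0, 1].
    """
    kept, demoted = [], []
    for url, initial in results:
        if abs(initial - secondary_score(url)) > divergence_threshold:
            demoted.append((url, initial))   # "thrown out" -> end of results
        else:
            kept.append((url, initial))
    # Demoted urls trail the normal ordering instead of being dropped.
    return kept + demoted

# Toy example: "b" diverges sharply from its secondary score.
results = [("a", 0.9), ("b", 0.8), ("c", 0.7)]
scores = {"a": 0.85, "b": 0.1, "c": 0.65}
reordered = rerank(results, lambda url: scores[url])
```

In this toy run, "b" keeps its entry in the list but falls behind every non-demoted url, which is the end-of-results behavior people are reporting.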
It may well depend on the "profile" and history of the site that's linking.
[edited by: Marcia at 7:46 am (utc) on Feb. 5, 2007]
How much does internal anchor text have to do with this?
Let's say I have a page about Acme Widget Model 123. The title is "Acme Widget Model 123". Anchor text in links within my site would normally say "Acme Widget Model 123" to identify the topic of that page.
However, there is a main Acme Widget page that has a link with anchor text as "123" since it's understood that the 123 refers to an Acme Widget, since that's all that page is about. Other links on that same page would be identified as "121", "122", "124", etc.
Is this a good thing? To repeat anchor text on one page as:
Acme Widget Model 121
Acme Widget Model 122
Acme Widget Model 123
Acme Widget Model 124
seems spammy to me. And inbound links from other sites, it would seem, should be "Acme Widget Model 123" since that is what the page is about. Is it best to have less uniformity and more randomness?
Are the following examples random enough:
Widget Model 121
Acme Widget 121
Widget Model 121
Acme 121 Widget
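One rough way to think about "random enough" is the ratio of distinct anchor texts to total anchors pointing at a page. This is a minimal sketch using the example anchors from above; the metric itself, and any threshold you would apply to it, are assumptions on my part, not anything Google has confirmed.

```python
# Sketch: measure anchor-text variety as distinct anchors / total anchors.
# 1.0 means every anchor is different; values near 0 mean the same
# anchor text is repeated over and over.

def anchor_diversity(anchors):
    """Fraction of distinct anchor texts among all anchors to a page."""
    if not anchors:
        return 0.0
    return len(set(anchors)) / len(anchors)

# Repeating the identical anchor four times:
uniform = ["Acme Widget Model 121"] * 4

# The varied examples from the post above ("Widget Model 121" repeats):
varied = ["Widget Model 121", "Acme Widget 121",
          "Widget Model 121", "Acme 121 Widget"]
```

Here `uniform` scores 0.25 and `varied` scores 0.75, which at least gives a concrete number to compare pages against each other.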
And I guess I don't understand how a link from an unrelated site could be beneficial. Didn't Google just roll out anti-Googlebombing to put a stop to this? So if you have a totally unrelated site about hair rollers, and it links to your Widget site, assuming a Widget has nothing to do with hair rollers, how could that be helpful?
Another question about local rank: In a competitive industry, would it not be unusual to have competitors linking to each other? So if you take the first 1,000 results on a search term, let's say in a very competitive industry, if those pages are all competing companies, it would be unusual to find a lot of linking to competition going on. After all, if you're trying to sell insurance, you don't want to refer visitors to another insurance site. So, how would local rank be applied in this case? It would seem very few peer sites would be linking to each other.
[edited by: AndyA at 3:39 pm (utc) on Feb. 5, 2007]
even with a threshold of some kind that says "if you're this far off the norm or further, we'll throw out your initial rank altogether."
That's an interesting thought. You tend to think of Google ranking as a sort of continuum: the most relevant site first, then each succeeding site ordered by how relevant the algorithm thinks it is to the query (or attempts to be, anyway).
Here you may have a situation where, if your score adds up in a certain way, there's a cutoff and you get discounted to the very end; it doesn't matter how close you came to the cutoff, the result is the same: end of the line.
The question is why? What's the point of doing it that way? We are all seeing sites sent to the very end that would in fact be more relevant, and of greater benefit to the searcher, than at least 700 or 800 other sites. I'm appreciative of how Google goes about their business, and certainly over the long haul the results have been impressive, but, at least to me, this approach has caused a real decline in the quality of the rankings.
Perhaps there's more to come, but it's beginning to look like this is a permanent part of the ranking process.
I know I've found one phrase that is in several though not all of my missing pages.
If it is indeed this I think there is a thin line between whether the page is penalized or not. Looking over my pages it seems to be more damaging when the phrase is in the page title, H1 title and/or internal anchor link. In fact one page that is doing well has the phrase more times in it than any of the penalized pages but the phrase is not in any of those places I listed above.
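The placement observation above - the phrase apparently mattering far more in the title, H1, and internal anchors than in body copy - could be checked mechanically across a site. This is a small sketch only; the field names and page structure are invented for illustration, and the idea that these placements trigger the penalty is a theory from this thread, not a confirmed factor.

```python
# Sketch: report which "prominent" fields of a page contain a given
# phrase, so pages can be compared against penalized/non-penalized lists.

PROMINENT_FIELDS = ("title", "h1", "internal_anchors")

def risky_placements(page, phrase):
    """Return the prominent fields of `page` (a dict) containing `phrase`."""
    phrase = phrase.lower()
    hits = []
    for field in PROMINENT_FIELDS:
        value = page.get(field, "")
        if isinstance(value, list):            # e.g. a list of anchor texts
            value = " ".join(value)
        if phrase in value.lower():
            hits.append(field)
    return hits

# Hypothetical page record for the Acme example used earlier:
page = {
    "title": "Acme Widget Model 123",
    "h1": "Acme Widget Model 123",
    "internal_anchors": ["Acme Widget Model 123", "123"],
    "body": "Acme Widget Model 123 is our most popular widget.",
}
```

Running this over a set of pages would show whether, as reported above, the penalized pages are exactly the ones where the phrase appears in all three prominent fields.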
My rationale is that there may be sites that rank well for many terms including terms that are not accurate.
Since Google's algorithm is far from perfect, they are most likely inclined to hide a site that should not rank for even one single inaccurate term, so they penalize the site for many or all of its terms and drop it to the bottom of the results.
That's one theory.
If the sites here all ranked for extremely competitive terms, and a few that it made no sense to rank for, then this is one possibility.
The site has a lot of unnecessary font tags, and I have been able to reduce the size of the HTML on some pages from 41kb to 12-13kb (or less) just by cleaning up the HTML, which gives you an idea of how bad some of the pages are. I'm going to go back and retrofit css on these pages as well, which should reduce the sizes even further, but they do not feel that alone is causing the issue, as other code heavy sites still rank well.
They advised me to add a paragraph or two to my home page telling what the site is about. I have bullet points there now, but they feel a bit more spider food on this important page will be helpful. I think someone else mentioned this in an earlier post here as well. So, I plan on doing that.
They suggested reducing the number of keywords and keyphrases, even though the pages that have been hit read well and don't seem to use those words too often. They said a human reviewing them would likely not think I had too many uses of the words, but to an algorithm it could raise some flags.
About 200 pages were found with no meta descriptions, although they did have unique page titles. I knew about those, and am adding the descriptions as fast as possible.
Overall, they said since my site does have some #1 SERPs in Google, and some of them for fairly competitive terms, and since it is doing well in MSN, Yahoo, Ask, etc., that tells them it's close to being what it needs to be and the site as a whole isn't being heavily penalized. They fell short in offering anything specific as to why exactly some of the important pages are at the bottom of the SERPs, other than to advise me to continue cleaning up the site and slowly adding new content to it.
We discussed the Matt Cutts post on his blog where he indicated without actually saying so in so many words that the bottom of the results thing was due to multiple minor issues on the site, which impact its overall value to searchers, such as pages that aren't finished that don't provide the information that they should, lack of unique content, and offering nothing special, just a repeat of info that lots of other sites already offer.
So, without anything new and specific to check, I'm going to just keep doing what I've been doing.
I get the feeling that with the new links tab in Webmaster Tools, Google has diverted attention somewhat from some of the issues (like this one) with its SERPs. It's kind of like, "Look! Over there! A bright shiny thing. See?"
At any rate, I would sure appreciate any additional information or ideas on the 950 Penalty, as it appears to me some sites are getting out of it without making any changes whatsoever, and I am still seeing new sites added every day, as well as sites like mine that appear to just have pages stuck there. This is apparently something we're going to have to live with going forward, unless there is a problem with Google's algo and it's taking them a while to roll out a fix.
So, if this is indeed a 950 Penalty, why in the heck would adding one more word to the search remove the penalty completely?
I know there must be something to learn from this, but I sure don't get it.
It would suggest that the page is being penalised with respect to specific phrases only, not a general penalty. That would imply that the problem is specific to the way the page uses the phrase. This could be either the page content or backlinks.
I see the same thing for my pages: "widget" is penalised, but the page lists number-one for "widget reviews".
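This "penalized phrases, not penalized pages" pattern can be modeled as a per-query demotion rather than a page-wide one. Purely a hypothetical illustration of the reports in this thread; the 950 constant and the flagged-phrase set are assumptions, not known Google internals.

```python
# Sketch: a page keeps its natural rank for most queries, but is pushed
# to end-of-results only for queries that exactly match a flagged phrase.
# This would explain why adding one more word to the search ("widget
# reviews" vs. "widget") appears to lift the penalty completely.

def effective_rank(base_rank, query, penalized_phrases, end_of_results=950):
    """Return the rank a page shows for `query` under a phrase-level demotion."""
    if query.lower() in penalized_phrases:
        return end_of_results
    return base_rank

# "widget" is flagged for this page, but the longer variant is untouched:
penalized = {"widget"}
```

Under this model, `effective_rank(1, "widget", penalized)` lands at the end of the results while `effective_rank(1, "widget reviews", penalized)` keeps the number-one spot, matching the observation above.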
I must confess I did find two pages that had identical meta descriptions to another page in that section. But I can’t understand how this would have affected so much of my site.
why in the heck would adding one more word to the search remove the penalty
We’ve been talking about penalized pages but I think it’s more a matter of penalized phrases. Or that certain phrases or combinations of phrases affect it somehow.
With some of my missing pages, I've been able to find other search phrases related to the article that put the page back in the top 10 -- phrases other than the page's main topic. I was hoping that would help me figure out what I've done wrong on the missing pages, but I can't make any sense out of it.
That's exactly what I'm seeing. A combination of having half of my site removed from the index and 90% of the rest put at the bottom of the SERPs.
The bottom line is that just about every term I *ranked* for is now end-of-results, even if there was no link anchor or optimization for those words. It's like Google took a set of related words that it probably penalized me for and dumped me for the whole lot of them.
I mentioned this in the other thread, but no one has said anything about the lack of Googlebot activity. I'm seeing it barely crawling my site now, with just a few visits/pages accessed every day... and declining.
If a site has a couple thousand pages and only a couple dozen outgoing links, are you saying that increasing the number of outgoing links might actually improve the rank in Google?
I never thought that too high of a percentage of internal links could hurt a website's ranking, but maybe it's possible.
----------
I wanted to point out that the discussion in our February 2007 Google SERP Changes [webmasterworld.com] thread now has several reports of what sounds like this "950 penalty", but the drop in ranking is less severe - only 120, or 300, or 400, etc. Everything else sounds the same - keyword related, previously top ranked url, etc.
I think this reinforces the idea of a "last step re-calculation of ranking" being involved. That theory didn't make as much sense if every recalculated ranking was thrown to the end of results.
No links to it, nothing.
Well, I just found that some scraper sites had linked to it, and there were actually some 'dubious' posts. Not many, like 5 at most, but still.
I deleted it now, but perhaps it's one more of those 'minor' things that Cutts was talking about.
I've gone through my site looking for other junk things that were uploaded and never used, and removed a couple.
Something for others to consider: check any old scripts / uploads! They may be indexed.
I know words like "free," "as seen on TV," etc., can be a problem but I have nothing like that on my site.
I have been going over it pretty thoroughly cleaning up very minor things, but I haven't found anything even remotely serious so far.
< discussion continues here: [webmasterworld.com...] >
[edited by: tedster at 9:09 am (utc) on Feb. 8, 2007]