This 60 message thread spans 2 pages.
Google's 950 Penalty - Part 3
What do we know about it, and how do we get out of it?
< continued from [webmasterworld.com...] >
< related threads: -950 Quick Summary [webmasterworld.com] -- -950 Part One [webmasterworld.com] >
I was just wondering if we could get an update from the people who said they recovered from the 950 Penalty. Are your sites still doing OK? Any new pages replace the old ones at the bottom of the SERPs?
And, do you still believe that the main cause for the penalty was overuse of the keyword/keyphrase on the page, or were there other issues that you feel contributed to it?
(My pages are still buried at the bottom, so the penalty is very much alive and well, and I suspect, lots of new sites are being added every day, they just don't know it yet. I'm seeing new sites listed at the bottom for some of the terms I watch, and they appear to be clean sites with decent PR.)
[edited by: tedster at 9:20 pm (utc) on Feb. 27, 2008]
I've seen Googlebot spider my pages that are hit with the 950 Penalty. I keep watching to see if they recover, but they don't.
As tedster said, Google is obviously sending us a message, but my crystal ball is not working, because the message isn't getting through.
For everything I suspect is a problem, I can find another page that disputes it:
Overoptimized --> Pages with no optimization
Stagnant URLs --> URLs that are updated regularly
Page construction --> Pages with similar construction aren't penalized
Popular search terms --> My site is a niche site, wouldn't apply
Commercial --> Personal sites also hit
Affiliate links --> Pages without affiliate links also hit
AdSense --> Pages with and without AdSense, sites with and without are hit
Directory problem --> Some pages in directory still rank OK
If Google wants us to fix our pages, they're going to have to provide more information. So far, Matt Cutts is the only one at Google who has addressed this at all, and his response was that minor but different problems are hitting our sites, and that it's not related to the Googlebombing fix.
So, I'm going on the assumption that this isn't necessarily an on-page issue, but a site-wide issue, where multiple minor issues spread sporadically over the entire site are somehow impacting random pages and/or directories. So, I'm going through my site looking for things that might seem suspicious to a search engine. I'm finding things and tending to them, but none of them are major.
I'm just wondering if they're all adding up cumulatively, and penalizing some of our pages while others rank OK is Google's way of telling us to clean up the nooks of our sites.
Who knows? At least it's keeping me busy, I sure don't want to add anything new until this is straightened out. (Don't want to add to the problem if the new page has something on it that Google doesn't like.)
JerrryRB - Do pages in the new directories have similar keyword density as the ones that are "penalized"? And are you searching for them with the same keyword combinations, or at least keywords that are just as popular/competitive?
Keyword densities are the same site-wide. I actually reduced my main keyword density site-wide about 3 weeks ago, and it helped, but on Yahoo. LOL. The searches are very similar, the same term just a different type of "widget".
I have tried everything in the book, trust me. Nothing makes ANY difference. I've changed titles and descriptions on individual pages, and run hundreds (not exaggerating) of other A/B type tests between pages in a directory. Fresh content and fresh links make no difference. Nothing makes a difference. When the directory is not penalized, ALL pages are in the top 10; under penalty they're pushed past 90, but not to 950.
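Since keyword density keeps coming up in this thread, here is a minimal sketch of how you might measure it yourself when comparing penalized and non-penalized pages. The function name and the per-100-words formula are my own choices, not anything Google has published:

```python
import re

def keyword_density(text, phrase):
    """Rough keyword-density check: occurrences of a phrase
    relative to the total word count of the page text,
    expressed as phrase-words per 100 words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    # Count every position where the full phrase appears in sequence.
    hits = sum(
        1
        for i in range(len(words) - n + 1)
        if words[i:i + n] == phrase_words
    )
    total = max(len(words), 1)
    return 100.0 * hits * n / total

sample = "Acme widget reviews. The acme widget is a popular widget."
print(keyword_density(sample, "acme widget"))  # → 40.0
```

Running the same check over every page in a directory would at least tell you whether the penalized pages really do differ from the ones still ranking.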
|I doubt that it's a spidering thing - because, for some people the same url will rank well for a different keyword. This keyword-specific behavior and the drastic nature of going to the end of the results are two of the most characteristic signs. The drastic quality makes me feel this might be some kind of "message" -- but what on earth is it saying, especially when some sites report that some of their urls go in and out of the "penalty"? |
As I think about it, I don't think it is the spidering problem either. Too many pages seem to be still in, but only partially. They will come up on the first results page for some search terms and not others.
I don't think it is related to the whole site or even the whole subdirectory. Now that I am looking for these things I'm finding more pages that are the only ones missing in their directory. But the ones I first discovered were all in one directory.
If Google is trying to send a message, they are being way too subtle. I still think they are trying to eliminate spammy MFA sites and we are somehow getting caught in the net.
Hrm, could this be related to over-optimising inbound anchor text...?
[edited by: Nick0r at 7:42 pm (utc) on Feb. 4, 2007]
|As I think about it, I don't think it is the spidering problem either. Too many pages seem to be still in, but only partially. They will come up on the first results page for some search terms and not others. |
This would be true IF there were the old one index/one tiered system.
I believe that non-competitive keywords can compete quite well from the Secondary Index, while for more competitive keywords, the page MUST be present in Google's Primary Index in order to place well in the results.
As I have stated previously, I think site:search results and such are drawn from the Secondary Index.
"Do I dump my current directories and create new directories and new pages in those directories?"
Creating new pages on the same topic in a different location on the website usually leads to the new pages being penalized, and I only say "usually" because it has been 100% of the time in my experience.
Ahh, I see it now. I can get all my missing pages with site:search. Some of my other search phrases could be in the same situation.
I've changed the anchor text on my contents pages to be different from either the page title or the H1 article title. It's hard to know if this is why some pages are coming back as some still have not.
The one experiment that I was hoping would help was taking the phrase I thought might be the problem completely off of a missing page. I did that days ago, and I just checked and found the page is still languishing down at 990. The page was spidered a few days ago and the cache date is two days ago, so the problem must be something else.
"Honi soit qui mal y pense" ("Shame on him who thinks evil of it")
Interesting experiment, annej. Very suggestive of a backlink issue, no? In the one case I've heard of where a site seems to have escaped from this end-of-results thing, a few new backlinks that included the keyword were one of several steps they took.
One of the interesting factors I note is that the url continues to be shown -- but at the very end. That is, they do not end up beyond 1,000. This is partly why I brought up the similarity to Local Rank. It seems like a second calculation is run, re-ordering the original urls for a given keyword but not discarding any.
Ted, can you explain a little more about how Local Rank might be involved? I understand how it works, more or less, but don't see how it would be involved here.
In Local Rank, the algo first turns out an initial set of 1,000 results for a keyword. Then all backlinks that do not come from within those 1,000 results are thrown away, and a new ordering is created as if this small subset of sites were the only sites that exist. It's a sort of "jury of your peers" thing, and those results can then be factored into the original ordering of the SERP, boosting or devaluing any given url.
I wasn't saying that Local Rank is "the answer", but that this end-of-results phenomenon has at least two strong parallels with LR: 1) keyword-specific effects and 2) a re-ordering of the "natural" SERP. It looks to me like some secondary and keyword-specific process is being folded in, one that is not exactly a filter.
Back in Part Two [webmasterworld.com] of this discussion, in message #:3233775, MHes shared some very detailed observations from a partially successful fix. In my opinion, these comments are worth a very close study by those who are struggling with this end-of-results "penalty".
|We have had some success in getting pages back in but you fix one problem and create another. It is a multi layered issue, probably with different solutions for each site. We have managed to get some pages to rank top for some search terms whilst the site is going in and out. In order to get the whole site out requires a series of changes and time for all these to get processed. |
Most of the problems seem to be executed at run time, although offline there must be flags involved as well. The cache of our pages will show changes, but older data is being used in the ranking. The dates of the cache are often misleading; we have done a few experiments with noindex and nofollow, but despite new cache dates on pages the changes are not in synchronisation. This suggests a degree of offline analysis.
Areas that we have had to tackle in order to get a page to move from 950 to 20th in a predictable way (despite untouched pages fluctuating between 950 and 400+) are:
1) Local rank issues. Very clear effect was seen and cured.
2) Anchor text issues. Combined with local rank it had a devastating effect.
3) Trustworthiness of link. Purely an on-page factor that affects how the link/anchor text is treated, plus (possibly) words around the anchor text. There is a lot of 'ignoring' going on if on-page factors offend.
4) Page clustering and 'similarity'. Ruthless ignoring of pages deemed similar and potential spamming of a phrase.
Two pages unrelated to each other but in combination covering a search term in a comprehensive way will do well. This is done mainly at run time, and the &filter=0 parameter helps identify 'clusters'. Some degree of offline analysis is evidently done, but the search phrase affects the application of the filter, although adding &filter=0 will not change the effect. The offline analysis has to be done before changes will be seen in the way the clustering has been affected.... this leads on to a different ranking behaviour for the page.
I think Matt was telling the truth when he said at Christmas that no new algos were present. This 950 episode is all about a combination of existing algos being tweaked to work together in a more dramatic way.
[edited by: tedster at 7:23 am (utc) on Feb. 5, 2007]
Here's some more speculation. One related type of calculation might be to look at "local links" from a different angle - such as whether there are too many "only local" links in the backlink profile (an overly incestuous linking profile). Or perhaps the anchor text is also one of the criteria that gets factored in at this stage.
What about the way that local rank gets combined with initial rank? Perhaps this very last step is weighted more heavily for urls that diverge too dramatically from some norm, even with a threshold of some kind that says "if you're this far off the norm or further, we'll throw out your initial rank altogether."
An on-topic link from an on-topic page on a site that doesn't rank for the particular word or phrase in question and isn't specifically on the topic of the target site can still pull a page out of this penalty. I've seen it happen.
It may well depend on the "profile" and history of the site that's linking.
[edited by: Marcia at 7:46 am (utc) on Feb. 5, 2007]
I would add that local rank may be applied to a cluster of pages within the same site. The old rules of local rank may now apply within a site as well.
Regarding local rank and linking:
How much does internal anchor text have to do with this?
Let's say I have a page about Acme Widget Model 123. The title is "Acme Widget Model 123". Anchor text in links within my site would normally say "Acme Widget Model 123" to identify the topic of that page.
However, there is a main Acme Widget page that has a link with anchor text as "123" since it's understood that the 123 refers to an Acme Widget, since that's all that page is about. Other links on that same page would be identified as "121", "122", "124", etc.
Is this a good thing? To repeat anchor text on one page as:
Acme Widget Model 121
Acme Widget Model 122
Acme Widget Model 123
Acme Widget Model 124
seems spammy to me. And inbound links from other sites, it would seem, should be "Acme Widget Model 123" since that is what the page is about. Is it best to have less uniformity and more randomness?
Are the following examples random enough:
Widget Model 121
Acme Widget 121
Widget Model 121
Acme 121 Widget
And I guess I don't understand how a link from an unrelated site could be beneficial. Didn't Google just roll out anti-Googlebombing to put a stop to this? So if you have a totally unrelated site about hair rollers, and it links to your Widget site, assuming a Widget has nothing to do with hair rollers, how could that be helpful?
Another question about local rank: In a competitive industry, would it not be unusual to have competitors linking to each other? So if you take the first 1,000 results on a search term, let's say in a very competitive industry, if those pages are all competing companies, it would be unusual to find a lot of linking to competition going on. After all, if you're trying to sell insurance, you don't want to refer visitors to another insurance site. So, how would local rank be applied in this case? It would seem very few peer sites would be linking to each other.
[edited by: AndyA at 3:39 pm (utc) on Feb. 5, 2007]
|even with a threshold of some kind that says "if you're this far off the norm or further, we'll throw out your initial rank altogether." |
That's an interesting thought. You tend to think about Google and ranking as a sort of continuum: the most relevant first, then each succeeding site ordered by how relevant the algorithm thinks it is to the query (or an attempt at that, anyway).
Here you may have a situation where, if your score adds up in a certain way, there's a cut-off and you get discounted to the very end; it doesn't matter how close you came to the cut-off, the result is the same: end of the line.
The question is why? What's the point of doing it that way? We are all seeing sites being sent to the very end that would in fact be more relevant, and of greater benefit to the searcher, than a minimum of 7 or 8 hundred other sites. I'm appreciative of how Google goes about their business, and certainly over the long haul the results have been impressive, but, at least to me, this approach has caused a real decline in the quality of the rankings.
Perhaps there's more to come, but it's beginning to look like this is a permanent part of the ranking process.
Randle, what you are saying really fits Marcia's idea that the new Google patent that uses phrases to discern spam might well be involved in this.
I know I've found one phrase that is in several though not all of my missing pages.
If it is indeed this I think there is a thin line between whether the page is penalized or not. Looking over my pages it seems to be more damaging when the phrase is in the page title, H1 title and/or internal anchor link. In fact one page that is doing well has the phrase more times in it than any of the penalized pages but the phrase is not in any of those places I listed above.
Based on the old Magellan website's drop to the Google 950, for many of its terms including old terms that it held for years...
My rationale is that there may be sites that rank well for many terms including terms that are not accurate.
Since Google's algorithm is far from perfect, they are most likely inclined to hide a site that should not rank for even one single inaccurate term, so they penalize the site for multiple to all terms and drop it to the bottom of the results.
That's one theory.
If the sites here all ranked for extremely competitive terms, and a few that it made no sense to rank for, then this is one possibility.
I had someone I trust take a look at my site that's been hit by the 950 Penalty. Other than the things I already knew about, they didn't have a lot to offer. They said I needed to get rid of the extraneous HTML code on many of the pages that dates back to the days when the site was originally built in 2000 with a WYSIWYG editor.
The site has a lot of unnecessary font tags, and I have been able to reduce the size of the HTML on some pages from 41kb to 12-13kb (or less) just by cleaning up the HTML, which gives you an idea of how bad some of the pages are. I'm going to go back and retrofit css on these pages as well, which should reduce the sizes even further, but they do not feel that alone is causing the issue, as other code heavy sites still rank well.
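For anyone doing the same kind of cleanup, a simple regex pass is usually enough to strip leftover font tags while keeping the text they wrap. This is a quick sketch, not a full HTML parser, so it assumes reasonably well-formed tags like the ones a WYSIWYG editor emits:

```python
import re

def strip_font_tags(html):
    """Remove <font ...> and </font> tags left behind by an old
    WYSIWYG editor, keeping the text they wrap."""
    return re.sub(r"</?font\b[^>]*>", "", html, flags=re.IGNORECASE)

page = '<p><font face="Arial" size="2">Acme Widget Model 123</font></p>'
print(strip_font_tags(page))  # → <p>Acme Widget Model 123</p>
```

Run it over a backup copy first and diff the output; a one-off regex like this can misfire on unusual markup.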
They advised me to add a paragraph or two to my home page telling what the site is about. I have bullet points there now, but they feel a bit more spider food on this important page will be helpful. I think someone else mentioned this in an earlier post here as well. So, I plan on doing that.
They suggested reducing the number of keywords and keyphrases, even though the pages that have been hit read well and don't seem to use those words too often. They said a human reviewing it would likely not think I had too many uses of the words, but to an algorithm it could raise some flags.
About 200 pages were found with no meta descriptions, although they did have unique page titles. I knew about those, and am adding the descriptions as fast as possible.
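Finding those pages by hand is tedious; if you have a local copy of the site, a short script can list every HTML file with no meta description. This is a sketch using only the standard library; the function names are my own:

```python
import os
from html.parser import HTMLParser

class MetaDescriptionCheck(HTMLParser):
    """Flags whether a page contains a non-empty
    <meta name="description" content="..."> tag."""
    def __init__(self):
        super().__init__()
        self.has_description = False

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            d = dict(attrs)
            if d.get("name", "").lower() == "description" and d.get("content"):
                self.has_description = True

    def handle_startendtag(self, tag, attrs):
        # Treat self-closing <meta ... /> the same as <meta ...>.
        self.handle_starttag(tag, attrs)

def missing_descriptions(root):
    """Walk a local copy of the site and return the paths of
    HTML files that have no meta description."""
    missing = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith((".html", ".htm")):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                checker = MetaDescriptionCheck()
                checker.feed(f.read())
            if not checker.has_description:
                missing.append(path)
    return missing
```

Pointing `missing_descriptions()` at the site root gives you a worklist instead of hunting page by page.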
Overall, they said since my site does have some #1 SERPs in Google, and some of them for fairly competitive terms, and since it is doing well in MSN, Yahoo, Ask, etc., that tells them it's close to being what it needs to be and the site as a whole isn't being heavily penalized. They fell short in offering anything specific as to why exactly some of the important pages are at the bottom of the SERPs, other than to advise me to continue cleaning up the site and slowly adding new content to it.
We discussed the Matt Cutts post on his blog where he indicated, without actually saying so in so many words, that the bottom-of-the-results thing was due to multiple minor issues on a site which impact its overall value to searchers: pages that aren't finished and don't provide the information they should, lack of unique content, and offering nothing special, just a repeat of info that lots of other sites already offer.
So, without anything new and specific to check, I'm going to just keep doing what I've been doing.
I get the feeling that with the new links tab in Webmaster Tools, Google has diverted attention somewhat from some of the issues (like this one) with its SERPs. It's kind of like, "Look! Over there! A bright shiny thing. See?"
At any rate, I would sure appreciate any additional information or ideas on the 950 Penalty, as it appears to me some sites are getting out of it without making any changes whatsoever, and I am still seeing new sites added every day, as well as sites like mine that appear to just have pages stuck there. This is apparently something we're going to have to live with going forward, unless there is a problem with Google's algo and it's taking them a while to roll out a fix.
If I use a search term, "Model 123 Acme Widget", that page comes up at the bottom of the results in the SERPs. But if I add an additional word to it, i.e., "Model 123 Acme Widget Limited" the page comes up at #1 with an indented #2 going to a related page on my site, and a "More results from..." link underneath.
So, if this is indeed a 950 Penalty, why in the heck would adding one more word to the search remove the penalty completely?
I know there must be something to learn from this, but I sure don't get it.
It would suggest that the page is being penalised with respect to specific phrases only, not a general penalty. That would imply that the problem is specific to the way the page uses the phrase. This could be either the page content or backlinks.
I see the same thing for my pages: "widget" is penalised, but the page lists number-one for "widget reviews".
Here's my update. Last night I found a whole subdirectory on my site is gone. It's a section that has never done well, so it didn't affect my overall visitors or earnings. But now I'm wondering if it has somehow poisoned the very popular pages on my site that are now missing. I've found a link to the famous war I mentioned earlier and to a country's widgets. All I can think of is that there is a commonality here to phrases heavily used by scrapers.
I must confess I did find two pages that had identical meta descriptions to another page in that section. But I can't understand how this would have affected so much of my site.
|why in the heck would adding one more word to the search remove the penalty |
We've been talking about penalized pages, but I think it's more a matter of penalized phrases. Or that certain phrases or combinations of phrases affect it somehow.
With some of my missing pages I've been able to get them to come up for other search phrases related to the article, which puts the page back in the top 10. Phrases other than the page's main topic. I was hoping that would help me figure out what I've done wrong on the missing pages, but I can't make any sense out of it.
We've been talking about penalized pages, but I think it's more a matter of penalized phrases
That's exactly what I'm seeing. A combination of having half of my site removed from the index and 90% of the rest dumped at the end of the SERPs.
The bottom line is just about every term I *ranked* for is now end-of-results, even if there was no link anchor or optimization for those words. It's like Google took a set of related words that it probably penalized me for and dumped me for the whole lot of them.
I mentioned this in the other thread, but no one has said anything about lack of Googlebot activity. I'm seeing it barely crawling my site now, with just a few visits/pages accessed every day... and declining.
If a site has a couple thousand pages and only a couple dozen outgoing links, are you saying that increasing the number of outgoing links might actually improve the rank in Google?
I never thought that too high of a percentage of internal links could hurt a website's ranking, but maybe it's possible.
Lots of factors involved here, macman23, but essentially yes - in my experience outbound links to good on-topic sites can actually help ranking.
I wanted to point out that the discussion in our February 2007 Google SERP Changes [webmasterworld.com] thread now has several reports of what sounds like this "950 penalty", but the drop in ranking is less severe - only 120, or 300, or 400, etc. Everything else sounds the same - keyword related, previously top ranked url, etc.
I think this reinforces the idea of a "last step re-calculation of ranking" being involved. That theory didn't make as much sense if every recalculated ranking was thrown to the end of results.
One thing I just discovered last night on my site. I had uploaded a classifieds script a looong time ago (I don't even remember when), but I installed it, didn't like it, and just left it there.
No links to it, nothing.
Well, I just found that some scraper sites had linked to it, and there were actually some 'dubious' posts. Not many, like 5 at most, but still.
I deleted it now, but perhaps it's one more of those 'minor' things that Cutts was talking about.
I've gone through my site looking for other junk things that were uploaded and never used, and removed a couple.
Something for others to consider: check any old scripts / uploads! They may be indexed.
I know I posted here although some results I spoke of were buried deep in the rankings, not really at -950. It just seemed the most relevant thread. Probably others did the same...
My site is actually (as of an hour ago) coming out of the 950 issue.
Is anyone seeing the same results?
Is there a list anywhere of spammy phrases? I don't see anything spammy on the pages that have been hit, I even copied the text of two of them into an E-mail and sent it to myself with filters set on high meltdown, and nothing tripped them.
I know words like "free," "as seen on TV," etc., can be a problem but I have nothing like that on my site.
I have been going over it pretty thoroughly cleaning up very minor things, but I haven't found anything even remotely serious so far.
< discussion continues here: [webmasterworld.com...] >
[edited by: tedster at 9:09 am (utc) on Feb. 8, 2007]