| This 176 message thread spans 6 pages: < < 176 ( 1 2 3 4  6 ) > > || |
|Google's 950 Penalty - Part 6|
< continued from [webmasterworld.com...] >
< related threads: -950 Quick Summary [webmasterworld.com] -- -950 Part One [webmasterworld.com] >
I've been so quiet because MHes has said most of the things I would have said anyway.
Mind you, part of this is theory. I see instances, experience certain patterns and behaviours, then analyze what's in front of me. And the end result is what I call SEO. For the rest of the day at least.
Some points to remember...
As Martin mentioned, the anchor text of links from trusted/locally trusted sites is what decides 98% of what's in the SERPs. Title and body text are criteria for being relevant/filtered, and are thus binary factors. If they are present and match the incoming anchor, or even the theme of the anchor, the page will rank. Meta is optional.
Title and the content text have two characteristics that are connected to this problem.
One being that every single word and monitored phrase gets a score. 7 word phrases are not monitored. Monitoring is probably decided based on search volume and advertiser competition, ie. MONEY. So there's no infinite number of them.
Second is, should the page gather enough votes from inbounds / trust or localrank through its navigation for any single word/watched phrase, it passes a threshold that will decide the broad relevance of the page. The page could be relevant for more than one theme. It could be relevant for "Blue Cheese" and "Blue Widgets" if it gets inbounds for both themes. ( Note I'm oversimplifying things, relevance is calculated long before that. ) If it's relevant for "Cheese" Google knows it's probably about "food".
The theme of the page now will make it rank better for certain queries. These aren't necessarily semantically related. A site that ranks #1 for "Blue Cheese" may rank relatively better for "Azure Cheese" than before, even though this phrase is nowhere in the anchors or titles, and only appears in parts of the content.
If you cross a certain line of on-page factors, another theme might be evident to you, based on the title/content. But if the page does not have any support for that theme in the incoming anchor text, this may be viewed as trying to game the system if Google doesn't understand the relation. "Blue Cheese" IS relevant to "Kitchen Equipment" to some degree. Google might not know this.
Another, blunt example is mixing up "thematic relevancy" with "semantic relevancy", when your "Blue Cheese" page starts to have an excessive number of instances of blue things, like "Blue Widgets", "Blue Hotels". Google will think that this is because you have noticed you can rank well for Blue, and tried to add a couple of money terms that are semantically relevant. But what AdWords, Overture or Trends, or in fact Google Search does not show... is that the algo now knows these things are not related.
Question is... to what degree is this filter programmed.
1. If you have N number of kinds of phrases on a page that are only semantically relevant ( ie. as "blue cheese" is relevant to "blue widget" ), and you don't have support for both, your site gets busted. If popular phrases, that you know to be thematically relevant to your page, aren't in the Google database as so, you're busted. Based on the previously mentioned problem, if you have a website that's relevant for modeling, and add internal links with names of wars all over, Google may not find the connection.
2. If you do a search on AdWords for "Blue", you'll get a mostly semantically relevant list of keyphrases that include/are synonyms/include synonyms/related to "blue". A human can identify the "sets" within these phrases and subdivide the list into themes. Spam does not do this, or so Google engineers thought.
3. So there are subsets in the hands of Google that further specify which word is related to which. These are themes. You'll see sites rank for synonyms within these sets if they're strong enough on a theme, even without anchor text strengthening the relevance. A site that's #1 for "Blue" might rank #9 for "Azure" without even trying too hard.
4. If you have a site about "Cheese", you can have "Blue Cheese" and even "Blue Cheddar" in the navigation, titles, text, for they are included in the same subset. You can't have "Blue Widgets" on the "Blue Cheese" page.
5. What constitutes these sets? Who decides on themes and based on what? What is the N number of "mistakes", how well determined are these?
But then, so are the SERPs right now. I've seen at least 4 different kinds of ranking in the past 3 days.
So far I've only seen instances of filtered pages when 5 to 6 themes collided all at once. Quite easy to do by chance if you have a completely legit "partners" or "portfolio" page with descriptions, and/or outbound text links. But only a single theme that's supported with the navigation/inbounds, and only if there is a decided theme for the page. If there's no theme ( navigation and - lack of - inbounds doesn't strengthen either ) I'd say Google passes on penalizing.
As for the themes, I was thinking perhaps Google went back to the good old directory age, and started from there. Remember how you started with the broad relevancy, then narrowed it down to a theme, then an even closer match? With cross references where applicable.
This isn't new. Penalties that are based on it are.
If there is such a penalty, it is along these lines.
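To make the theory above a bit more concrete, here is a minimal Python sketch of the "unsupported theme" idea. Everything in it is an assumption made for illustration: the contents of `THEME_SETS`, the function names, and the `max_unsupported` limit (standing in for the unknown "N" in the points above) are hypothetical, and nothing here is known to reflect how Google actually groups phrases.

```python
# Hypothetical theme sets; the real groupings, if they exist, are unknown.
THEME_SETS = {
    "cheese": {"cheese", "blue cheese", "blue cheddar"},
    "widgets": {"blue widgets"},
    "hotels": {"blue hotels"},
}


def themes_for(phrases):
    """Return the set of themes touched by the given phrases."""
    found = set()
    for phrase in phrases:
        normalized = phrase.lower()
        for theme, members in THEME_SETS.items():
            if normalized in members:
                found.add(theme)
    return found


def unsupported_themes(page_phrases, anchor_phrases):
    """Themes present on the page but not backed by any inbound anchor text."""
    return themes_for(page_phrases) - themes_for(anchor_phrases)


def looks_filtered(page_phrases, anchor_phrases, max_unsupported=1):
    """Flag the page when it carries more unsupported themes than the assumed limit."""
    return len(unsupported_themes(page_phrases, anchor_phrases)) > max_unsupported
```

With these toy sets, a page carrying "Blue Cheese", "Blue Widgets" and "Blue Hotels" whose inbound anchors only say "blue cheese" has two unsupported themes and trips the assumed limit of 1. A page with only "Blue Cheese" and "Blue Cheddar" does not.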
[edited by: tedster at 9:16 pm (utc) on Feb. 27, 2008]
|Phrase-based filtering and re-ranking. |
Which filter is a little overboard, but until it's fixed, we have to deal with it.
I think that pretty much sums it up.
|people who have seen these tactics bring out sites from the mud could chime in |
It takes a lot of trial and error. For me new links have helped at times, reducing repetition in anchor text has helped in some cases, removing words or phrases that I suspect might have caused the problem seems to have also helped.
Some pages I've just given up on as they were never high traffic anyway. And a couple I'd really like to get back but can't come up with a solution.
I've never lost a whole site but have lost topic sections, and mostly I've lost scattered pages that seem unrelated.
When you go through the process of finding your 950ed pages using the repeat with omitted results, notice what else is clustered with the page. Usually, if not always, the other pages in the cluster are also 950ed, so I guess they aren't really unrelated.
BTW is anyone else finding any sort of review seems to be 950ed? I don't do many but most all my review pages are gone.
My nuked pages are also reviews. That makes sense because they are generally more lengthy and contain more phrases.
|all my review pages are gone / nuked pages are also reviews |
Were they reviews written as an article by a reviewer, or "user reviews" where several users post comments?
Hand written reviews (articles) of products. Not user reviews.
[edited by: Nick0r at 3:52 pm (utc) on April 4, 2007]
|I would have to say, most sites that have been hit, more than likely breached their trust rank |
Well I would say most sites that got hit more than likely didn’t do anything wrong. Plenty of sites, including some we had get hit, changed nothing and came back to where they were. Every day you see sites ranking in the 900’s, that just don’t belong there by any stretch of the imagination.
|This is one of the strangest effects Google has generated to date. |
It sure is and eventually you get to a point where the tail starts wagging the dog. If this was happening on MSN no one would be making all sorts of drastic changes to sites that have absolutely no business ranking in the 900’s. They would put the blame where it belongs, on the search engine. Right now Google’s acting like the prettiest girl in the school and everyone’s running around making fools of themselves trying to get a date with her.
If your site’s been hit by this thing don’t suddenly get “there’s something terribly wrong with the site I’ve spent 8 years building” syndrome. Is it possible there is in fact nothing “wrong” with your site? Is it possible that what’s wrong is that the Google algorithm placed a perfectly good site in position 950, thus shortchanging all the people who were searching and would have benefited from your site (and not making available the income you may have derived from those visitors)?
There has been an absolutely bizarre laundry list of ideas put forth on what’s causing this. Google is a tremendous search engine, best there is for sure, but it is not infallible. It is in fact prone to many shortcomings, mistakes and downright weird looking results. Whose fault is that?
|I would like to understand it better is that I think it has implications for ranking well in today's algo, in addition to not being "penalized". |
Excellent point, there is a broader picture here, fundamental changes are afoot. But seeing good sites sitting in position 984 is not the end result of whatever it is.
The best approach we have seen, if you do in fact have a decent site, is cutting down on key word rich text links on your home page, continue to build (not recklessly pulling it apart), working hard on your paid advertising campaigns, and finding a good dose of patience.
An excellent post, randle. I agree that fundamental changes are afoot, and the current mosh pit of results will, we certainly hope, not be the final situation. It certainly has gone on a long time, however.
I also agree that trustrank does not appear to be the issue in the "end of results" situations that I've looked at.
However, it does appear to be involved in the "minus thirty" phenomenon, where every url from the domain gets depressed about three pages for every search, including a search on just example.com. It's important not to confuse these two phenomena in our discussions.
|Were they reviews written as an article by a reviewer |
Mine were reviews that I wrote over a few years including new ones. Book publishers keep wanting to send me stuff and I figure I get free books right on my topic. But I might start turning them down. No use taking time to write articles that few people will ever see.
Has this subject been discussed/noticed elsewhere by the non-webmaster community (digg, etc)? It would be nice to say to the world: hey, if you can't find what you're looking for in Google, try the last page of the results. Maybe that way Google would acknowledge something.
I've had this problem (-950 penalty or whatever) since the start of one of my sites 1.5 months ago.
The site is hosted on Blogger, non-English, theme based, very complete and thorough, fresh content 2 days a week, no spam, no e-commerce, no link selling/buying, no reciprocal linking, no SEO whatsoever, just some natural keyword density.
Backlinks are natural, not forced or agreed upon.
Due to the nature of the Blogger platform there's a permanent link to one simple secondary site with some relevant files for download - no duplicated content.
I've established a pattern: 4/5 days front row, 2/3 days back row.
Plenty of information - visitors get what they want, lots of good visitor appraisal, yet I can't deliver that information if the site isn't shown to whoever searches for it. Curious is the fact that when the site is in the back row, analysing the search engine keywords for those days, I see numerous "blogspot + title of the blog" searches... My visitors know the site exists, don't bookmark it and expect to find it quickly in Google.
I'm tempted to put a permanent notice warning visitors about this Google behaviour.
I haven't been changing anything at all. Just blogging. As this isn't my main source of income (it has some adsense but with low return), I can do that and don't freak out.
Yet, as I've read here before, it's rather frustrating writing useful stuff when I don't know if it's going to be read...
Maybe I should start writing on a paper diary, that way I'm sure it won't be read. :)
It's obviously a penalty of some sort, and it's one of the most severe.
Look at the pages, look at the links and you will see webmaster guideline violations or pages that probably have very high bounce rates. Either way, Google does not like the pages and the content.
My advice, and the advice of others who have been in this penalty: redesign your pages, put some fresh and UNIQUE content on them and they will come out.
I have never criticized anyone on a message board before and I've been on line since 1994. But I can't take it any more.
Your messages must make you feel so superior to everyone in this thread. That is the only reason I can imagine you continue to come in here and message since you are obviously too perfect for such a thing to ever happen to you.
We are sincerely trying to solve a problem that has hit many sites that have done nothing wrong. Give us a break.
|It's obviously a penalty of some sort, and it's one of the most severe. |
I can't agree that it's *always* a penalty of some sort. It may be in some cases, but there are other reasons that pages can get filtered - NOT just because of SPAM.
|Look at the pages, look at the links and you will see webmaster guideline violations or pages that probably have very high bounce rates. Either way, Google does not like the pages and the content. |
It is most unfortunate that some people get sanctimoniously self-righteous when certain phenomena don't happen to hit their site. That does NOT mean their sites are not spammy - and it surely does NOT mean that all sites that have pages filtered down into the 900's ARE spam and violating guidelines.
NO WAY, not by any means. There are plenty of sites with pages in the 900's that have top-notch rankings on other parts of the site, including the homepage, but yet have clusters of pages in the 900's, 400's, 600's.
Notice the word "clusters." Think about it. Meditate on it. Study it. Also think about it before passing judgment on all 900'd sites as spam. Because they are not!
<<I also agree that trustrank does not appear to be the issue in the "end of results" situations that I've looked at>>
When this penalty hit, "lost trust" was the first thing I thought of. It seemed like the obvious answer. Google used to love you, now they don't. Simple as that. However...
These are the "Query Stats" from Google Webmaster Tools for [cough] a friend's main site...
(Note: "mycityname" is one of the largest tourist destinations in the world.)
1. mycityname hotels 1
2. mycity-name hotels 1
3. hotels in mycityname 1
4. mycityname clubs 1
5. hotels in mycityname mycity-name 1
6. mycityname mycity-name hotels 1
7. mycityname 5
8. hotels mycityname 1
9. hotels on mycityname 1
10. mycityname mycity-name hotels 1
11. hotels mycity-name 1
12. restaurant mycityname 1
13. mycityname shopping 1
14. hotels mycityname mycity-name 1
15. mycityname clubs 1
16. mycityname hotels mycity-name 1
17. mycityname2 hotels 1
18. hotels on mycity-name mycityname 1
19. mycityname searchterm 1
20. mycityname hotels mycity-name 1
Folks, that cannot POSSIBLY be the results for a site that has "lost trust." Yet 80% of the previously #1 ranking "mycityname long-tail" searches for this site are at 950.
Is it reasonable for Google to say "We trust you enough to put you at #1 for 10-to-20 of the most competitively spammed-out search phrases on the net, BUT we don't trust you for long-tails like a [mycityname parking on Main street] search"?
I wonder if this "penalty" is a situation where things ARE exactly what they LOOK LIKE? In other words, could this be some sort of "topical trustrank" where Google switched from a "trusted site" concept (which we must admit has been exploited) to where a site is now trusted only for certain very narrow topics and must prove itself for all others?
That certainly would explain occurrences where a new external link pulled a page out of the fire.
However, where this theory breaks down is when pages come back on their own with no changes having been made to them.
That fact alone is going to make this problem a tough one to solve, but I'm still not ready to write the whole thing off as "Google sucks," yet.
|I wonder if this "penalty" is a situation where things ARE exactly what they LOOK LIKE? In other words, could this be some sort of "topical trustrank" where Google switched from a "trusted site" concept (which we must admit has been exploited) to where a site is now trusted only for certain very narrow topics and must prove itself for all others? |
Let's forget this theory very, very fast, so we can re-invent it within the next 30 posts. Again.
|However, where this theory breaks down is when pages come back on their own with no changes having been made to them. |
But why do we need to think that it's only the webmasters who can make a move to resolve this situation?
If results came back "without the webmaster doing anything", that's not the little green men invading the datacenters, it's Google tweaking the filter, or better yet... fine-tuning their phrase-sets for a given topic so their cute little AI wouldn't filter out such a huge percentage of completely legit sites.
I think you are right with everything, except in doubting that you're right.
|it's Google tweaking the filter, or better yet... fine tuning their phrase-sets for a given topic |
I'm sure that's it. I think they really are doing this to reduce spam sites and not to torture us or drive us to AdWords.
In the past when something was off they could fix it fairly quickly. But this time it is far more complicated.
But I'm not willing to sit around and wait for pages to come back that I've spent weeks on between research, writing and setting up the article.
"Is it reasonable for Google to say "We trust you enough to put you at #1 for 10-to-20 of the most competitively spammed-out search phrases on the net, BUT we don't trust you for long-tails like a [mycityname parking on Main street] search? I wonder if this "penalty" is a situation where things ARE exactly what they LOOK LIKE? In other words, could this be some sort of "topical trustrank" where Google switched from a "trusted site" concept (which we must admit has been exploited) to where a site is now trusted only for certain very narrow topics and must prove itself for all others?"
If Google looked at sites to determine whether they are a hub or an authority, your friend's site would seem like it had been determined to be a hub. Maybe long-tail search terms are expected from authorities. Not saying yours isn't also an authority, but maybe the changes in the results are Google's attempt to discern which of the two a site is.
When I search on long-tail phrases, I'm at the point where I'm sick of searching in a general way, trying to really pin down and get to where I'm going, not keep getting pages that link to other pages. When I'm doing general research on a topic, however, I want to get a page of comprehensive, hand-picked links about it. Are sites typically both things to users? Or are they usually better at being one or the other?
"it's Google tweaking the filter, or better yet... fine tuning their phrase-sets for a given topic"
If only there was evidence of that. That's just hyper-optimistic thinking.
As Matt pointed out before, the data refresh used to take place monthly. It was reasonable to think then that each time they would have a whole different set of parameters that would inflict the penalty, and a whole different batch of ways for it to be mistakenly applied. With the data refreshes now maybe five times a week, it's extremely unlikely to the point of impossible that they tweak the penalty parameters daily. What would be the point of that? There would not even be enough time to see the effects of the application of the new parameters.
Each daily data refresh is far more likely to only be capable of correcting or making new errors, in combination with all the hummingbird effects of the algo (webmasters tweaking the affected page, tweaking unaffected pages, increase or decrease of scraper links to that page, crawl of an entire data set - phrase-based if you want to look at it that way - etc.).
Pages can get unpenalized by Google changing the basic parameters of the penalty(s), but that should usually not be the reason pages change from penalized to not penalized.
The phrase based filters and the -950 penalty are "runtime".
They're applied directly to the SERPs.
Meaning they could adjust them every other second, down to any bit of detail, using one setting on one datacenter, and a different one on the other. Or they can test different settings with regional Google SERPs. As they do with the TrustRank thresholds. Now, I'm not saying they DO all the time, but that's pretty much the only reasonable choice when seeing legit sites disappear by the bulk.
The system doesn't need anything else than the original rankings, cached pages, and linking information for relevancy. How fresh the site in the database is, would be a fair question. But we're talking about a lot of pages that were OK up until now. Not in and out, not borderline SEO, no gray tactics.
Not every site is like that, but their status is a fair indication of Google tweaking with the NEW filters, and not the old parameters. If they touched those, ALL sites would go up and down. ( And since it's still a fairly low share - below 33% - of the pages on the roller coaster, that's just not it. )
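As a rough illustration of what "runtime" could mean here, the following Python sketch demotes pages to the end of an already computed result list, using a different setting per datacenter. The datacenter names, the threshold values and the `theme_conflicts` counts are all made up for the example; this is a guess at the shape of such a step, not a description of Google's system.

```python
# Hypothetical per-datacenter limits; names and values are illustrative only.
DATACENTER_THRESHOLDS = {"dc-a": 2, "dc-b": 4}


def apply_runtime_demotion(ranked_urls, theme_conflicts, datacenter):
    """Move flagged pages to the end of the results without re-scoring anything.

    ranked_urls:     URLs in their original ranking order.
    theme_conflicts: assumed per-URL count of unsupported themes.
    datacenter:      key into DATACENTER_THRESHOLDS.
    """
    limit = DATACENTER_THRESHOLDS[datacenter]
    kept = [u for u in ranked_urls if theme_conflicts.get(u, 0) < limit]
    demoted = [u for u in ranked_urls if theme_conflicts.get(u, 0) >= limit]
    # Demoted pages stay in the results, just at the tail.
    return kept + demoted
```

Because the original ranking is left alone, changing a value in `DATACENTER_THRESHOLDS` is enough to make a page drop on one datacenter and stay put on another, which would be consistent with pages coming back without the webmaster touching them.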
|If only there was evidence of that. That's just hyper-optimistic thinking. |
It's not optimistic, just plain thinking.
I've never said that we should sit back and wait.
On the contrary.
In fact, the changing of keywords to more relevant ones, fixing inbound and internal navigation ( and anchor text ) are effectively doing GOOD for the site in terms of raising its scores for relevancy. Not only will you keep your pages out of trouble, you might even see them go up a bit. While discussing this issue I've talked about many methods I consider tricks of the trade, which would benefit anyone who applies them, regardless of filters and penalties.
And evidence in SEO...?
The SERPs, recognizing patterns, and some intuition. So far I'm getting by with these.
And the occasional patents for reading.
[edited by: Miamacs at 12:55 am (utc) on April 6, 2007]
"Not every site is like that, but their status is a fair indication of Google tweaking with the NEW filters, and not the old parameters."
First, there is no filter. Talking like that confuses the issue. Pages are not removed; they are penalized to a specific place in the results.
It's not plain thinking to be extremely optimistic, particularly where every bit of evidence points to the contrary. Google is not in control of this penalty.
Every time they run the ballots through the exact same machine, some more chads fall off, and the votes get counted differently.
Please explain "chads".
Sorry, I didn't realize they had it random and out of control.
That explains everything.
Great, Miamacs, so "Chads", like "Muggles" in Harry Potter, are we people that don't understand your "magic". Am I a muggle/chad? I would love to know where I would be placed. You ignored my post and continued another conversation; is that because you consider me a "Chad"? It's all a bit sad and pathetic.....
"That explains everything."
Actually it does. As long as you resist that, you just keep running in circles.
It's been clear from the beginning that there is an enormous amount of collateral damage that is not an intended effect. Those people who insist Google is perfect and always gets the results that they want should wake up and smell the SERPS.
For "chads", search for Florida election 2000 and hanging chads.
I wouldn't say it's random, they have a purpose and a strategy. But it's not really under control either. In fact as the net gets bigger and the algos more complicated I'm not sure it will ever really be under control.
I do think they will try to sort out this problem but there will always be new glitches like this. There will never be that secure feeling again that a site that is doing nothing wrong will always be left untouched.
My main home page just showed back up again today after being down at 950+ for the past 12 days or so.
My observation and my intuition is telling me that for me at least, this is about links. In the past few months, I've lost a couple of good quality links due to other site owners making changes to their websites. I haven't gone about getting any new ones lately, and just before this 950 stuff started, my main home page dropped from position #5 to position #15. And that was about the time I started losing some links. So I think that this one domain of mine is on the edge of some filter or algorithm trigger whereby losing some good external links is changing my ratio of internal to external link reputation. And I think that is why I am fluctuating now between position #15 and #965.
I got hit hard on Dec 16th or 17th, and since then have basically done nothing to the site. Now it is back to ranking well for multiple keywords; while I was in the 950 penalty, it ranked for far fewer keywords. Looks like the December filter knob has been adjusted.
I have an amateur site about a small village. With a few incoming links it reached top of Google SERPs for the name of the village. 2/3 weeks ago I noticed that it has been hit by 950 penalty/filter or whatever. I removed the 3 external links from its home page and it went back to the top after 2/3 days. Strange some of my other sites have no external links on the home page and are still hit by the 950 tragedy :(
I think this "moved to the last page filter" thing also has a little to do with what category you are in. If you are in a category with a lot of competition, you are likely to get into the filter with the whole site, or just internal pages if they are theme based.
So I think they filter you if your site is maybe only 3 years old and there are already a lot of sites in that category, and you only get to the top if you have some very important links to you. It almost doesn't matter what you have on your page (title, body, h1 or whatever); as long as your text belongs in that category, you will get moved to the last page.
I have 2 sites in that situation. One is 3 years old; its internal pages NEVER ranked, but the competition is huge. The front page ranks for everything on page 1 or 2 and it has a PR7.
Then I have another; it's maybe 5-6 years old and never had trouble ranking for everything on page 1 in that category. But then it got hit by the Google 302 problem (302 links to my pages, also hijackers) and did not rank for more than 2 years. Now a little is happening, but only with the front page; all the internal pages sit side by side with the above-mentioned internal pages, which means both sites are now hit with this filter.
the "just too much" filter
Nick0r- Is the phrase that's giving you problems a city name, geographic area name or something along those lines?
My site has been online since '99 and it got hit. So I don't think you can generalize your experience about the age of your site Zeus...
Age of the site doesn't seem to offer protection from this "end of results" phenomenon. I've seen it affect an 11 year old domain.
I want to note this synchronicity -- Marcia mentioned earlier that she suspects clustering algorithms and semantic processing of various kinds. She even offered a paper about clustering and ambiguous word sense.
Then trinorthlighting posted about a problem search phrase of his that included the word "neon", which can be both a car and a gas. Do any other of the problem searches people are seeing include a word with a potentially ambiguous meaning? I don't want to get into listing of many keywords here, but a simple yes or no would be helpful.
In saying yes or no, it's a good idea to check your keywords in a dictionary. Simple words often have very different potential meanings -- for example, geographical words like "Danish" can pick up meanings related to pastry.
[edited by: tedster at 6:00 pm (utc) on April 11, 2007]
I've seen it hit quite a few in the industries I...ahem...track. On some of the two word phrases, it has been mixed.
1. Yes on a few (notably 'payday' related...stupid candy bars; I mean, they don't even have any chocolate!)
2. No on most
The clustering very well could be a piece that is involved in the re-ranking if they are trying to balance serps a bit more, but on those affected, my gut feeling is that it seems to be happening due to re-ranking fractional multipliers being incurred more due to co-occurrence of keywords and/or [not enough localset inlinks / authority-to-nonauthority inlinks].
Tell you what, as a joke I'm going to include some co-occurrence keywords to an alternative meaning to see if it affects one of the affected sites (without otherwise modifying the inlinks / co-occurrence weighting of existing on-theme phrases).
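For anyone who wants to picture the "fractional multiplier" idea, here is a small Python sketch of a re-ranking pass. The tuple layout, the `penalty` value of 0.5 and the `threshold` of 2 off-theme co-occurrences are assumptions picked for the example, not known values from any search engine or patent.

```python
def rerank(results, penalty=0.5, threshold=2):
    """Re-rank results by applying a fractional multiplier to flagged pages.

    results: list of (url, base_score, off_theme_cooccurrences) tuples.
    A page whose off-theme co-occurrence count reaches the assumed threshold
    has its score multiplied by the assumed penalty; others are untouched.
    """
    rescored = []
    for url, base_score, off_theme in results:
        multiplier = penalty if off_theme >= threshold else 1.0
        rescored.append((url, base_score * multiplier))
    # Highest adjusted score first.
    return sorted(rescored, key=lambda item: item[1], reverse=True)
```

In this toy version a page with a base score of 10 and three off-theme co-occurrences ends up at 5 and falls behind pages that originally scored 8 and 6. Adding on-theme inlinks is not modeled here; it would simply be another input that keeps the multiplier at 1.0.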