I've been so quiet because MHes has said most of the things I would have said anyway.
Mind you, part of this is theory. I see instances, experience certain patterns and behaviours, then analyze what's in front of me. And the end result is what I call SEO. For the rest of the day at least.
Some points to remember...
As Martin mentioned, the anchor text of links from trusted/locally trusted sites is what decides 98% of what's in the SERPs. Title and body text are criteria for being relevant or filtered, and are thus binary factors. If they are present and match the incoming anchor text, or even the theme of the anchor, the page will rank. Meta is optional.
Title and the content text have two characteristics that are connected to this problem.
One is that every single word and every monitored phrase gets a score. 7-word phrases are not monitored. Monitoring is probably decided based on search volume and advertiser competition, i.e. MONEY. So there isn't an infinite number of them.
The second is that, should the page gather enough votes from inbounds / trust or localrank through its navigation for any single word or watched phrase, it passes a threshold that decides the broad relevance of the page. The page could be relevant for more than one theme. It could be relevant for "Blue Cheese" and "Blue Widgets" if it gets inbounds for both themes. ( Note I'm oversimplifying things, relevance is calculated long before that. ) If it's relevant for "Cheese", Google knows it's probably about "food".
The theme of the page will now make it rank better for certain queries. These aren't necessarily semantically related. A site that ranks #1 for "Blue Cheese" may rank relatively better for "Azure Cheese" than before, even though this phrase is nowhere in the anchors or titles, and only appears in parts of the content.
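To make the threshold idea a bit more concrete, here's a minimal sketch in Python. Everything in it is an assumption made up for illustration: the function name, the cutoff of 3 votes, and the idea that a "vote" is simply one inbound anchor. Nobody outside Google knows the real numbers or mechanics, so treat it as a toy model of the theory above, not as how Google does it.

```python
from collections import Counter

# Hypothetical cutoff: how many inbound anchor "votes" a phrase needs
# before the page counts as broadly relevant for it.
VOTE_THRESHOLD = 3


def themes_passing_threshold(inbound_anchors, threshold=VOTE_THRESHOLD):
    """Return the phrases whose inbound anchor count reaches the threshold."""
    # Normalize case and whitespace so "Blue Cheese" and "blue cheese" pool votes.
    votes = Counter(anchor.strip().lower() for anchor in inbound_anchors)
    return {phrase for phrase, count in votes.items() if count >= threshold}


# Made-up anchor texts pointing at one page.
anchors = [
    "Blue Cheese", "blue cheese", "Blue Cheese",
    "Blue Widgets", "blue widgets", "Blue Widgets",
    "Kitchen Equipment",
]
passed = themes_passing_threshold(anchors)
```

With these made-up anchors the page passes for both "blue cheese" and "blue widgets", which matches the point that one page can be relevant for more than one theme if it has inbounds for both.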
If you cross a certain line of on-page factors, another theme might be evident to you, based on the title/content. But if the page does not have any support for that theme in the incoming anchor text, this may be viewed as trying to game the system if Google doesn't understand the relation. "Blue Cheese" IS relevant to "Kitchen Equipment" to some degree. Google might not know this.
Another, blunter example is mixing up "thematic relevancy" with "semantic relevancy": your "Blue Cheese" page starts to have an excessive number of instances of blue things, like "Blue Widgets" or "Blue Hotels". Google will think this is because you noticed you can rank well for Blue and tried to add a couple of money terms that are semantically relevant. But what AdWords, Overture or Trends, or in fact Google Search, does not show is that the algo now knows these things are not related.
Question is... to what degree is this filter programmed.
1. If you have N number of kinds of phrases on a page that are only semantically relevant ( i.e. as "blue cheese" is relevant to "blue widget" ), and you don't have support for both, your site gets busted. If popular phrases that you know to be thematically relevant to your page aren't in the Google database as such, you're busted. Based on the previously mentioned problem, if you have a website that's relevant for modeling, and add internal links with names of wars all over, Google may not find the connection.
2. If you do a search on AdWords for "Blue", you'll get a mostly semantically relevant list of keyphrases that include "blue", are synonyms of it, include its synonyms, or are otherwise related to it. A human can identify the "sets" within these phrases and subdivide the list into themes. Spam does not do this, or so Google engineers thought.
3. So there are subsets in the hands of Google that further specify which word is related to which. These are themes. You'll see sites rank for synonyms within these sets if they're strong enough on a theme, even without anchor text strengthening the relevance. A site that's #1 for "Blue" might rank #9 for "Azure" without even trying too hard.
4. If you have a site about "Cheese", you can have "Blue Cheese" and even "Blue Cheddar" in the navigation, titles, text, for they are included in the same subset. You can't have "Blue Widgets" on the "Blue Cheese" page.
5. What constitutes these sets? Who decides on themes and based on what? What is the N number of "mistakes", how well determined are these?
But then, so are the SERPs right now. There are at least 4 different kinds of ranking I've seen in the past 3 days.
So far I've only seen instances of filtered pages when 5 to 6 themes collided all at once. That's quite easy to do by chance if you have a completely legit "partners" or "portfolio" page with descriptions and/or outbound text links. But it only happens when a single theme is supported by the navigation/inbounds, and only if there is a decided theme for the page. If there's no theme ( navigation and - lack of - inbounds doesn't strengthen either ) I'd say Google passes on penalizing.
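Here's a rough sketch of the collision idea, in Python. It's pure guesswork dressed up as code: the phrase-to-theme table, the function names and the limit of 5 are all hypothetical, picked only to mirror the observations above ( 5 to 6 themes colliding, a decided theme supported by navigation/inbounds, and no penalty when there's no decided theme ).

```python
# Hypothetical phrase-to-theme table. The real sets, if they exist, are unknown.
PHRASE_THEMES = {
    "blue cheese": "food",
    "blue cheddar": "food",
    "blue widgets": "widgets",
    "blue hotels": "travel",
    "kitchen equipment": "kitchen",
    "scale models": "modeling",
}


def themes_on_page(page_phrases, phrase_themes=PHRASE_THEMES):
    """Map the phrases found on a page to the set of themes they belong to."""
    return {
        phrase_themes[p.lower()]
        for p in page_phrases
        if p.lower() in phrase_themes
    }


def looks_filtered(page_phrases, supported_themes, collision_limit=5):
    """Guess whether a page would trip the filter described above."""
    supported = set(supported_themes)
    # No decided theme: the assumption here is that no penalty applies.
    if not supported:
        return False
    on_page = themes_on_page(page_phrases)
    unsupported = on_page - supported
    # Trip only if unsupported themes exist and enough themes collide at once.
    return bool(unsupported) and len(on_page) >= collision_limit


# A made-up "partners" page on a cheese site, supported only for "food".
partners_page = [
    "Blue Cheese", "Blue Widgets", "Blue Hotels",
    "Kitchen Equipment", "Scale Models",
]
flagged = looks_filtered(partners_page, {"food"})
```

The same page with no supported theme at all comes back unflagged, and a page that only mentions "Blue Cheese" and "Blue Cheddar" stays within its one theme and isn't flagged either.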
As for the themes, I was thinking perhaps Google went back to the good old directory age, and started from there. Remember how you started with the broad relevancy, then narrowed it down to a theme, then an even closer match? With cross references where applicable.
This isn't new. Penalties that are based on it are.
If there is such a penalty, it is along these lines.
[edited by: tedster at 9:16 pm (utc) on Feb. 27, 2008]
I'm wondering if I should do a nofollow on those pages. No one is finding them through Google anyway and I don't want them to poison other pages. Or am I being irrational?
Strange thing is I got the content's page back some time ago and it's still back.
Thinking about it, it could be that one of my competitors stole my data and republished it on their site (mine has been up for three years but apparently Google doesn't care).
These guys at Google should at least give a hint as to what their criteria are. The SERPs are messed up big time and sites that copy content are up.
Question to those hit with -950: Did the pages/sites that were hit have inbound links from Wiki?
Not me. Perhaps the Wiki webmasters knew more than we did, to nofollow their OBLs, or is it a coincidence?
By the way, I do doubt that Wiki's nofollow move has already taken effect in the SERPs.
[edited by: Martin40 at 6:26 pm (utc) on Mar. 27, 2007]
It's all a big brain battle.
The webmaster community may not be an army of PhDs but it sure is full of experts in certain areas. No better way to learn the tricks of the trade than making a living out of it.
I'd imagine that some of the filters were actually thought up in threads of webmasters discussing other filters.
But even so, I still like to think of Google and the rest of the web as living in symbiosis, and that at least a minor point of us having these discussions, at least half of which are conspiracy theories, is to keep this system balanced when it's rocked by an algo change / new filter.
Let's not forget that it's in everyone's interest that webmasters produce / manage good websites, and Google reports relevant results.
The efforts of the two sides even each other out in this natural process. No ill intentions, just plain old survival instincts.
And if someone knows the kind of postings I make here they'll know I'm not taking sides. I simply want to keep on advancing forward.
I'm sure Google realizes this is occurring, but for some as yet undetermined reason, they feel the benefit is worth the cost. I neither see nor understand what benefit to the SERPs this new phenomenon has produced.
This problem strikes at the heart of WebmasterWorld. It's what WebmasterWorld was originally all about: how search engines rank pages and WHY.
Here's what I did.
1. Removed AdSense from the 100 new pages that I had added to the site two weeks before the -950 kicked in.
2. Used a graphic rather than a text link to link back to my homepage.
3. Acquired two high PR links (PR7 and PR8) back to the site.
Of course, it could have nothing to do with what I did. Who knows.
It's way too mind-boggling to be guessing all the time.
Remember, if Google catches you in a link scheme of some sort or they see unnatural linking, they are going to smack you hard.
First the -30 penalty, now the -950 penalty which may even be harsher....
When I look around the 950 bottom of the barrel, not only do I see some quality sites, but also entire dmoz categories! Last week I even spotted the Google directory version of a dmoz category down in the dumps...
Oddly for terms I watch, the Yahoo directory link stays fairly stable in the SERPs.
I agree. You have to take a look at how Google smacks a site. We have seen -30 and -950 penalties. Most of the time, when people find themselves there Google is viewing them in one of three ways.
1. The site breached trust, maybe via links.
2. I have to love it when someone says they are an "authority" site with thousands of pages, then all of a sudden you dig deeper in the content and find a bunch of copied and pasted content from other sites. Google can easily spot this. If you do this and Google catches you, well then there is a trust breach there again.
3. Google looks at the traffic and metrics of the site, and the algo does not look at the site as "value added" to the web. Google can monitor bounce rates, and if a site has Analytics or AdSense on it, Google will know every time a page is viewed and will be able to gain detailed information on it. This would fall under the MSSA penalty which I believe is alive and well. Remember, just because the webmaster thinks the site is really good does not mean users think the same.
I would have to say, most sites that have been hit, more than likely breached their trust rank, which can take forever to gain back.
1) Does the 950 penalty seem to only apply to ecommerce websites? Or a variety of websites?
We are an ecommerce website.
2) If so - do you use a shopping cart system hosted on (and linking to) a different, external website?
We do. Would being the owner of both sites matter?
3) In your navigation links for your website - is a certain term used more frequently?
Example: www.widgets.com - May have whole areas dedicated to different types of widgets. The navigation would link to these areas and pages. Useful navigation would be something like Blue Widgets, Green Widgets, Custom Widgets, etc.
Could Google then think that because Widgets shows up 300 times the website must be some sort of spam site?
This theory doesn't completely work for us - because we have pages at #1 that use widgets in the title and links, but other pages that are -950 and don't have widgets in / on them. (Except of course the navigation is the same on all the pages!)
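If you want to check your own navigation for this, a quick count is easy enough to script. Below is a minimal Python sketch that tallies how often each word shows up in a list of navigation link texts. The link texts are made-up examples, the function name is mine, and there is no known figure for what share Google would consider excessive ( if it looks at this at all ), so the number it gives you is only something to compare across your own pages.

```python
import re
from collections import Counter


def nav_term_counts(link_texts):
    """Count how often each word appears across navigation link texts."""
    counts = Counter()
    for text in link_texts:
        # Lowercase and split on anything that isn't a letter or digit.
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return counts


# Made-up navigation for a widgets site.
nav_links = ["Blue Widgets", "Green Widgets", "Custom Widgets", "Contact Us"]

counts = nav_term_counts(nav_links)
top_term, top_count = counts.most_common(1)[0]
# Share of all navigation words taken up by the most frequent term.
top_share = top_count / sum(counts.values())
```

Run it against the link texts from a page that ranks and a page that's at -950 and see whether the top term's share actually differs between them.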
4) Is there anyone to contact at Google to have them check our website? I tried calling but got nowhere. Just an automated response saying they do not do live customer support for search results and to check out the FAQ on their site.
Can anyone send me the email of someone who works at Google?
That's the point. Obviously they know what they are doing is working very badly. I would say almost as obviously, if they stop doing what they are doing to save the good sites, the results would be even more pathetic than they are now.
This has happened many times in Google's history. They do something foolish on its face, but they decide the alternative is worse.
Until they become a better search engine, we are stuck with "damned if you do, damned if you don't" lameness like this.
This is one of the strangest effects Google has generated to date. After all these months, it doesn't look like it's going away, either. One of the main reasons I would like to understand it better is that I think it has implications for ranking well in today's algo, in addition to not being "penalized".
3. Acquired two high PR links (PR7 and PR8) back to the site.
A strong page with good quality inbound links can overcome a combination of words or phrases that would otherwise bring on the filter. Sometimes it just takes one good link.
I'm convinced this is related to the phrase-based anti-spam patents, and some of our pages are just getting caught in it because something about our navigation or wording is similar to something that spam sites have.
This is one of the strangest effects Google has generated to date.
And this is also the first time I can ever recall where the WebmasterWorld braintrust has been totally stumped as to the PURPOSE of a penalty/phenomenon.
I'd say it was probably this one
3. Acquired two high PR links (PR7 and PR8) back to the site.
Wouldn't that be a hoot if the only way to break the -950 was to buy links. How ironic.
- It's a penalty for buying links
Yes, if Google has been on a hunt for these, and devalued a lot of them. Devalued is the key. Rocks the tight balance of the linking profile, perhaps makes a site lose PR or trust, or just sets back the anchor text pattern to be repetitive, excessive, or downright exclude one of the phrases you've never thought to be important to your theme.
- It's a penalty for thin/lame/erroneous sites
Yes, if some of the pages have been dropped from the index, breaking important internal navigation hubs that let trust, PageRank and relevancy spread throughout the site. Duplicate content, a falsely set title tag, HTML errors, you could name whatever to be the cause but these are not the direct reasons for this filter being applied, we know that. If your site is messed up, its consistency will break, and with it all the well planned funnels will break apart too, resulting in a fragmented relevance matrix that simply doesn't make sense to Google anymore.
- It's for shaking up the SERPs to bring new URLs to the surface
Yes, as in everything Google does to its algo is for this sole purpose.
- It's for Google being able to monetize on AdSense / AdWords
Yes, sure. I'm not saying they're not trying to do so, all I'm saying is, they wouldn't want to get rid of all the quality sites so the Google-junkie public would need to click on AdSense ads to find anything. For that would only work for about a year before they'd realize it's not the good old Google anymore, and switch. Do you think that the people who use Google as if it was a directory, and for years never ever bookmarked a single page, don't suffer from missing those high-value sites? These people are LOST! I've been hearing them whining "but it was there yesterday" so many times the last year I started to re-educate them on how to add a site to their "favourites".
- Google messed up, and does nothing about it
... our sites, even the most well known one, get about 30-40% of their traffic from Google. I can't afford not to adapt. How about you?
- Google messed up, and only has enough resources to care for popular searches
Yes. See the above answer.
- We've recovered and we did nothing!
Yes, but Google did. They may have modified the thresholds, they may have added / removed phrases from themes. Good for you, and wishing for your sites to stay that way.
This thread, as any thread here, wasn't only about the -950 penalty.
It was about staying properly indexed.
And apart from results being localized with ever more re-ranking, and Trust being even more closely tied to relevancy, I see only one major change that occurred, which is...
Phrase-based filtering and re-ranking.
That filter goes a little overboard, but until it's fixed, we have to deal with it.
If your site does OK for a theme, and you aim hard for something that's not within that theme ( by Google - and AdWords - standards ), you get zapped. Then you can either back out of the alley by modifying your navigation / titles to say, "Oh I didn't mean to, actually I know I'm not the most relevant for this" or rush ahead, get some quality links with the "new" theme / phrases and support your site's relevancy. Google has no other means to tell whether a site is fit for, say, a 10,000 queries/month two-word money phrase than to look at its off-page factors, and match them up to what's actually on the page. And yes, a site that has links to it describing it as something else has the potential to be a latent result for the term, once you add it to its title/body. Don't tell me you didn't know this, for that's how Google works. And it works backwards as well. It's becoming more and more complicated, that's all.
( on-page relevancy, internal navigation relevancy, inbound link relevancy, link source relevancy, phrase based filtering, excessive repetitive anchor text filtering, broken navigation because of duplicate content, trustrank being lost, excessive crosslinking, sitewides being devalued, phrase-sets for a theme being messed up either on your site or in Google, no proper funnels, high bounce rates, no clickthroughs... you have to pay attention to everything. )
It'd be great if people who have seen these tactics bring out sites from the mud could chime in and tell their story ( if there are any such stories ). In fact some of those who were most active for the first couple of hundred messages are off to other topics. I assume they either succeeded or gave up. But there's a big difference between the two.
Let's keep the discussion going. SEs evolve fast, and we need to as well. Everything I say today might be obsolete within a month.
[edited by: Miamacs at 11:31 am (utc) on April 4, 2007]
Edit: It seems quite clear to me that if I search using the AdWords keyword tool on the root phrase, the page(s) would match a high percentage of these terms. That's not because they are spammy, it's because they're quality and cover the topic in an in-depth way.
[edited by: Nick0r at 11:50 am (utc) on April 4, 2007]