Forum Moderators: Robert Charlton & goodroi
I've been so quiet because MHes has said most of the things I would have said anyway.
Mind you, part of this is theory. I see instances, observe certain patterns and behaviours, then analyze what's in front of me. The end result is what I call SEO. For the rest of the day, at least.
Some points to remember...
As Martin mentioned, the anchor text of links from trusted/locally trusted sites decides 98% of what's in the SERPs. Title and body text are criteria for being relevant/filtered, but they are essentially binary factors: if present, and matching the incoming anchor, or even the theme of the anchor, the page will rank. Meta is optional.
Title and the content text have two characteristics that are connected to this problem.
One being that every single word and monitored phrase gets a score. Seven-word phrases are not monitored. Monitoring is probably decided based on search volume and advertiser competition, i.e. MONEY, so there's not an infinite number of them.
Second, should the page gather enough votes from inbounds/trust or localrank through its navigation for any single word/watched phrase, it passes a threshold that decides the broad relevance of the page. The page could be relevant for more than one theme: it could be relevant for both "Blue Cheese" and "Blue Widgets" if it gets inbounds for both themes. (Note I'm oversimplifying things; relevance is calculated long before that.) If it's relevant for "Cheese", Google knows it's probably about "food".
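To make the vote-and-threshold idea concrete, here is a toy sketch. All phrases, scores, and the threshold value are invented for illustration; nothing here is Google's actual mechanism.

```python
# Toy model: a page accumulates "votes" per monitored phrase from inbound
# anchors and navigation; crossing a threshold marks that phrase as a
# supported theme of the page. All numbers are hypothetical.

THEME_THRESHOLD = 10.0  # hypothetical score needed for a phrase to become a theme

def supported_themes(votes_by_phrase, threshold=THEME_THRESHOLD):
    """Return the phrases whose accumulated vote score passes the threshold."""
    return {phrase for phrase, score in votes_by_phrase.items() if score >= threshold}

votes = {
    "blue cheese": 14.5,   # strong inbound anchor support
    "blue widgets": 11.0,  # also supported: a page can carry two themes
    "azure cheese": 2.0,   # mentioned in content only, stays below threshold
}
print(supported_themes(votes))  # both "blue cheese" and "blue widgets" qualify
```

Note how "azure cheese" never becomes a theme of its own even though it appears on the page, which matches the observation below that such phrases only benefit indirectly.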
The theme of the page will now make it rank better for certain queries. These aren't necessarily semantically related. A site that ranks #1 for "Blue Cheese" may rank relatively better for "Azure Cheese" than before, even though this phrase is nowhere in the anchors or titles and only appears in parts of the content.
If you cross a certain line of on-page factors, another theme might become evident based on the title/content. But if the page does not have any support for that theme in the incoming anchor text, this may be viewed as an attempt to game the system if Google doesn't understand the relation. "Blue Cheese" IS relevant to "Kitchen Equipment" to some degree; Google might not know this.
Another, blunter example is mixing up "thematic relevancy" with "semantic relevancy": when your "Blue Cheese" page starts to show an excessive number of blue things, like "Blue Widgets" and "Blue Hotels", Google will think this is because you noticed you can rank well for "Blue" and tried to add a couple of money terms that are semantically relevant. But what AdWords, Overture, Trends, or in fact Google Search doesn't show... is that the algo now knows these things are not related.
The question is... to what degree is this filter programmed?
...
1. If you have N kinds of phrases on a page that are only semantically relevant (i.e. as "blue cheese" is relevant to "blue widget"), and you don't have support for both, your site gets busted. If popular phrases that you know to be thematically relevant to your page aren't recorded as such in the Google database, you're busted. Based on the previously mentioned problem, if you have a website that's relevant for modeling and add internal links with names of wars all over, Google may not find the connection.
2. If you do a search on AdWords for "Blue", you'll get a mostly semantically relevant list of keyphrases that include / are synonyms of / include synonyms of / are related to "blue". A human can identify the "sets" within these phrases and subdivide the list into themes. Spam does not do this, or so Google engineers thought.
3. So Google holds subsets that further specify which word is related to which. These are themes. You'll see sites rank for synonyms within these sets if they're strong enough on a theme, even without anchor text strengthening the relevance. A site that's #1 for "Blue" might rank #9 for "Azure" without even trying too hard.
4. If you have a site about "Cheese", you can have "Blue Cheese" and even "Blue Cheddar" in the navigation, titles, text, for they are included in the same subset. You can't have "Blue Widgets" on the "Blue Cheese" page.
5. What constitutes these sets? Who decides on themes, and based on what? What is the N number of "mistakes", and how well determined are these?
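The subdividing described in point 2 can be sketched in code. This is only a stand-in: a real system would presumably build its sets from co-occurrence and query data, while this toy just groups phrases sharing the word "blue" by their last remaining word. All keywords are made up.

```python
# Toy illustration: subdivide a keyword list that shares the word "blue"
# into theme subsets by the remaining head word. A crude stand-in for the
# human judgment (or co-occurrence data) the post describes.

from collections import defaultdict

def group_by_theme(phrases, shared_word="blue"):
    themes = defaultdict(list)
    for phrase in phrases:
        rest = [w for w in phrase.lower().split() if w != shared_word]
        key = rest[-1] if rest else shared_word  # crude head-noun guess
        themes[key].append(phrase)
    return dict(themes)

keywords = ["blue cheese", "blue cheddar cheese", "blue widgets", "blue hotels"]
print(group_by_theme(keywords))
```

Under this sketch "blue cheese" and "blue cheddar cheese" land in one subset while "blue widgets" and "blue hotels" each get their own, mirroring the idea that "Blue Cheddar" belongs on a "Blue Cheese" page but "Blue Widgets" does not.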
But then, so are the SERPs right now. There are at least four different kinds of ranking I've seen in the past three days.
...
So far I've only seen instances of filtered pages when 5 to 6 themes collided all at once. That's quite easy to do by chance if you have a completely legit "partners" or "portfolio" page with descriptions and/or outbound text links, but only a single theme supported by the navigation/inbounds, and only if there is a decided theme for the page. If there's no theme (neither navigation nor the, lacking, inbounds strengthens any), I'd say Google passes on penalizing.
As for the themes, I was thinking perhaps Google went back to the good old directory age and started from there. Remember how you started with the broad relevancy, then narrowed it down to a theme, then an even closer match? With cross references where applicable.
...
This isn't new. Penalties that are based on it are.
If there is such a penalty, it operates along these lines.
[edited by: tedster at 9:16 pm (utc) on Feb. 27, 2008]
I've seen it hit quite a few in the industries I...ahem...track. On some of the two word phrases, it has been mixed.
1. Yes on a few (notably 'payday' related...stupid candy bars; I mean, they don't even have any chocolate!)
2. No on most
The clustering could very well be a piece involved in the re-ranking, if they are trying to balance the SERPs a bit more. But on those affected, my gut feeling is that it's happening because re-ranking fractional multipliers are being incurred more due to co-occurrence of keywords and/or [not enough localset inlinks / authority-to-nonauthority inlinks].
Tell you what, as a joke I'm going to include some co-occurrence keywords with an alternative meaning to see if it affects one of the affected sites (without otherwise modifying the inlinks / co-occurrence weighting of existing on-theme phrases).
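A toy illustration of the co-occurrence counting being speculated about here: tally how often pairs of watched phrases appear on the same page. The phrases, pages, and the idea that such counts feed a re-ranking multiplier are all assumptions for illustration, not a known Google implementation.

```python
# Toy co-occurrence counter: how often do pairs of monitored phrases
# appear together on the same page? A signal like this could, speculatively,
# feed a re-ranking multiplier. All data below is invented.

from collections import Counter
from itertools import combinations

def cooccurrence_counts(pages, watched):
    counts = Counter()
    for text in pages:
        present = sorted(p for p in watched if p in text.lower())
        counts.update(combinations(present, 2))  # every pair found on this page
    return counts

pages = [
    "our blue cheese and blue widgets catalogue",
    "blue cheese recipes and blue cheese dressing",
    "cheap blue widgets with blue hotels deals",
]
watched = {"blue cheese", "blue widgets", "blue hotels"}
print(cooccurrence_counts(pages, watched))
```

The second page yields no pair at all, which is the point: a page that sticks to one watched phrase contributes nothing to cross-theme co-occurrence.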
Cygnus
No, my keywords only mean one thing.
Also it has affected all keywords related to that keyword.
So if the keyword in question is CARS then it also affected FORD, CHEVY, DODGE, TOYOTA and so on.
Also I have noticed I can't even get the site to rank if I type KEYWORD DOMAIN.com and do a search. It's still buried or not ranking well.
Now, there is another, unrelated section of the site that is bouncing around but still ranking well.
Now if I do a search for a keyword from the good section and add a keyword from the bad section, I can sometimes get rank.
I hope that makes sense. I wish we could just use the keywords sometimes.
I'm starting to think Google is trying to separate themes.
Let's say you have a site about shoes and boots, and Google decides shoes and boots don't go together: you will only pull rank for one or the other, but not both. This is what I'm starting to see.
I know this blows away wiki and shopping portal sites like Amazon, but I think they get away with it because they are so trusted.
Who knows but I hope it can be cracked soon.
Do any other of the problem searches people are seeing include a word with a potentially ambiguous meaning?
No, in fact I see the opposite. One section of my site is related to a word which has an ambiguous meaning. Through all of these data refreshes this is the only section of the site that hasn't been impacted. The other sections of my site without an ambiguous keyword are the sections that get hit. So perhaps ambiguity has something to do with it in a way we don't totally understand.
"Neon" could mean:
Neon Gas
Neon Light
Dodge Neon
Neon Signs
Remember, people are lazy when they search. They type in one word and see what comes up; if they don't like what they see, they hit the back button and type in two keywords. So if Google figures out that 95% of the people typing in the keyword "neon" are actually looking for "dodge neon", they will change the SERPs for the keyword "neon" so users can find things more easily. If you want a good example, look at the top 100 results for the keyword "neon" and you will see a wide variety of things.
So look at the ways that one keyword can be interpreted into two or three keywords.
Another Example “Bathroom”
Could mean
Bath tub
Bath towels
Toilet
Etc…
See where this is going?
Remember, Google builds its engine around users and the keywords they type in.
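The refinement behavior described above can be sketched from a toy query log: measure what fraction of sessions that start with a one-word query immediately retype a longer one. The log format, the function, and the numbers are all illustrative assumptions, not anything Google has published.

```python
# Toy sketch: if most sessions starting with "neon" are refined to
# "dodge neon", the dominant intent behind the short query can be inferred.
# Sessions are lists of successive queries; all data is made up.

from collections import Counter

def refinement_share(sessions, seed="neon"):
    """Fraction of refined seed-query sessions going to each follow-up query."""
    follow_ups = Counter()
    total = 0
    for queries in sessions:
        if seed in queries:
            i = queries.index(seed)
            if i + 1 < len(queries):  # the user typed another query right after
                total += 1
                follow_ups[queries[i + 1]] += 1
    return {q: n / total for q, n in follow_ups.items()} if total else {}

log = [["neon", "dodge neon"], ["neon", "dodge neon"],
       ["neon", "neon signs"], ["neon"], ["cars", "ford"]]
print(refinement_share(log))
```

On this tiny log, two thirds of refined "neon" sessions go to "dodge neon", the kind of skew the post suggests could justify reshuffling the SERPs for the one-word query.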
One member here emailed me. His issue was due to spammy looking subdomain internal links and duplicate content. Once he fixed it with 301 redirects and dropping his subdomains, he bounced right back and saw immediate improvement.
I have yet to find a site where linking was not causing this issue.
example:
Is it bad to have the same keyword in every link back to the home page?
Is it bad to have different keywords in links back to the home page?
Or what else could be an issue? I don't think it's the links, when I look at the different sites in the SERPs.
In my case I don't have any subdomains.
Is it bad to have different keywords in links back to the home page? No
Is it bad to have "boiler plate" links? (Go back and read this thread.) Yes. I remember Adam mentioning it and tedster talking about it: [webmasterworld.com...]
Is un-natural link growth bad? Yes
Is a link exchange bad? Yes, one bad neighborhood can ruin your good neighborhood
Paid text links bad? Yes, Google can spot some of them
Affiliate links bad? Oh man, affiliates took a beating the past few months.
Is it bad to have a link from a "Search Engine Optimization" (SEO) site? It can be, especially if the SEOs are using link schemes.
Are really long URLs bad? Yes, they can put you in the supplemental index.
Is Google cracking down on sub domain spam? Yes, be very careful to make sure your sub domains do not have spammy looking links.
Are these things new? No, we have all been given fair warning. Typically Matt or Adam give us a heads up that changes are going to happen and write about them three to six months in advance. They then make changes in the algo; it gets tested and goes through an approval process on a data center, and typically 2-6 months later it gets implemented. The implementation is subtle since Google slowly crawls the web.
Last but not least, have backlinks been devalued across the web? Yes, Wiki links have been devalued due to nofollow. Imagine the ripple effect: if Wiki linked to an external page that linked to your site, you all of a sudden lost some link power and probably even some trust rank.
Now, how to pull out of the penalty if you are caught in it:
1. If you have a link exchange page or are in a link exchange, drop it, then sit back and wait. This can take some time to fix, because link exchange pages are typically in the supplemental index and it takes a while for them to be re-crawled.
2. Make sure your duplicate content is cleaned up and that you do proper 301 redirects. No www vs. non-www issues.
3. Look very hard at your external links. Look via AltaVista, because it seems to show links the best. If you see a bad neighborhood, email the webmaster and ask them to take the link down. If you did not give permission for the link to be placed, you can also file a DMCA complaint if they used your content or URL, since it is yours. Yes, you can file DMCA complaints over links; think about it: if I took Pepsi's logo, set up a drink stand, and sold my own version of cola, Pepsi could sue me for the use of their logo. The same applies to your URL: it's yours and you own it, so if you find it being used on an advertising site, you have full rights to have it removed if you did not give permission.
4. Get one or two "trusted links" with high PR to your site. A few members did that and they pulled out.
5. Read the boiler plate thread and clean things up if necessary.
A bad site linking to you should not have any effect on your site.
I'm not so sure about the bad neighborhood statement. Here is a statement right from the Google guidelines; it's a cut-and-dried statement. I would venture to say "affiliate links" sometimes get caught up in this as well.
"Don't participate in link schemes designed to increase your site's ranking or PageRank. In particular, avoid links to web spammers or "bad neighborhoods" on the web, as your own ranking may be affected adversely by those links"
If google even thinks the webmaster is trying to game it, you will get hit hard with a penalty.
As far as white hat or black hat, neither really apply anymore.
We should term it Google hat.
My web site has not been caught up in the recent Google fluctuations (except minor ones that were hardly noticeable). It has been online for 9 years (both original and reprinted content) and has only been completely filtered from the index once, in June 2005. It returned in September 2005 with better rankings than ever and hadn't dropped out since, until March 7th.
Do any other of the problem searches people are seeing include a word with a potentially ambiguous meaning?
Yes, but it's spotty. I've found a couple that can have another meaning, and that meaning is highly advertised, so they could be phrases that are watched by Google.
I don't think across the board changes will solve this. For example it's not a matter of too many internal links with identical anchor text. But if the anchor text has a word or phrase that is similar to what spam sites use then it could hurt.
I read a comment you made on an AdSense thread about sending out 30,000 email newsletters a month to subscribers. If those are going to Gmail addresses and people are flagging them as spam in Gmail, that could very well affect you.
That brings up a good point: how many of you send out weekly or monthly emails such as advertisements or newsletters? I wonder if Google is looking at Gmail filters to try to figure out who is spamming and who is advertising. If 90% of the Gmail users receiving emails from your site mark them as spam, you might have an issue. I can see where Google might want to use that data.
[edited by: trinorthlighting at 5:00 pm (utc) on April 12, 2007]
[appft1.uspto.gov...]
I was told months ago by a Google rep that my site had been filtered due to backlink issues. Specifically, I was told that it appeared I was buying links to manipulate PageRank. My site has many high-quality, naturally occurring links from other sites like Wikipedia and many edu sites, and I've never had a reason to buy links. I couldn't believe that I was being filtered for PR manipulation.
For over a year I've had a spider trap in place which limited the speed at which pages could be viewed. I was very proud of the fact that very little of our content made it to scraper sites.
After being filtered for 6 months I decided to remove the spider trap, and for about a week now the filter has been lifted. I'll repeat, I removed the spider trap, allowing scrapers to get at my content, now the filter has been lifted.
I have a supportable theory as to why this would happen, and if this fits the pattern for other sites I'd be happy to share...
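For readers unfamiliar with the term, the "spider trap" described above is essentially a per-client request throttle. Here is a minimal sketch of the kind of limit such a trap might enforce; the thresholds, class name, and in-memory store are all assumptions for illustration, not the poster's actual setup.

```python
# Minimal per-IP throttle sketch: block a client that requests more than
# max_requests pages within a rolling window. All values are illustrative.

import time
from collections import defaultdict, deque

class SpiderTrap:
    def __init__(self, max_requests=10, window_seconds=5.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def allow(self, ip, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        while q and now - q[0] > self.window:
            q.popleft()  # drop hits that fell out of the window
        if len(q) >= self.max_requests:
            return False  # too fast: likely a scraper, serve a block page
        q.append(now)
        return True

trap = SpiderTrap(max_requests=3, window_seconds=1.0)
print([trap.allow("1.2.3.4", now=t) for t in (0.0, 0.1, 0.2, 0.3)])
# → [True, True, True, False]
```

Removing a trap like this simply means every request passes, which is what let the scrapers back in; the interesting question the poster raises is why that lifted the filter.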
While I may not agree with the basis of some of the re-rank fractional multiplying that seems to occur, the weighting of "something" is just completely off. I just haven't figured out what the "something" is yet. It doesn't seem to be forms of on-page optimization or keyword usage, site age, or even theme of root domain for that matter. Is it links?
I haven't done enough testing to see how much of a localset for the returned 1000 the top 20 of certain phrases have, but then again, that localset should probably be a lot larger for the related phrases...anyhow, that's the one piece left for me to really burn through.
Otherwise, I have found a few instances where even strong global authority links that wouldn't logically be in the localset weren't enough to counteract what is going on.
So... wanna get rid of a few competitors? It isn't looking too difficult. Tsk tsk, Google.
Cygnus
After being filtered for 6 months I decided to remove the spider trap, and for about a week now the filter has been lifted. I'll repeat, I removed the spider trap, allowing scrapers to get at my content, now the filter has been lifted.
Yep, I have spider traps on all my sites and would like to hear the rest of your theory.
I have a supportable theory as to why this would happen, and if this fits the pattern for other sites I'd be happy to share...
I'm glad to see you have positive results. From what I see here it will take more time to be sure if it will stick.
I was told months ago by a Google rep that my site had been filtered due to backlink issues.
This is truly worrisome information straight from Google. I've been suspecting all along that scraper pages linked to the 950ed pages might be part of the problem. Even if it is only the fact that they might be linking to the page with a phrase that is flagged with the Google phrase based filter.
I read in another thread that you have a recip links page. That is probably what is causing your site some grief. Remember, the algo is completely automated with very little human input. You probably need to take a long hard look at who you're linking to and whether they are spamming.
Remember, Google guidelines state not to have your site link to bad neighborhoods. If one of the sites you are linking to is spamming Google, it can have a drastic effect on your site. Check to see if all the sites you link to are following Google guidelines. If they are not, you might want to drop that particular link.
< continued here: [webmasterworld.com...] >
[edited by: tedster at 4:29 am (utc) on April 21, 2007]