Thank you,
Ryan Allis
On November 15, 2003, the SERPs (Search Engine Result Pages) in Google were dramatically altered. Although Google has been known to go through a reshuffling (appropriately named a Google Dance) every 2 months or so, this 'Dance' seems to be more like a drunken Mexican salsa than its usual conservative fox-trot.
Most likely, you will already know if your web site has been affected. You may have seen a significant drop-off in traffic around Nov. 15. Three of my sites have been hit. While one could understand dropping a few positions, since November 15 sites that previously held top rankings are nowhere to be found in the top 10,000 results. Such radical repositionings have left many mom-and-pop and small businesses devastated and out of luck for the holiday season. With Google controlling approximately 85% of Internet searches, many businesses are finding a need to lay off workers or rapidly cancel inventory orders. This situation deserves a closer look.
What the Early Research is Showing
Early research suggests that Google has put into place what the industry has quickly termed an 'Over Optimization Penalty' (OOP), which takes into account the incoming link text and the on-site keyword frequency. If too many sites that link to your site use link text containing a word that is repeated more than a certain number of times on your home page, that page will be assessed the penalty and either demoted to oblivion or removed entirely from the rankings. In a sense, Google is penalizing sites for being optimized for the search engines, without any forewarning of a change in policy.
Here is what else we know:
- The OOP is keyword specific, not site specific. Google appears to have applied the OOP only to certain keywords.
- Many of the previous listings for certain highly competitive keywords have disappeared.
How to Know if Your Site Has Been Penalized
There are a few ways to know if your site has been penalized. The first, mentioned earlier, is a significant drop in traffic around the 15th of November. Here are ways to be sure:
1. Go to google.com. Check your site logs to see which terms sent you search engine traffic, then type in any search term you recall being well ranked for. If your site is nowhere to be found, it has likely been penalized.
2. Type in the search term you suspect being penalized for, followed by "-dkjsahfdsaf" (or any other similar gibberish, without the quotes). This appears to bypass the OOP, so the results show roughly where you would rank if the penalty were not in effect (a small query-building sketch follows this list).
3. Or, simply go to www.**** to have this automated for you. Just type in the search term and see quickly what the search engine results would be if the OOP was not in effect. This site, put up less than a week ago, has quickly gained in popularity, becoming one of the 5000 most visited web sites on the Internet in a matter of days.
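For anyone who wants to script step 2, here is a minimal sketch (Python, my own illustration, not a tool Google provides) that simply builds the two search URLs to compare by hand: the normal query and the same query with a gibberish exclusion appended. The gibberish token and the example search term are arbitrary.

```python
from urllib.parse import quote_plus

def google_query_urls(term, gibberish="dkjsahfdsaf"):
    """Return (normal_url, bypass_url) for a search term.

    Excluding a nonsense word ("-dkjsahfdsaf") is the trick described
    above: it appears to return the pre-filter rankings, so comparing
    your position in the two result sets hints at whether the OOP hit you.
    """
    base = "http://www.google.com/search?q="
    normal = base + quote_plus(term)
    bypass = base + quote_plus(term + " -" + gibberish)
    return normal, bypass

# Example: open both URLs in a browser and compare where your site appears.
for url in google_query_urls("florida web designer"):
    print(url)
```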
The Basics of SEO Redefined. Should One De-Optimize?
Search engine optimization consultants such as myself have known for years that the basics of SEO are:
- put your target keyword or keyphrase in your title, meta-tags, and alt-tags
- put your target keyword or keyphrase in an H1 tag near the top of your page
- repeat your keyword or keyphrase 5-10 times throughout the page
- create quality content on your site and update it regularly
- use a site map (linked to from every page) that links to all of your pages
- build lots of relevant links to your site
- ensure that your target keyword or keyphrase is in the link text of your incoming links
Now, however, the best practices for keyword frequency and link text will likely trigger the Google OOP. There is no denying that many low-quality sites have used link farms and spammed blog comments in order to increase their PageRank (Google's measure of site quality) and link popularity. However, a distinction must be made between these sites and quality sites with dozens or hundreds of pages of well-written informational content that have taken the time to build links properly.
So if you have been affected, what can you do? Should one de-optimize their site, or wait it out? Should one create one site for Google and one for the 'normal engines'? Is this a case of a filter being turned on too tight that Google will fix in a matter of days, or something much more?
These are all serious questions that no one seems to have answers to. At this point we recommend making the following changes to your site if, and only if, your rankings seem to have been affected:
1. Contact a few of your link partners via email. Ask them to change the link text so that the keyword you have been penalized for is not in the link text or the keyphrase is in a different order than the order you are penalized for.
2. Open up the page that has been penalized (usually your home page) and reduce the number of times that the keyword appears on it. Keep the number under 5 times for every 100 words on the page (see the density-check sketch after these steps).
3. If you are targeting a keyphrase (a multiple-word keyword) reduce the number of times that your page has the target keyphrase in the exact order you are targeting. Mix up the order. For example, if you are targeting "Florida web designer" change this text on your site to "web site designer in florida" and "florida-based web site design services."
It is important to note that these 'de-optimization' steps should only be taken if you know that you have been affected by the Google OOP.
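As a rough check for step 2 above, here is a small sketch (my own illustration; it assumes you have already extracted the visible text of the page into a plain string) that counts how often a single keyword appears per 100 words:

```python
import re

def keyword_density(page_text, keyword):
    """Occurrences of a single-word keyword per 100 words of visible text.

    A crude check against the 'under 5 per 100 words' guideline above;
    it ignores HTML, titles, and alt text, and does not handle phrases.
    """
    words = re.findall(r"[a-z0-9']+", page_text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return 100.0 * hits / len(words)

sample = ("Widgets for sale. Our widgets are the best widgets in Florida. "
          "Buy widgets online today.")
density = keyword_density(sample, "widgets")
print(f"{density:.1f} occurrences per 100 words")
if density >= 5:
    print("Above the 5-per-100-words guideline; consider trimming.")
```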
Why did Google do this? There are two possible answers. First, it is possible that Google has simply made an honest (yet very poor) attempt at removing many of the low-quality web sites in their results that had little quality content and received their positions from link farms and spamdexing. The evidence and the search engine results point to another potential answer.
A second theory, which has gained credence in the past days within the industry, is that in preparation for its Initial Public Offering (possibly this Spring), Google has developed a way to increase its revenue. How? By removing many of the sites that are optimized for the search engines on major commercial search terms, thereby increasing the use of its AdWords paid search results (cost-per-click) system. Is this the case? Maybe, maybe not.
Perhaps both of these reasons came into play. Perhaps Google execs thought they could
1) improve the quality of their rankings,
2) remove many of the 'spammy' low-quality sites,
3) because of #2, increase AdWords revenues, and
4) because of better results and more revenue, have a better chance at a successful IPO.
Sadly for Google, this plan had a fatal flaw.
What Google Should Do
While there are positives that have come from this OOP filter, the filter needs to be adjusted. Here is what Google should do:
1. Post a communiqué on its web site explaining in as much detail as they are able what they have done and what they are doing to fix it;
2. Reduce the weight of OOP;
3. If the OOP is indeed a static penalty that can only be removed by a human, change it to a dynamic penalty that is analyzed and assessed with each major update; and
4. Establish an appeal process through which site owners who feel they are following all the rules and have quality content can have a human (or enlightened spider) review their site and remove the OOP if appropriate.
When this recent update broke on November 15, webmasters flocked by the thousands to industry forums such as webmasterworld.com. The update was quickly dubbed the "Florida Update 2003," and the initial common wisdom was that Google had made a serious mistake that would be fixed within 3-4 days, and that everyone should just stay put and wait for Google to 'fix itself.' While the rankings are still dancing, this fix has yet to come. High quality sites with lots of good content that have done everything right are being severely penalized.
If Google does not act quickly, it will soon lose market share and its reputation as the provider of the best search results. With Yahoo's recent acquisition of Inktomi, Alltheweb/FAST, and AltaVista, it will most likely soon end its deal to serve Google results and may, in the process, create the future "best search engine on the 'net." Google, for now, has gone bananas in its recent merengue, and it may soon be spoiled rotten.
Or deep links, perhaps, since .edu and .gov links don't usually go to commercial sites. Just musing, but I doubt many (a few pros here excluded) take the effort to get deep links for the sake of PR.
Perhaps I don't understand what an authority site is. I would assume authority to mean the most relevant for a particular KW.
No, authoritative != relevance. An analogy that comes to mind: being very authoritative means having a Ph.D.; relevance would be what that Ph.D. is for. If, for example, I have a Ph.D. in economics, people won't think I am credible discussing nuclear particle physics just because of that. High PR plus anchor text could merge authority and relevance for ranking. My page's high PR suggests I am knowledgeable. The anchor text other people use to link shows what they think I know a lot about. People likely won't link to an economist's web site with the anchor text of "neutrinos" much. However, my high PR combined with anchor text of "money supply" is more likely.
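To make that merging of authority and relevance concrete, here is a toy sketch (my own guess at the general shape of the idea, not Google's actual formula): a page's score for a query combines its PageRank with the share of inbound anchor texts mentioning the query.

```python
def toy_score(pagerank, inbound_anchors, query):
    """Toy ranking score: authority (PageRank) times relevance
    (share of inbound anchor texts that mention the query).

    Purely illustrative of the Ph.D. analogy above; the real
    weighting, if any, is unknown.
    """
    if not inbound_anchors:
        return 0.0
    matches = sum(1 for text in inbound_anchors if query.lower() in text.lower())
    return pagerank * (matches / len(inbound_anchors))

economist_anchors = ["money supply", "money supply explained", "economics blog"]
print(toy_score(7, economist_anchors, "money supply"))  # high PR, on-topic anchors
print(toy_score(7, economist_anchors, "neutrinos"))     # high PR, no matching anchors -> 0.0
```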
with reference to my message #69 - can anybody work out what the threshold might be?
There can be no threshold: someone earlier this week used technobabble ;) to say that 'non-linear feedback' was a very bad idea. What the poster was trying to say was that a sensible feedback mechanism can't have a cut-off point, it has to be a sliding scale. (or at least a continuously differentiable function)
In other words, the idea that when, for example, 'keyword density reaches x percent, you're stuffed' doesn't make any sense at all.
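To illustrate the difference between the two ideas, here is a toy sketch (hypothetical numbers and curve shape of my own; nothing here is Google's published behaviour) contrasting a hard cut-off with a smooth, sliding-scale damping of a keyword-density signal:

```python
import math

def hard_cutoff(density, threshold=5.0):
    """Step function: full credit below the threshold, 'stuffed' above it."""
    return 1.0 if density < threshold else 0.0

def sliding_scale(density, midpoint=5.0, steepness=1.5):
    """Smooth (logistic) damping: credit fades gradually as density rises,
    with no single point at which a page suddenly disappears."""
    return 1.0 / (1.0 + math.exp(steepness * (density - midpoint)))

for d in (2, 4, 5, 6, 8):
    print(f"density {d}%: cutoff={hard_cutoff(d):.2f}  smooth={sliding_scale(d):.2f}")
```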
There are many parameters involved in producing a worldwide database. It would be prudent to fiddle with only one parameter at a time. But if Google have a simple maths model of the results somewhere, they may have tested this and tweaked with 2 parameters, even 3.
I'm now fairly convinced that keyword stemming is at the root of the problems. And that Google is broken - but that is a relative term.
Whereas before, typing in Europe as a search term returned results for that exact term, you might now get results involving pages referring to Europe, Europe's, Europes, European, etc.
Does someone know if this means I'm banned?
With my main keywords "key1 key2", I'm #1 for allinanchor and #1 for allintext too, but in the SERPs I'm #390.
Does someone know if I can optimize it?
Did I drop so many positions because all my anchor links are the same? Do I need to email all my friends to change my anchor link text?
A consequence of that is that many sites would quit ranking for their brand name. Pizza Hut must have oodles of anchor text of "Pizza Hut" in inbound links, yet it is still #1. There are just way too many counterexamples to what you suggest above.
An earlier poster used the term non-linear to mean a sudden cut-off point. In other words - "go beyond this parameter and you are finished." This is what I assume he meant.
But this was poor terminology, and extremely unlikely to be correct. In the true sense of the term, non-linear simply refers to a sliding scale whose effects aren't constant.
In other words, the level of penalisation might increase more and more as you approach some arbitrary unacceptable level.
But I will qualify this: I don't believe in penalisation on Google. Google doesn't believe in penalisation (except for the very rare manual penalty).
Instead of penalisation, we should think of a reduction in emphasis in one of the aspects of your site in favour of another.
What we should be trying to figure out is what is now favoured, rather than what has been 'penalised'.
The answers have already been provided as far as I can make out.
I've noticed that some of our sites which have been affected were our newer dynamic ones. The dynamic pages show PR0 using the toolbar, but this of course may not be the value Google uses. Using the visible data, these sites effectively have PR6 (the PR of the home page; all other pages are zero), whereas an older static site we have might have 3000 pages indexed with PR4 or above. This site does really well regardless of the keywords.
There is one thing that bothers me, however. We have other dynamic sites which have PageRank only on the homepage and which do very well. Could it be that Google has allocated PageRank to our dynamic pages on one site and not another? Unfortunately we have to speculate, as the PageRank of dynamic pages is not shown on the Toolbar.
http://www.cs.toronto.edu/~georgem/hilltop/poster.html
I am having trouble fully comprehending that. However, this does seem to indicate a new way of ranking the "authoritativeness" of a site. This latest update seems to be giving more weight to apparently authoritative sites. Note that one of the authors of that paper currently works for Google. This suggests that this idea was one that the Powers That Be at Google liked.
I remember reading here over a year ago that Google might see certain sites as "hubs". It seemed like wishful thinking more than anything else, but I took it to heart and tried to have incoming and outgoing links that would make our site "authoritative". It essentially is at this point.
I like the idea of SiteRank; it's about time (if it's really happening). Maybe that helps to explain why we show up surprisingly high in certain competitive SERPs (like #2 out of 110,000 for our weather page, even though it's quite buried, just PR4, and a very minor page on the site).
1. Either "Home Page Rank " or somthing more complicated ("Site Rank") has been made very important. (I lean towards the simpler version.)
2. On-page factors are much less important.
3. Searches with "-madeupword" added to them are using an algo yet to be adjusted.
Prior to word stemming, there was a set of SERPs for 'red widget', another set for 'red widgets', another set for 'red's widgets', another set for 'red widgeting', and perhaps other variations.
Now that word stemming is implemented, there is only one set of SERPs for all variations. So, whereas there were previously 4 or more #1 pages, now there is only 1...
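A toy illustration of how stemming merges variations (this is a naive suffix-stripper of my own invention, not Google's stemmer): several query variants collapse to one normalized form, and therefore to one set of SERPs.

```python
def naive_stem(word):
    """Very crude suffix stripping, for illustration only."""
    for suffix in ("'s", "ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def normalize_query(query):
    return " ".join(naive_stem(word) for word in query.lower().split())

variants = ["red widget", "red widgets", "red's widgets", "red widgeting"]
print({normalize_query(v) for v in variants})  # all collapse to {'red widget'}
```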
One SERP to rule them all,
One SERP to find them,
One SERP to bring them all and in the darkness bind them.
Very, very funny!
Mods, feel free to remove this useless post, but that's the most hilarious thing I've read here in ages... I had to reply.
Ok, no more non-productive posts from me, promise. Over and out.
ADDED: My serps don't appear to be affected by stemming... Ages ago, I put titles up on different pages that would cover the variations... they're all showing the same as ever with no stemming changes happening
Still, personally, I believe that there are general "new rules" that affect all pages in the same manner, and that the single most important one is "broad match", which seems to be stemming and then some. Also, I believe that this "broad match" technology is still developing, i.e. I think I have seen changes for the better during the last two days, but I don't think it's "finished" yet, if it ever will be.
Trying to get a better understanding of the concept "broad match", I've been playing around a bit with Google Sets as well as the "related:" query and ~the ~synonym ~operator. Using the latter, I actually managed to get NBA.com top ranking for a special kind of "service for website owners" (two-word query). Neither of these tools is "stemming", but you can't really play with stemming in the same way that you can play with these other tools.
The interesting thing here is not that I managed to get an unrelated site ranking high, but that the reasons this unrelated site ranked high were not very far from the reasons that would have made an exactly on-topic site rank high pre-Florida.
The exercise might save you $95, it seems, so I'll leave that one to the reader. You will probably discover that Brett's 26 steps still work; here's a link for free: [webmasterworld.com...]
Which reminds me, front pages shouldn't really be considered the most important pages anyway; Brett also wrote a good piece about that once: [searchengineworld.com...] (...no, I'm not suggesting themes - not yet... it's just about site structure, and money, and such)
Well, because you're still ranked just as high/low as you were before - more or less so, that is. Only, you are no longer ranked like that for the default query type. The default query is simply more "intelligent" now than before. Pre-Florida, all it could do was to look for exact matches - believe it or not, that's pretty stupid, and any old search engine can do that.
Now it looks a little broader, it makes some qualified guesses and assumptions and tries to serve you something relevant based on that. The search engine simply realizes that it is a search engine, and that people using such a thing might not always know the exact definition of the thing they are searching for (if they did so, why would they be searching anyway?)
If there are 100,000 sites optimised for 'widget', and suddenly the KW 'widget' now includes all those sites optimised for 'widgeting' - and there are 100,000 of those - you now have twice as many competitors. But the number of sites in the first hundred is fixed (obviously!).
An extreme example might help (it's just an illustration, it's not intended to be realistic, only illustrative):
Say your keywords are "blue widgets". Imagine that "blue" is not only interpreted as blue, but also, say, "green" and "turquoise". Or "dark blue", "light blue", etc. Then, imagine that "widgets" are also "gadgets" and "whatsits".
So, you have three kinds of blue and three kinds of widgets. That equals nine times as many competing phrase combinations as before. And of course, some phrases lend themselves to more variation than others. Add to this that there are not 100, but only 10 places on page one (and my personal "feeling" that Google aims to provide a broad yet relevant selection on that first very important page). And, after all, Google only shows the top 1,000 anyway.
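A quick way to see the arithmetic (the word lists are just the ones imagined above): each interpretation of "blue" paired with each interpretation of "widgets" is a separate pool of competing pages.

```python
from itertools import product

blues = ["blue", "green", "turquoise"]
widgets = ["widgets", "gadgets", "whatsits"]

combos = [" ".join(pair) for pair in product(blues, widgets)]
print(len(combos), "competing phrase pools instead of 1:")  # 9
print(combos)
```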
The more competition these related terms have, the more it takes to end up in the selection. If your page is very specific, or "literally on topic", it will have to be better than it was before in order to achieve the same ranking. The more competition, the better it has to be. (That's really nothing new; you always had to optimize more than your competitors. Why do you think the "widget" industry is so spammy? All the SEOs are there; they've simply raised the bar.)
So, now we know why sites have dropped. At least, I think that I know. You might not want to believe what some dude writes in some forum, so do some playing around/testing; the recipe is above.
I phrased the question like that on purpose. Time to get some sleep now, it's 4AM here.
/claus
It seems that most people build links where 80% of them say the same thing in the link text, such as "<a>blue widget</a> blah blah blah...".
Another person who gets hurt by this is the purchaser of links: someone who bought, say, 2,000 incoming links from a site, all with the same link text of "blue widget".
The successful sites, as I see it, have more diversity in link text and a higher number of domains linking to them.
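For anyone who wants to check their own backlink profile against that observation, here is a small sketch (my own illustration; the 80% figure above is the poster's impression, not a known threshold) measuring how concentrated the anchor text is and how many distinct domains link in:

```python
from collections import Counter

def anchor_profile(links):
    """links: list of (linking_domain, anchor_text) pairs.

    Returns (share_of_most_common_anchor, distinct_domains) as a rough
    picture of link-text diversity and domain diversity.
    """
    if not links:
        return 0.0, 0
    anchors = Counter(text.lower() for _, text in links)
    top_share = anchors.most_common(1)[0][1] / len(links)
    domains = len({domain for domain, _ in links})
    return top_share, domains

links = [
    ("example-one.com", "blue widget"),
    ("example-two.com", "blue widget"),
    ("example-one.com", "blue widget"),
    ("example-three.com", "great widget resource"),
]
share, domains = anchor_profile(links)
print(f"{share:.0%} of links share one anchor text, across {domains} domains")
```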
The qualified guesses and assumptions have ended up serving completely irrelevant results in many, many cases. I think if this is what Google has attempted to accomplish, they need to revert back to exact matches until they can accomplish their goal of providing relevant broad matches.
C'mon PhDs, put your tests on a development server and really LOOK at the results rather than playing the "let's test it out in real life first" game.
>>The default query is simply more "intelligent" now than before.<<
That’s open for debate ;-)
If that theory is true, then why is it that 404s and splash pages, clearly marked as splash pages, show up, and I mean above 1000, before some pages with more content and rank? And I mean pages that are bookmarked by users and have inbounds from professional sources like .gov and trade publications?
H.T.F. could inbound anchor text be the root of all evil?
Give me a break!
So you know how easy it would be for me to affect ALL my competitors? Dang, all I'd have to do is slap up a few thousand links to them using each and every anchor text they wish to target, and according to some of your theories this would penalize them and they would vanish!
I don't buy the anchor text theory, not even for 10 cents!
As for the "site rank" theory - NOPE dont buy it either seeing too many sites that contain less than a dozen pages with PR of 3s and a HP of PR4 ranking in top 10 for highly competitive terms. I have several PR 8 sites with hundreds of content pages of PR 7 and 6 and Im gone! WHY? Simnple - because I am expected to enter the bidding war and buy adwords!