We have collectively lurched between one conspiracy theory and another - got ourselves into a few disagreements - but essentially found ourselves nowhere!
Theories have involved Adwords (does anyone remember the 'dictionary' concept? Now past history).
A commercial filter, an OOP filter, a problem caused by mistaken duplicate content, theories based on the contents of the Directory (which is a mess), doorway pages (my fault mainly!) etc. etc.
Leading to the absurd concept that you might be forced to de-optimise, in order to optimise.
Which is a form of optimisation in itself.
But early on, someone posted a reference to Occam and his razor.
Perhaps - and this might sound too simple! - Google is experiencing difficulties.
Consider this: if Google is experiencing technical difficulties because of the sheer number of pages to be indexed, then the affected searches will be the ones with many results to sort. And the searches with many results to sort are likely to be commercial ones - because there is so much competition.
So the proposal is this:
There is no commercial filter, there is no Adwords filter - Google is experiencing technical difficulties in a new algo due to the sheer number of pages to be considered in certain areas. On-page factors have suffered, and the result is Florida.
You are all welcome to shoot me down in flames - but at least it is a simple solution.
I have a site that was hammered for its targeted keyword combination. It's still gone for that search, but it's #1 for a search on the topic.
Location Market Widgeter - Gone from SERPs
Location Market Widgetry - #1
The thing is that the page discusses how "Location Market Widgeter" has been doing Widgetry in this Location for that Market for --- years, etc.
So the page really is more relevant for the second set of keywords.
Not only that, but the same page is also top 5 for "Location Widgetry" and top 10 for "Location Market", where previously it never appeared above page 3, if it showed at all.
If your site is 'gone' maybe you should try a topical search to see if it is just somewhere else. (Somewhere that nobody looks)
If you are a lazy 'Joe Surfer' looking at your page, and someone asks "What is that page about" what terms would you use? (Not what terms would you want people to use)
I like the CIRCA theory better than the filter and commercial term theories. At least from what I'm seeing.
Looking for some feedback here. How does CIRCA weigh or take into account anchor text? Does CIRCA use backlinks to aid in theming?
We've only just worked out that it is a strong candidate for main culprit in the post Florida ranking changes. Exactly how it works is a second tier debate.
Brett might even have thrown us a red herring.
If anyone is completely convinced that CIRCA is approximately the answer and has worked out what factors are used in the weighting of context and meaning I would like to buy you a pint.
PS I'm 95% convinced that CIRCA is the culprit. I've still got to investigate my own competitors' success stories to find the answer, and when I do I'm going to buy myself several pints. To get this absolutely in context, by pint I mean 1 imperial pint of an alcoholic brew of malted barley and hops with a specific gravity of approximately 4. Beer, bitter, ale et al.
I have a site that was hammered for its targeted keyword combination. It's still gone for that search, but it's #1 for a search on the topic.
This sounds all too familiar. I have a page - the only one, as far as I could tell - which is very specific to... oh I don't know, let's say: "encouraging nasal hair growth".
It used to come up somewhere on page one for this search term. Now it's a number of pages down for this term, replaced by results which are less relevant. Oddly, though, it now comes up on the first page for "encouraging hair growth", which - arguably - isn't the same thing at all, and those looking for it might find that particular page a little off topic for their needs.
Oh well... off to research and write some content.
I have a very similar situation to what you described (although it absolutely has nothing to do with encouraging nasal hair growth!).
By removing one word, count them 1, my site comes back to the number 2 spot.
I'm looking at a 3 word phrase, 2 of which are nearly always associated together and cannot be separated, the 3rd of which is my country. If I throw in a 4th word like my city I still get no play, but a 5th word brings me back.
Do you get the same when adding 2 words?
Also, if CIRCA is in play in my industry, then it is not doing very well. I would expect to see more related information for the most competitive keywords. By this I mean I would expect to see informative sites that go into the making of the products and services that fall under these token keywords. I.e. if the keyword were TV (not my industry), then I would expect to see information on how TVs are made, the history of the TV, viewing habits, Nielsen ratings, etc. This is not happening for my industry; all I see is a bunch of sites that are .edu, .gov, or .com, that only mention the keywords in the title with some text in the body, and that are only marginally or abstractly on topic. These sites have no real value to the surfer. There are a few sites remaining from pre-Florida that are exact matches and commercial, and I am trying to figure out why they still remain. Additionally, as with a lot of searches, the business.com, Amazon.com, etc. directories are also listed.
I do think that at this point CIRCA weighting is the most likely explanation. However, it only seems to affect certain keyword combinations and does not seem to be producing what I think Google wanted as the end result for certain competitive keyword combinations. Either they are going to pull the plug on this, or we are in for a long ride as they work on getting it right.
Yes, I'm seeing exactly that with my phrase:
Take one word away: back on the first page.
Add two words: back on the first page.
Add one word: not much of an effect.
I think there are lots of factors at work here though.
Once you type in a phrase with lots of words you are narrowing down the query a lot.
I still think Google is trying to ascertain whether the searcher knows what they're looking for or is searching on a rough topic.
If they are searching roughly, Google returns broadmatched results in an attempt to cover all bases and be the search engine that "always gets it right" / "reads my mind" etc.
If Google thinks the searcher is looking for something specific it will return results without broadmatching.
I'm beginning to conclude there isn't much you can do about this if you write for a niche topic (which I do, although, of course it has nothing to do with nasal hair either).
The brutal truth is, most of the searching public don't know how or aren't used to making narrow searches, so the majority of Google queries will always take the form of topic searches.
Where these broad topic searches used to bring up a handful of specific on-topic sites, they now return a wider range of non-commercial, academic, commercial, index, editorial and comparison sites... so perhaps it's no longer possible for a niche site to be optimised for broad queries - only for specific ones which are entered by a far smaller percentage of searchers?
Though that still doesn't explain why taking one word away puts my page back in the top ten...
>If Google thinks the searcher is looking for something specific it will return results without broadmatching.
Definitely true with stemming. It only comes into play when Google thinks the search is specific. For example, for a search on "cat", because that can mean more than felines (#3 is caterpillar.com) it takes it literally. However, for "Manx cat" not only is cats highlighted, but also cattery. Google picks up that you are searching about a specific type of domestic feline, and allows stemming.
encouraging nasal hair growth
a new way to say
widgeting widgettey wideted widgetry?
If so, then
I have a very different result for one site which we shall call:
With the -nonsensestring when that was working, it was first in the SERPs for all 2, 3, and 4 word combinations of the words
encouraging nasal hair growth
Any search without the -nonsensestring made the site completely invisible.
The only way to make it show up with any of those four holiday-related words is to search for
including the dashes.
This site is buying Adwords and has been buying them for quite some time. It is also an Adsense site.
How does that work with this conspiracy theory?
Anyhow this time it appears that most of my competition has been knocked for six! Ho ho ho.
All I have ranking under me are infocommercial sites; none are selling the goods.
I love Florida! Thanks Google.
We've been around the houses
Many are still going around and never learn.
if the keyword were TV (not my industry), then I would expect to see information on how TVs are made, the history of the TV, viewing habits, Nielsen ratings, etc.
The Ontology contains semantic variations of words together with closely linked words and phrases, and it understands that some words have completely different meanings. The paper gives the example of Java: Java can be a programming language, an island or a coffee. So it has rules that say if the word is linked to words such as program, script, code etc. it must mean the programming language, and it can then look for other words and phrases, for which it has rules, that are associated with programming languages.
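To make the Java example concrete, here is a rough sketch of that kind of rule: pick the sense whose cue words overlap most with the surrounding text. This is only an illustration of the idea, not how CIRCA actually works. The cue words for the programming language come from the example above; the cue words for the island and the coffee, and the names `SENSE_CUES` and `guess_sense`, are my own assumptions.

```python
# Hypothetical cue words per sense. Only the programming-language cues
# come from the Java example; the rest are made up for illustration.
SENSE_CUES = {
    "programming language": {"program", "script", "code"},
    "island": {"indonesia", "jakarta", "volcano"},
    "coffee": {"cup", "roast", "bean"},
}


def guess_sense(text, cues=SENSE_CUES):
    """Return the sense whose cue words overlap most with the text, or None."""
    words = set(text.lower().split())
    # Score each sense by how many of its cue words appear in the text.
    scores = {sense: len(words & cue_words) for sense, cue_words in cues.items()}
    best = max(scores, key=scores.get)
    # No cue word present at all: the word stays ambiguous.
    return best if scores[best] > 0 else None
```

With these made-up rules, "java code for a script" resolves to the programming language, while a bare "java" stays ambiguous and returns None, much like the single-word searches discussed in this thread.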
The point is that it understands words and phrases and which ones are most strongly linked, but it does not understand the subject, nor can it make a subjective decision on what is a good page on that subject or a bad one.
The way that I visualise what it turns a search term into is a bit like those ball-and-spring models of molecules, with a small number of closely linked balls (atoms) at the middle and other atoms floating around with weaker bonds both to the nucleus and to each other. It then looks for pages that seem to have those molecules (patterns of words) in them. It decides that a page is the right material if it is made up of molecules that match the model that it has created from the search term. I suspect that ranking is based on links to and from other pages that are made of the same material.
If the page has too much nucleus, i.e. repetition of the exact keywords, it is too dense and does not look like the model that has been extracted from the Ontology. A page that does not contain the exact term searched for may be a better match than one that contains too many instances of it, because the search is now looking to match the whole molecule-like model and not just those keywords.
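For anyone who wants to check how "dense" their own page is, a minimal sketch follows. It measures what fraction of a page's words are taken up by exact repeats of the search phrase. The function name `phrase_density` and the tokenising rule are my own assumptions, and nobody outside Google knows what threshold (if any) counts as too dense.

```python
import re


def phrase_density(text, phrase):
    """Fraction of the page's words taken up by exact repeats of the phrase."""
    # Crude tokeniser: lower-case runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = phrase.lower().split()
    if not words or not target:
        return 0.0
    n = len(target)
    hits, i = 0, 0
    # Count non-overlapping exact occurrences of the phrase.
    while i <= len(words) - n:
        if words[i:i + n] == target:
            hits += 1
            i += n
        else:
            i += 1
    return hits * n / len(words)
```

A page that is nothing but the phrase scores 1.0, and a page that never uses the exact phrase scores 0.0, so comparing your own page against the pages that now outrank it is a quick way to see whether the "too much nucleus" idea holds up in your sector.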
If anyone thinks that I'm barking up the wrong tree please say so. I'm just going to search for barking up the wrong tree to see if it thinks I want pages on lumber extraction or a commonly used English expression. Well it got that one right in the organic results but Adwords is another matter ;)
Wrote in a hurry, apologies for any grammatical and punctuation errors.
If the page has too much nucleus, i.e. repetition of the exact keywords, it is too dense and does not look like the model that has been extracted from the Ontology
Hi Sid, I have a site which now holds the top spot for a 3-word phrase similar to blue widget companies where it previously was at the top for the singular version blue widget company.
Both the word company and the word companies appear exactly twice on the page (so I don't think density is an issue here).
The singular word additionally appears in the header tags (title, description and keywords).
What do you make of it?
BTW...did you get my last sticky mail?
I totally agree and that is why I think that *real* Pagerank is more important than ever. I understand real Pagerank to be what Google sees internally not that which it shows on the toolbar. Real Pagerank may actually be made up of contextual components that only count for particular searches. This would explain why sites with low *toolbar* PR beat those with high *toolbar* PR. The site with low PR is deemed to have better contextual rank.
So, I think backlinks are more important than ever - but they have to be (or appear to be) natural. I don't think reciprocal links are being discounted per se, but if you do reciprocal links they are going to have to be to and from (broadly) relevant pages.
The assumption that a search engine must make about any page it finds is that the page contains mostly nonsense and is of little value - until *reliable* evidence suggests otherwise.
Algos don't "understand" anything. An algo is just a set of rules and instructions. Errors in algos are very common and so is faulty data. If you use the word "understand" you imply that an intelligence is at work but that is not the case.
So, I think backlinks are more important than ever - but they have to be (or appear to be) natural
I have a lot of control over 400+ PR6 backlinks from related site(s). Anchor text is based on keywords - it was, after all, done pre-Florida.
What do folks think: as an experiment, should I try out Merlin's sensible hypothesis and make all the links point vaguely rather than specifically towards my site? (I've even considered simply using the thesaurus in MS Word!)
A good idea and valuable experiment, or risky idea - what's the consensus? (I guess it can't hurt my PR, which is doing me little good anyway)
Seriously... I think Merlin is wrong about anchor text needing natural language etc. It is more about the anchor text helping Google to define the meaning/concept of the page for CIRCA to be applied to, if that makes sense.
That would certainly be a useful experiment and something all the other webmasters would love you to report on, but first consider whether or not you could be hurting your site and/or business.
As soon as -in settles down I plan on inverting the singular and plural nouns in the example I gave earlier to see how this affects the results but first I want to make sure that things are stable so I can be sure that any eventual changes in SERPs weren't already in the making.
Regarding PR and anchor links I believe 2 things:
I would also guess (but this is pure speculation) that keywords in important tags of the linking page also contribute to better SERPs, so if one of your 400 pages has a title tag like "green monsters" and links to another page with a title tag of "the history of monsters", your site will appear higher.
I was interested in the Cat and Manx Cat searches. To recap, Google doesn't know what a Cat is, so it returns sites about domestic cats, cat scans, caterpillar tyres etc. Google for Manx Cat shows 'stemmed' words, e.g. cat, cats, cattery etc.
Because it's showing stemmed words, my thoughts are that Google has learnt that a Manx cat is a kind of cat (domestic pussy), therefore it highlights the words related to Cat, the domestic type.
Let's say Google the machine has learnt that Christmas Gift is a specific type of gift. Is it returning Christmas Gifts and Gift sites because of this? Is it showing off by showing stemmed forms of Gift, i.e. highlighting gift and gifts, because it knows that the search you did means give me a specific type of Gift?
Keyboard Gift does not return broad/stemmed matches. Google doesn't yet know that a Keyboard Gift is a type of Gift. So it doesn't highlight Gift and Gifts. Why doesn't it know? I don't know ...
thanks - it is indeed a risky tactic - particularly if Google does an about-turn*. Much appreciated. I'll hold fire.
A really interesting post. As you know, Florida was really in two parts: Florida itself, and then later what the dispossessed call the Florida Massacre. If you're correct, then as Google learns more and more words, the effects of the massacre arguably will become even more widespread, and perhaps spread into non-commercial SERPs. Sid has already pointed out that its 'vocabulary' seems to be limited to US English.
There may be some changes of heart, from those who previously admired Florida, if the strange algo spreads further. That will be interesting to observe ;)
* an about-turn is unlikely - but a softening is possible, even necessary!
I agree with you about a link helping to define the meaning of a page (indeed, I think this is THE most central issue). What I meant by natural link text was having links with a variety of language, not 1000 links with a repeated string. That still allows for any individual link to have a staccato keyword-type phrase as its anchor text; I'm just looking for lots of phrase variations between the links. On a very large scale this is hard to synthesise, as there are some ways to describe your page that even you may not have thought of.
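If you control a lot of your own backlinks and want a rough idea of how varied the anchor text is, something like the sketch below would do. It reports what share of the links use the single most common anchor, plus how many distinct anchors there are. The name `anchor_variety` is made up for this post, and what counts as "natural" variety is anybody's guess; this only measures repetition.

```python
from collections import Counter


def anchor_variety(anchors):
    """Return (share of links using the most common anchor, number of distinct anchors)."""
    # Normalise case and whitespace so "Blue  Widgets" and "blue widgets" match.
    normalised = [" ".join(a.lower().split()) for a in anchors if a.strip()]
    if not normalised:
        return 0.0, 0
    counts = Counter(normalised)
    top_share = counts.most_common(1)[0][1] / len(normalised)
    return top_share, len(counts)
```

A thousand links with one repeated string would come back as a share of 1.0 with a single distinct anchor, which is the pattern I would expect to look synthetic.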
I'm just wondering if it's more fruitful to have the one keyword in the anchor text to represent or express the overall meaning/concept of the page, with its tokens scattered within the page to support it.
I have also noticed that if a keyword is in the title with other 'tokens', as well as just the one in the anchor, it goes a long way with Google.
There may be some changes of heart, from those who previously admired Florida, if the strange algo spreads further. That will be interesting to see ;)
We were hit pretty hard and are now bouncing around the datacenters like Mexican jumping beans. I hope it ends soon and no more sites/companies are hurt by this.
You may remember that my widget in the UK is the brand for a different sort of thing altogether in the US. As a result Google is associating my widget with the wrong thing and is ranking what are IMHO the wrong sites at the top of SERPs so I've decided to attack it head on.
Yesterday I did a search using Google's Adwords suggestion tool. I took the top few suggestions for terms associated with one of the words that I'm targeting in the US English list, and I've made a quick and dirty web site devoted to this with a smattering of the UK English generic meanings in the site.
I've linked to a few very high ranking pages on the subject and to the new page I've made on my main site, also on the American gist of the subject.
I've put a link to it on a couple of pages that I control that are indexed by Google and have submitted these for good measure. I'm now going to try and get links on a couple of other pages and I'll report back on the results if anything develops.
If Florida was a result of the fight against spam, it failed. The results are not always better on ATW, but I see fewer duplicate results and fewer pure spam sites. If the public knew about ATW (and ATW could handle the traffic) right now Google would be in big, VERY BIG, trouble.
FWIW I think that the motivation is more financial than altruistic and probably more directly financial than many of us imagine. If you look at many (all?) of the new money making concepts that Google is introducing they all rely on CIRCA technology. For example CIRCA targets ads at pages in DomainPark, and Adsense.
If the main Google search engine uses an entirely different technology to create results, then it is less likely to throw up pages with properly targeted ads on them than if it used the same technology. Properly targeted ads get a much higher click-through rate (I seem to remember reading on a page about Adsense).
This is just the start of what Google has planned.
Maybe there'll be a new player in town by 2005. I seem to recall a discussion about an IBM search engine but I think it was supposed to be using similar semantic technology. Certainly, if I were Google, I'd be more scared of IBM than Microsoft. IBM is the real deal when it comes to innovation.