
Google News Archive Forum

This 260 message thread spans 9 pages; this is page 7.
Google's Florida Update - a fresh look
We've been around the houses - why not technical difficulties?
superscript




msg:212181
 10:20 pm on Dec 12, 2003 (gmt 0)

For the past four or five weeks, some of the greatest (and leastest) Internet minds (I include myself in the latter) have been trying to figure out what has been going on with Google.

We have collectively lurched between one conspiracy theory and another - got ourselves into a few disagreements - but essentially found ourselves nowhere!

Theories have involved Adwords (does anyone remember the 'dictionary' concept - now past history.)

And Froogle...

A commercial filter, an OOP filter, a problem caused by mistaken duplicate content, theories based on the contents of the Directory (which is a mess), doorway pages (my fault mainly!) etc. etc.

Leading to the absurd concept that you might be forced to de-optimise, in order to optimise.

Which is a form of optimisation in itself.

But early on, someone posted a reference to Occam and his razor.

Perhaps - and this might sound too simple! - Google is experiencing difficulties.

Consider this: if Google is experiencing technical difficulties with the sheer number of pages to be indexed, then the affected searches will be the ones with many pages to sort. And the searches with many pages to sort are likely to be commercial ones - because there is so much competition.

So the proposal is this:

There is no commercial filter, there is no Adwords filter - Google is experiencing technical difficulties in a new algo due to the sheer number of pages to be considered in certain areas. On-page factors have suffered, and the result is Florida.

You are all welcome to shoot me down in flames - but at least it is a simple solution.


 

too much information




msg:212361
 8:46 pm on Dec 17, 2003 (gmt 0)

Here's the problem with the whole conspiracy theory thing.

I have a site that was hammered for its targeted keyword combination. It's still gone for that search, but it's #1 for a search on the topic.

Location Market Widgeter - Gone from SERPs
Location Market Widgetry - #1

The thing is that the page discusses how "Location Market Widgeter" has been doing Widgetry in this Location for that Market for --- years, etc.

So the page really is more relevant for the second set of keywords.

Not only that but the same page is also top 5 for "Location Widgetry" and top 10 for "Location Market" where it never appeared above page 3 for these if it showed at all.

If your site is 'gone' maybe you should try a topical search to see if it is just somewhere else. (Somewhere that nobody looks)

If you are a lazy 'Joe Surfer' looking at your page, and someone asks "What is that page about" what terms would you use? (Not what terms would you want people to use)

I like the CIRCA theory better than the filter and commercial term theories. At least from what I'm seeing.

Kirby




msg:212362
 9:52 pm on Dec 17, 2003 (gmt 0)

Looking for some feedback here. How does CIRCA weigh or take into account anchor text? Does CIRCA use backlinks to aid in theming?

Hissingsid




msg:212363
 10:18 pm on Dec 17, 2003 (gmt 0)

Looking for some feedback here. How does CIRCA weigh or take into account anchor text? Does CIRCA use backlinks to aid in theming?

We've only just worked out that it is a strong candidate for main culprit in the post Florida ranking changes. Exactly how it works is a second tier debate.

Brett might even have thrown us a red herring.

If anyone is completely convinced that CIRCA is approximately the answer and has worked out what factors are used in the weighting of context and meaning I would like to buy you a pint.

Best wishes

Sid

PS I'm 95% convinced that CIRCA is the culprit, still got to investigate my own competitor success stories to find the answer and when I do I'm going to buy myself several pints. To get this absolutely in context, by pint I mean 1 imperial pint of an alcoholic brew of malted barley and hops with a specific gravity of approximately 4. Beer, bitter, ale et al.

ronin




msg:212364
 10:37 pm on Dec 17, 2003 (gmt 0)

I have a site that was hammered for its targeted keyword combination. It's still gone for that search, but it's #1 for a search on the topic.

This sounds all too familiar. I have a page - the only one, as far as I could tell - which is very specific to... oh I don't know, let's say: "encouraging nasal hair growth".

It used to come up somewhere on page one for this search term. Now it's a number of pages down for this term, replaced by SERPS which are less relevant, but, oddly, it now comes up on the first page for "encouraging hair growth" which - arguably - isn't the same thing at all and those looking for it might find that particular page a little off topic for their needs.

Oh well... off to research and write some content.

Philosopher




msg:212365
 11:03 pm on Dec 17, 2003 (gmt 0)

To get this absolutely in context

Too funny...

Bobby




msg:212366
 11:25 pm on Dec 17, 2003 (gmt 0)

Ronin,

I have a very similar situation to what you described (although it absolutely has nothing to do with encouraging nasal hair growth!).

By removing one word, count them 1, my site comes back to the number 2 spot.

I'm looking at a 3 word phrase, 2 of which are nearly always associated together and cannot be separated, the 3rd of which is my country. If I throw in a 4th word like my city I still get no play, but a 5th word brings me back.

Do you get the same when adding 2 words?

newwebster




msg:212367
 1:45 am on Dec 18, 2003 (gmt 0)

I have a 3 word keyword phrase that I used to be ranked #1 for, of which the last word in the phrase is in present tense: i.e. "ing" at the end. Now I no longer appear anywhere for this phrase; however, just changing the last keyword from present tense to past, i.e. "ed", puts me back at #1. Of course the past tense version does not get searched very much. Is stemming to blame for this? Or is it that CIRCA cannot handle the stemming?

Also, if CIRCA is in play in my industry, then it is not doing very well. I would expect to see more related information regarding the most competitive keywords. By this, I mean I would expect to see informative sites that go into making the products and services that would fall under these token keywords. I.e. if the keyword were TV (not my industry), then I would expect to see information on how TVs are made, history of the TV, viewing habits, Nielsen ratings, etc. This is not happening for my industry; all I see is a bunch of .edu and .gov sites, plus several .com sites, that only mention the keywords in the title with some text in the body and are only marginally or abstractly on topic. These sites have no real value to the surfer. There are a few sites remaining from pre-Florida that are exact matches which are commercial and I am trying to figure out why they still remain. Additionally, as in a lot of searches, the business.com, Amazon.com, etc. directories are also listed.

I do think that at this point CIRCA weighting is the most likely explanation. However, it only seems to affect certain keyword combinations and does not seem to be producing what I think Google wanted as the end result for certain competitive keyword combinations. Either they are going to pull the plug on this, or we are in for a long ride as they work on getting it right.

ronin




msg:212368
 3:44 am on Dec 18, 2003 (gmt 0)

Bobby,

Yes, I'm seeing exactly that with my phrase:

Take one word away: back on the first page.
Add two words: back on the first page.

Add one word: not much of an effect.

I think there are lots of factors at work here though.
Once you type in a phrase with lots of words you are narrowing down the query a lot.

I still think Google is trying to ascertain whether the searcher knows what they're looking for or is searching on a rough topic.

If they are searching roughly, Google returns broadmatched results in an attempt to cover all bases and be the search engine that "always gets it right" / "reads my mind" etc.

If Google thinks the searcher is looking for something specific it will return results without broadmatching.

I'm beginning to conclude there isn't much you can do about this if you write for a niche topic (which I do, although, of course it has nothing to do with nasal hair either).

The brutal truth is, most of the searching public don't know how or aren't used to making narrow searches, so the majority of Google queries will always take the form of topic searches.

Where these broad topic searches used to bring up a handful of specific on-topic sites, they now return a wider range of non-commercial, academic, commercial, index, editorial and comparison sites... so perhaps it's no longer possible for a niche site to be optimised for broad queries - only for specific ones which are entered by a far smaller percentage of searchers?

Though that still doesn't explain why taking one word away puts my page back in the top ten...
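The take-one-word-away / add-one-word / add-two-words comparison described above can be made systematic. The following is only a sketch for generating the query variants to check by hand; the function name `query_variants` and the extra words "tips" and "guide" are hypothetical, and nothing here queries Google or reflects how it treats these phrases.

```python
from itertools import combinations


def query_variants(phrase, extra_words=()):
    """Return labelled variants of a search phrase for manual rank comparisons."""
    words = phrase.split()
    variants = {"original": [phrase]}
    # Drop exactly one word at a time, keeping the remaining words in order.
    variants["minus_one"] = [
        " ".join(words[:i] + words[i + 1:]) for i in range(len(words))
    ]
    # Append one of the (hypothetical) extra words.
    variants["plus_one"] = [f"{phrase} {w}" for w in extra_words]
    # Append every pair of extra words.
    variants["plus_two"] = [
        f"{phrase} {a} {b}" for a, b in combinations(extra_words, 2)
    ]
    return variants


v = query_variants("encouraging nasal hair growth", extra_words=("tips", "guide"))
for label, queries in v.items():
    print(label, queries)
```

Recording the position of the same page for each variant, on the same day and datacenter, would make it easier to compare notes between posters.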

rfgdxm1




msg:212369
 3:55 am on Dec 18, 2003 (gmt 0)

>If they are searching roughly, Google returns broadmatched results in an attempt to cover all bases and be the search engine that "always gets it right" / "reads my mind" etc.

>If Google thinks the searcher is looking for something specific it will return results without broadmatching.

Definitely true with stemming. It only comes into play when Google thinks the search is specific. For example, for a search on "cat", because that can mean more than felines (#3 is caterpillar.com) it takes it literally. However, for "Manx cat" not only is cats highlighted, but also cattery. Google picks up that you are searching about a specific type of domestic feline, and allows stemming.
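As a toy illustration of the behaviour described above (literal matching for "cat", stemmed highlighting for "Manx cat"), here is a minimal sketch. It assumes that "specific" simply means a query of more than one word and that stemming is a crude prefix match; both are assumptions made for illustration, not a description of what Google actually does.

```python
def highlighted_terms(query, page_words):
    """Return the page words a toy engine would highlight for a query."""
    terms = [t.lower() for t in query.split()]
    # Assumption: a multi-word query counts as "specific"; the real trigger is unknown.
    allow_stemming = len(terms) > 1
    hits = []
    for word in page_words:
        w = word.lower()
        for t in terms:
            # Exact match always counts; prefix match only when stemming is allowed.
            if w == t or (allow_stemming and w.startswith(t)):
                hits.append(word)
                break
    return hits


page = ["Manx", "cat", "cats", "cattery", "caterpillar", "dog"]
print(highlighted_terms("cat", page))
print(highlighted_terms("Manx cat", page))
```

Note that the prefix rule also picks up "caterpillar" for the two-word query, which shows why a real stemmer needs to be more careful than this sketch.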

fastterm




msg:212370
 4:29 am on Dec 18, 2003 (gmt 0)

Hello,
I'm new here so cut me some slack please. The sites I manage got hit hard in this Florida Google thing. I mean from page 1, position 4 to 8, for most of my keywords to can't be found. Any ideas on how to come back? I am starting to see some other sites that were dropped coming back, but the sites I manage are still gone.

quotations




msg:212371
 4:47 am on Dec 18, 2003 (gmt 0)

Is

encouraging nasal hair growth

a new way to say

widgeting widgettey wideted widgetry?

If so, then

I have a very different result for one site which we shall call:

www.encouraging-nasal-hair-growth.com

With the -nonsensestring when that was working, it was first in the SERPs for all 2, 3, and 4 word combinations of the words

encouraging nasal hair growth

Any search without the -nonsensestring made the site completely invisible.

The only way to make it show up with any of those four holiday-related words is to search for

encouraging-nasal-hair-growth

including the dashes.

This site is buying Adwords and has been buying them for quite some time. It is also an Adsense site.

How does that work with this conspiracy theory?

Essex_boy




msg:212372
 8:51 am on Dec 18, 2003 (gmt 0)

Well now, in May this year one of my sites was knocked for six right out of Google, did I swear?

Anyhow this time it appears that most of my competition has been knocked for six! Ho ho ho.

All I have ranking under me are infocommercial sites; none are selling the goods.

I love Florida! Thanks Google.

Excel




msg:212373
 9:06 am on Dec 18, 2003 (gmt 0)

We've been around the houses

Many are still going around and never learn.

Hissingsid




msg:212374
 9:09 am on Dec 18, 2003 (gmt 0)

if the keyword were TV (not my industry), then I would expect to see information on how TVs are made, history of the TV, viewing habits, Nielsen ratings, etc.

Hi Newwebster,

The Ontology contains semantic variations of words together with closely linked words and phrases, and it understands that some words have completely different meanings. The paper gives the example of Java. Java can be a programming language, an island or a coffee. So it has rules that say if it is linked to these words (program, script, code etc.) it must mean the programming language, and it can then look for other words and phrases, for which it has rules, which are associated with programming languages.
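The Java example can be sketched as a tiny rule table. This is only an illustration of the idea: the cue words for the programming language are the ones named in the post, while the cue words for the island and the coffee, and the names `SENSE_CUES` and `guess_sense`, are made up for the example and are not taken from the paper.

```python
SENSE_CUES = {
    "programming language": {"program", "script", "code"},  # cues named in the post
    "island": {"indonesia", "jakarta", "volcano"},           # assumed cues
    "coffee": {"bean", "roast", "espresso"},                 # assumed cues
}


def guess_sense(text, cues=SENSE_CUES):
    """Pick the sense whose cue words overlap most with the text, or None."""
    words = set(text.lower().split())
    # Score each sense by how many of its cue words appear in the text.
    scores = {sense: len(words & cue_words) for sense, cue_words in cues.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(guess_sense("java code to run a script"))
print(guess_sense("java roast espresso"))
print(guess_sense("java"))
```

With no cue words present the sketch declines to guess, which is roughly the situation of a one-word search such as "cat".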

The point is that it understands words and phrases and which ones are most strongly linked, but it does not understand the subject, nor can it make a subjective decision on what is a good page on that subject or a bad one.

The way that I visualise what it turns a search term into is a bit like those ball and spring models of molecules, with a small number of closely linked balls (atoms) at the middle and other atoms floating around with weaker bonds both to the nucleus and to each other. It then looks for pages that seem to have those molecules (patterns of words) in them. It decides that a page is the right material if it is made up of molecules that match the model that it has created from the search term. I suspect that ranking is based on links to and from other pages that are made of the same material.

If the page has too much nucleus, i.e. repetition of the exact keywords, it is too dense and does not look like the model that has been extracted from the Ontology. A page that does not contain the exact term searched for may be a better match than one that contains too many instances of it, because the search is now looking to match the whole molecule-like model and not just those keywords.
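To make the molecule picture concrete, here is a rough sketch of a score that rewards covering the whole model and marks a page down when the exact keywords are too dense. Everything in it is an assumption for illustration: the weights (1.0 for nucleus terms, 0.5 for related terms), the 20% density threshold, the penalty formula and the name `match_score` are invented, and none of it is known to be how CIRCA or Google scores pages.

```python
def match_score(page_text, nucleus, related, max_density=0.2):
    """Toy score: coverage of the term model, scaled down for keyword stuffing."""
    words = page_text.lower().split()
    if not words:
        return 0.0
    present = set(words)
    # Coverage: nucleus terms count 1.0 each, related terms 0.5 each (assumed weights).
    coverage = sum(1.0 for t in nucleus if t in present)
    coverage += sum(0.5 for t in related if t in present)
    # Density: fraction of all words on the page that are exact nucleus terms.
    density = sum(1 for w in words if w in nucleus) / len(words)
    # Assumed penalty: scale the score down once density passes the threshold.
    penalty = max_density / density if density > max_density else 1.0
    return coverage * penalty


nucleus = {"blue", "widget"}
related = {"gadget", "factory", "price"}
stuffed = "blue widget blue widget blue widget blue widget"
natural = "our factory sets a fair price for every blue gadget"
print(match_score(stuffed, nucleus, related))
print(match_score(natural, nucleus, related))
```

In this toy setup the second page never uses the exact phrase yet scores higher than the stuffed one, which is the effect described above.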

If anyone thinks that I'm barking up the wrong tree please say so. I'm just going to search for barking up the wrong tree to see if it thinks I want pages on lumber extraction or a commonly used English expression. Well it got that one right in the organic results but Adwords is another matter ;)

Best wishes

Sid

Wrote in a hurry, apologies for any grammatical and punctuation errors.

Bobby




msg:212375
 9:28 am on Dec 18, 2003 (gmt 0)

If the page has too much nucleus, i.e. repetition of the exact keywords, it is too dense and does not look like the model that has been extracted from the Ontology

Hi Sid, I have a site which now holds the top spot for a 3-word phrase similar to blue widget companies where it previously was at the top for the singular version blue widget company.

Both the word company and the word companies appear exactly twice on the page (so I don't think density is an issue here)
HOWEVER
the singular word additionally appears in the header tags (title, description and keywords).

What do you make of it?

BTW...did you get my last sticky mail?

merlin30




msg:212376
 9:31 am on Dec 18, 2003 (gmt 0)

"The point is that it understands words and phrases and which ones are most strongly linked, but it does not understand the subject, nor can it make a subjective decision on what is a good page on that subject or a bad one."

I totally agree and that is why I think that *real* Pagerank is more important than ever. I understand real Pagerank to be what Google sees internally not that which it shows on the toolbar. Real Pagerank may actually be made up of contextual components that only count for particular searches. This would explain why sites with low *toolbar* PR beat those with high *toolbar* PR. The site with low PR is deemed to have better contextual rank.

So, I think backlinks are more important than ever - but they have to be (or appear to be) natural. I don't think reciprocal links are being discounted per se, but if you do reciprocal links they are going to have to be to and from (broadly) relevant pages.

The assumption that a search engine must make about any page it finds is that the page contains mostly nonsense and is of little value - until *reliable* evidence suggests otherwise.

kaled




msg:212377
 12:33 pm on Dec 18, 2003 (gmt 0)

I said months ago that I thought Google was moving towards page analysis and using that with backlinks rather than just anchor text - nobody believed me.

Algos don't "understand" anything. An algo is just a set of rules and instructions. Errors in algos are very common and so is faulty data. If you use the word "understand" you imply that an intelligence is at work but that is not the case.

Kaled.

superscript




msg:212378
 12:53 pm on Dec 18, 2003 (gmt 0)

So, I think backlinks are more important than ever - but they have to be (or appear to be) natural

I have a lot of control over 400+ PR6 backlinks from related site(s). Anchor text is based on keywords - it was, after all, done pre-Florida.

What do folks think, as an experiment should I try out Merlin's sensible hypothesis, and make all the links point vaguely rather than specifically towards my site (I've even considered simply using the thesaurus in MS Word!)

A good idea and valuable experiment, or risky idea - what's the consensus? (I guess it can't hurt my PR, which is doing me little good anyway)

tantalus




msg:212379
 12:57 pm on Dec 18, 2003 (gmt 0)

Go for it... But to be on the safe side I'll sticky you the URL that they should point to, if that's OK :)

Seriously...I think Merlin is wrong about anchor text needing natural language etc but it is more about it helping Google to define the meaning/concept of the page for CIRCA to be applied to, if that makes sense.

Bobby




msg:212380
 1:19 pm on Dec 18, 2003 (gmt 0)

Superscript,

That would certainly be a useful experiment and something all the other webmasters would love you to report on, but first consider whether or not you could be hurting your site and/or business.

As soon as -in settles down I plan on inverting the singular and plural nouns in the example I gave earlier to see how this affects the results but first I want to make sure that things are stable so I can be sure that any eventual changes in SERPs weren't already in the making.

Regarding PR and anchor links I believe 2 things:

  • Number of links directly affects PR
  • Words in and around the links contribute to better SERPs

    I would also guess (but this is pure speculation) that keywords in important tags of the linking page also contribute to better SERPs, so if one of your 400 pages has a title tag like "green monsters" and links to another page with a title tag of "the history of monsters" your site will appear higher.

    HocusPocus




    msg:212381
     2:11 pm on Dec 18, 2003 (gmt 0)

    Been lurking for too long and read far too many whinging posts. Time to contribute, or fog the issue, depending on your viewpoint.

    I was interested in the Cat and Manx Cat searches. To recap, Google doesn't know what a Cat is, so it returns sites about domestic cats, cat scans, caterpillar tyres etc. A Google search for Manx Cat shows 'stemmed' words, e.g. cat, cats, cattery etc.

    Because it's showing stemmed words, my thoughts are that Google has learnt that a Manx cat is a kind of cat (domestic pussy), therefore it highlights the words related to Cat, the domestic type.

    Let's say Google the machine has learnt that Christmas Gift is a specific type of gift. Is it returning Christmas Gifts and Gift sites because of this? Is it showing off by showing stemmed forms of Gift, i.e. highlighting gift and gifts, because it knows that the search you did means "give me a specific type of Gift"?

    Keyboard Gift does not return broad/stemmed matches. Google doesn't yet know that a Keyboard Gift is a type of Gift. So it doesn't highlight Gift and Gifts. Why doesn't it know? I don't know ...

    superscript




    msg:212382
     2:25 pm on Dec 18, 2003 (gmt 0)

    Bobby,

    thanks - it is indeed a risky tactic - particularly if Google does an about-turn*. Much appreciated. I'll hold fire.

    HocusPocus,

    A really interesting post. As you know, Florida was really in two parts: Florida itself, and then later what the dispossessed call the Florida Massacre. If you're correct, then as Google learns more and more words, the effects of the massacre arguably will become even more widespread, and perhaps spread into non-commercial SERPs. Sid has already pointed out that its 'vocabulary' seems to be limited to US English.

    There may be some changes of heart, from those who previously admired Florida, if the strange algo spreads further. That will be interesting to observe ;)

    Edit:
    * an about-turn is unlikely - but a softening is possible, even necessary!

    merlin30




    msg:212383
     2:55 pm on Dec 18, 2003 (gmt 0)

    Hi Tantalus,

    I agree with you about a link helping to define the meaning of a page (indeed, I think this is THE most central issue). What I meant by natural link text was having links with a variety of language, not 1000 links with a repeated string. That still allows for any individual link to have a staccato keyword type phrase as its anchor text; I'm just looking for lots of phrase variations between the links. On a very large scale this is hard to synthesise as there are some ways to describe your page that even you may not have thought of.
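One way to put a number on "a variety of language, not 1000 links with a repeated string" is to measure how repetitive a set of anchor texts is. The sketch below is just an illustrative measure; the function name `anchor_variety` and the two ratios are hypothetical, and there is no claim that Google computes anything like this.

```python
from collections import Counter


def anchor_variety(anchors):
    """Summarise how varied a list of anchor texts is (case-insensitive)."""
    counts = Counter(a.strip().lower() for a in anchors)
    total = sum(counts.values())
    if total == 0:
        return {"distinct_ratio": 0.0, "top_share": 0.0}
    top_count = counts.most_common(1)[0][1]
    return {
        # Share of links that use a phrase no other link repeats exactly.
        "distinct_ratio": len(counts) / total,
        # Share of all links taken by the single most common phrase.
        "top_share": top_count / total,
    }


print(anchor_variety(["blue widgets"] * 4))
print(anchor_variety(["blue widgets", "Blue Widgets", "widget shop", "cheap widgets"]))
```

A set of links built by hand with one repeated keyword string would show a `top_share` close to 1, while naturally acquired links would tend to spread across many phrasings.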

    tantalus




    msg:212384
     3:05 pm on Dec 18, 2003 (gmt 0)

    Merlin

    I'm just wondering if it's more fruitful to have the one keyword in the anchor text to represent or express the overall meaning/concept of the page, with its tokens scattered within the page to support it.

    I have also noticed that if a keyword is in the title with other 'tokens', as well as just the one in the anchor, it goes a long way with Google.

    vbjaeger




    msg:212385
     3:05 pm on Dec 18, 2003 (gmt 0)

    There may be some changes of heart, from those who previously admired Florida, if the strange algo spreads further. That will be interesting to see ;)

    We were hit pretty hard and are now bouncing around the datacenters like Mexican jumping beans. I hope it ends soon and no more sites/companies are hurt by this.

    Hissingsid




    msg:212386
     3:05 pm on Dec 18, 2003 (gmt 0)

    Hi,

    You may remember that my widget in the UK is the brand for a different sort of thing altogether in the US. As a result Google is associating my widget with the wrong thing and is ranking what are IMHO the wrong sites at the top of SERPs so I've decided to attack it head on.

    Yesterday I did a search using Google's Adwords suggestion tool. I took the top few suggestions for terms associated with one of the words that I'm targeting in the US English list and I've made a quick and dirty web site devoted to this with a smattering of the UK English generic meanings in the site.

    I've linked to a few very high ranking pages on the subject and to the new page I've made on my main site, also on the American gist of the subject.

    I've put a link to it on a couple of pages that I control that are indexed by Google and have submitted these for good measure. I'm now going to try and get links on a couple of other pages and I'll report back on the results if anything develops.

    Best wishes

    Sid

    Hissingsid




    msg:212387
     3:38 pm on Dec 18, 2003 (gmt 0)

    If Florida was a result of the fight against spam, it failed. The results are not always better on ATW, but I see fewer duplicate results and fewer pure spam sites. If the public knew about ATW (and ATW could handle the traffic) right now Google would be in big, VERY BIG, trouble.

    Hi Kaled,

    FWIW I think that the motivation is more financial than altruistic and probably more directly financial than many of us imagine. If you look at many (all?) of the new money making concepts that Google is introducing they all rely on CIRCA technology. For example CIRCA targets ads at pages in DomainPark, and Adsense.

    If the main Google search engine uses an entirely different technology to create results then it is less likely to throw up pages with properly targeted ads on them than if it used the same technology. Properly targeted ads get a much higher click through rate (I seem to remember reading on a page about Adsense).

    This is just the start of what Google has planned.

    Best wishes

    Sid

    merlin30




    msg:212388
     3:41 pm on Dec 18, 2003 (gmt 0)

    "As soon as -in settles down"

    Certainly, it is always best to do investigations from a known starting point; however, I am not sure that we can now ever be sure that things have settled down. I'm preparing for a very unsettled outlook.

    kaled




    msg:212389
     4:09 pm on Dec 18, 2003 (gmt 0)

    Sid, you may be right. Perhaps Google has traded in its white hat. If they have, their share of the search market will fall significantly over 2004.

    Maybe there'll be a new player in town by 2005. I seem to recall a discussion about an IBM search engine but I think it was supposed to be using similar semantic technology. Certainly, if I were Google, I'd be more scared of IBM than Microsoft. IBM is the real deal when it comes to innovation.

    Kaled.

    rfgdxm1




    msg:212390
     5:20 pm on Dec 18, 2003 (gmt 0)

    >IBM is the real deal when it comes to innovation.

    They sure blew it with the PC using an open architecture that allowed cheap clones to dominate the market. ;) And I wouldn't underestimate MSN. Whatever you think of Windows, they sure got rich selling it to lots of folks.

    Kirby




    msg:212391
     5:27 pm on Dec 18, 2003 (gmt 0)

    >>If Google thinks the searcher is looking for something specific it will return results without broadmatching.

    Definitely true with stemming. It only comes into play when Google thinks the search is specific. For example, for a search on "cat", because that can mean more than felines (#3 is caterpillar.com) it takes it literally. However, for "Manx cat" not only is cats highlighted, but also cattery. Google picks up that you are searching about a specific type of domestic feline, and allows stemming.

    This still means that content is king. The broader the subject, the more important additional specific content will be.

    >Perhaps Google has traded in its white hat. If they have, their share of the search market will fall significantly over 2004.

    Perhaps we are missing the obvious here. Google already knows that their market share will change in 2004 with Y! dropping Google at some point and the possibility of M$N's new foray into search. Acknowledging this fact, it makes sense for Google to carve out new ground and move forward with new technology.

    WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
    © Webmaster World 1996-2014 all rights reserved