
Google News Archive Forum

This 260 message thread spans 9 pages.
Google's Florida Update - a fresh look
We've been around the houses - why not technical difficulties?

 10:20 pm on Dec 12, 2003 (gmt 0)

For the past four or five weeks, some of the greatest (and leastest) Internet minds (I include myself in the latter) have been trying to figure out what has been going on with Google.

We have collectively lurched between one conspiracy theory and another - got ourselves into a few disagreements - but essentially found ourselves nowhere!

Theories have involved Adwords (does anyone remember the 'dictionary' concept - now past history.)

And Froogle...

A commercial filter, an OOP filter, a problem caused by mistaken duplicate content, theories based on the contents of the Directory (which is a mess), doorway pages (my fault mainly!) etc. etc.

Leading to the absurd concept that you might be forced to de-optimise, in order to optimise.

Which is a form of optimisation in itself.

But early on, someone posted a reference to Occam and his razor.

Perhaps - and this might sound too simple! - Google is experiencing difficulties.

Consider this: if Google is experiencing technical difficulties regarding the sheer number of pages to be indexed, then the affected searches will be the ones with many results to sort. And the searches with many results to sort are likely to be commercial ones - because there is so much competition.

So the proposal is this:

There is no commercial filter, there is no Adwords filter - Google is experiencing technical difficulties in a new algo due to the sheer number of pages to be considered in certain areas. On-page factors have suffered, and the result is Florida.

You are all welcome to shoot me down in flames - but at least it is a simple solution.



 1:28 pm on Dec 20, 2003 (gmt 0)

So to sum up the threads so far, Google is now judging pages not by the number of keywords appearing on them but by natural (human) speech patterns.

Is that right? Should be interesting if it works: pages should have a wider range of subjects within them. That is, they will not just feature green widgets or blue ones, but also the process by which the raw materials are extracted, resulting in a green widget.

Hmmmmm interesting indeed.

One thing I have noticed is the low ratio of keywords to filler text on top-ranking sites, somewhere between 5% and 10%, which tends to suggest that I am correct about speech patterns.
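For anyone wanting to check that ratio on their own pages, here is a minimal sketch. It assumes "keyword density" simply means the share of words on the page that are one of your target keywords; the function name `keyword_density` and the sample text are made up for illustration, and nothing here reflects how Google itself measures anything.

```python
import re


def keyword_density(text, keywords):
    """Return the fraction of words in `text` that match one of `keywords`.

    Assumption: single-word keywords, case-insensitive, exact matches only
    (no stemming, no synonyms).
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    targets = {k.lower() for k in keywords}
    hits = sum(1 for w in words if w in targets)
    return hits / len(words)


sample = (
    "Our green widget is made from recycled steel. "
    "Each widget ships in two days."
)
density = keyword_density(sample, ["green", "widget"])
print(f"{density:.1%}")
```

Run it over the visible text of a page and compare the figure with the 5% to 10% range mentioned above.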

Being the brave Englishman that I am, I have just edited a page to prove my point, so if it works I'll brag about it on WW forever more, or... you'll all hear my screams.


 1:41 pm on Dec 20, 2003 (gmt 0)

pages should have a wider range of subjects within them, by that they will not just feature green widgets or blue ones but the process by which the raw materials are extracted, resulting in a green widget

This is certainly part of it, coming under the Applied Semantics / Broad Matching / Stemming umbrella.

I don't want to confuse the issue, because it's good to pause for summary - but my problem is that I *don't* think this concept is being applied across the board. I don't think, for example, that it's being applied to non-commerce sites. The justification for this is simple: there has been nothing like the mass movement of commerce sites observed in non-commerce sites.

This, I think, is where the confusion arises - and why the 'filter' concept is so hard to shake off; whilst at the same time, so difficult to reconcile with broad matching. Sid?

(By way of justification: non-commerce sites are optimised too - especially if written by professional web designers. And my science sites have been optimised by me - why? Because I want people to find them and read them! Needless to say, they haven't budged an inch in the SERPs!)


 2:00 pm on Dec 20, 2003 (gmt 0)

It would also seem to suggest off line processing of the filter being applied, so as to make searching quicker. This would be supported by the fact that pages are not instantly appearing back in the system when re-cached after de-optimisation.

This theory would then make it obvious: they're only going to pre-render filters for the most popular search terms, generally commercial searches. This is only a suggestion though, as someone's bound to come up with a hugely popular but non-commercial search term which is unaffected.


 2:02 pm on Dec 20, 2003 (gmt 0)


It would also seem to suggest off line processing of the filter being applied

Could you expand on this? I presume you mean that the effects of the 'filter' (I use the word cautiously) are built into the SERPs at the datacenter - not applied immediately after the search. (I always presumed this, but it shows how many interpretations of the word are possible - yours seems a more technical one.)


 2:07 pm on Dec 20, 2003 (gmt 0)

Perhaps the filters that they're applying are fairly complex and time-consuming, thus not worth doing on the fly. To get around this, they work out which terms are the most abused, or most popular, whichever is most appropriate, and set a machine off churning through the pages in its result set for that term. If the page has enough reason to set off the filter (e.g. over-optimisation, cross-linking etc.) then they simply set a flag to mark down that page when searching for that specific term.

Obviously it's just a guess, but it would be simple enough to implement, and not affect the speed of search results.
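To make the guess concrete, here is a rough sketch of the two halves: an offline pass that flags (term, page) pairs, and a query-time step that only does a set lookup. Every name in it (`build_flags`, `rank`, `looks_abusive`, the `penalty` value, the example URLs) is hypothetical and invented for illustration; it is not a description of what Google actually runs.

```python
def build_flags(popular_terms, results_for, looks_abusive):
    """Offline pass: collect (term, url) pairs that should be marked down.

    `results_for(term)` returns the candidate URLs for a term and
    `looks_abusive(term, url)` is the slow, expensive check. Both are
    stand-ins supplied by the caller.
    """
    flags = set()
    for term in popular_terms:
        for url in results_for(term):
            if looks_abusive(term, url):
                flags.add((term, url))
    return flags


def rank(term, scored_pages, flags, penalty=0.5):
    """Query time: one cheap set lookup per page, then sort by score."""
    adjusted = [
        (url, score * penalty if (term, url) in flags else score)
        for url, score in scored_pages
    ]
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)


# Toy data: only "blue widgets" was popular enough to be pre-processed.
index = {"blue widgets": ["a.example", "b.example"]}
flags = build_flags(
    ["blue widgets"],
    lambda term: index.get(term, []),
    lambda term, url: url == "a.example",  # pretend a.example trips the check
)
scores = [("a.example", 0.9), ("b.example", 0.6)]
print(rank("blue widgets", scores, flags))
print(rank("blue gadgets", scores, flags))
```

The same page is marked down for the pre-processed term and untouched for any other term, which is the behaviour described above: the flag is per page and per search term, and the search itself stays fast.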


 2:13 pm on Dec 20, 2003 (gmt 0)


The problem is that we have something that 'walks like a filter, quacks like a filter, but may not in fact be a filter'*.

Do you personally believe a 'filter' or 'filters' is in place?

*Probably a pigeon :)

[edited by: superscript at 2:13 pm (utc) on Dec. 20, 2003]


 2:13 pm on Dec 20, 2003 (gmt 0)

It would also seem to suggest off line processing of the filter being applied

An offline taxonomical categorisation would look like a filter. In effect Google could be producing a directory structure for a list of search terms.

Enter blue widget and you get the blue widget directory. If Google recalculated the taxonomy once a week and stored the results for each of those major search terms it would reduce processing requirements massively.
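As a sketch of why that would save so much processing, assume a lookup table keyed by search term that is only rebuilt when its entry is more than a week old. The class name `TermDirectory`, the `categorise` callback and the explicit `now` argument are all assumptions made for this illustration, not anything known about Google's or Applied Semantics' systems.

```python
WEEK_SECONDS = 7 * 24 * 60 * 60


class TermDirectory:
    """Stores one pre-built 'directory' of pages per major search term."""

    def __init__(self, categorise, max_age=WEEK_SECONDS):
        self._categorise = categorise  # the expensive taxonomy step
        self._max_age = max_age
        self._store = {}  # term -> (built_at, pages)

    def lookup(self, term, now):
        """Return the stored pages, rebuilding only if the entry is stale."""
        entry = self._store.get(term)
        if entry is None or now - entry[0] >= self._max_age:
            entry = (now, self._categorise(term))
            self._store[term] = entry
        return entry[1]


calls = []


def fake_categorise(term):
    # Stand-in for the real categorisation; records how often it runs.
    calls.append(term)
    return [f"{term} page {len(calls)}"]


directory = TermDirectory(fake_categorise)
first = directory.lookup("blue widget", now=0)
second = directory.lookup("blue widget", now=3600)          # served from store
third = directory.lookup("blue widget", now=WEEK_SECONDS)   # rebuilt
print(first, second, third, len(calls))
```

However many searches arrive during the week, the expensive step runs once per term per rebuild.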

Co-incidence - Applied Semantics have a product that does this categorisation.

Co-incidence - Applied Semantics has DMOZ taxonomy

Best wishes



 2:19 pm on Dec 20, 2003 (gmt 0)

Enter blue widget and you get the blue widget directory

An appealing concept - but why would such a directory throw up pages with only passing relevance to the category, rather than one that is bang on target? Are you suggesting the directory categories are too broad? I'd be perfectly happy to stand alongside my competitors in my old DMOZ category - we all sold blue furry widgets in the UK.


 2:25 pm on Dec 20, 2003 (gmt 0)

An appealing concept - but why would such a directory throw up pages with only passing relevance to the category

In a word CIRCA.

CIRCA + Autocategorize = Florida.


PS Based on research and a bit more than an educated guess. Interestingly I've just noticed that directory results have changed and have hopefully started a new thread on this so as not to contaminate this one.


 2:28 pm on Dec 20, 2003 (gmt 0)

From Google's own press release when they acquired AS, it was touted as a means to deliver better placed Adwords. A natural extension of this would be to use the AS information already gathered for Adwords and apply it to preformatted data in general searches. IMHO this is the Florida we know... the first tests of this. It is not really a filter as we would have known it before Florida. I do think Google has intentions of improving results... but really, time spent on the why is probably not time well spent. For sites not affected by Florida yet, wait for the next update.


 2:32 pm on Dec 20, 2003 (gmt 0)


Suck on this. I'll rephrase ;) Try this:

In a nutshell:

DMOZ is gone, Google is attempting to create an automated directory of its own to replace it; G's using some kind of broad matching technology to build it; but is initially restricting its taxonomy to the words/tokens/terms it knows the most about i.e. commercial terms (due to data from Froogle/Adwords etc.)

Result: strange commercial SERPs; standard non-commercial SERPs; the illusion of a 'filter' because the technology is in its infancy and, as yet, only applied to commercial sectors.

This would finally draw the broad-matching / filter / directory / dictionary ideas together (mainly due to your insight).

Anyway, enough of Google studies, I'm off to the pub to study the barmaid...

[edited by: superscript at 3:00 pm (utc) on Dec. 20, 2003]


 2:47 pm on Dec 20, 2003 (gmt 0)

Because the technology is new, we are only seeing it where Adwords data was there before... I think that is it.

That it appears as a filter because it is new... I think the truth will hurt here, but it is because most commercial sites are relatively shallow in depth of content compared to the .org and .edu pages once you discount the "kw1 kw2" text.

What we need is a method to extract from Google the tokens/synonyms Google associates with "kw1 kw2".


 3:05 pm on Dec 20, 2003 (gmt 0)

~keyword1 -keyword1



 3:23 pm on Dec 20, 2003 (gmt 0)


Astonishing! Google thinks my widget is a wodget! Any suggestions for the syntax for 2 or 3 KW phrases though? I can't get quote marks to work (see below).
edit: Certainly Google also appears to associate my secondary keyword with mental health - bizarre - it has stemmed the abbreviation of a physical item to a mental health institute <(:oP>
edit II: ~KW1 ~KW2 -KW1 -KW2 is syntactically correct for a phrase, but the search results seem less revealing.
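If you want to try this on several phrases, a small helper can build the query string for you. This is only a sketch of the pattern quoted above (a `~` in front of each keyword, then a `-` in front of each keyword); the function name `synonym_probe` is made up, and it only builds text to paste into the search box, it does not contact Google.

```python
def synonym_probe(*keywords):
    """Build a '~kw1 ~kw2 -kw1 -kw2' style query from the given keywords."""
    words = [k.strip() for k in keywords if k.strip()]
    include = " ".join(f"~{w}" for w in words)  # ask for related terms
    exclude = " ".join(f"-{w}" for w in words)  # drop the literal keywords
    return f"{include} {exclude}".strip()


print(synonym_probe("widget"))
print(synonym_probe("blue", "widget"))
```

What is left in the results should be pages matching whatever Google treats as related to your keywords, without the keywords themselves.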


 3:45 pm on Dec 20, 2003 (gmt 0)

I wondered if anyone else had seen this. Been using it for a couple weeks now post Florida...found reference to it again by accident. Back in August or so Google released this synonym tool and it was seen as just another nerdy tool in the forums. I hope it stays up.


 3:47 pm on Dec 20, 2003 (gmt 0)

Relating to my earlier post on p17, you all seem surprised that it's ecomm sites being hit and not general info sites.

Why? Who uses Adwords the most? A 14-year-old with a hobby site or a business with employees?

Hard one...


 4:07 pm on Dec 20, 2003 (gmt 0)

Hi Idoc,

By 'astonishing' - I didn't mean the synonym tool itself, but the results of applying what James_Dale suggested. And note that it has *no* effect on the Adwords presented - even though their specific terms have been excluded from the search.

[edited by: superscript at 4:12 pm (utc) on Dec. 20, 2003]


 4:11 pm on Dec 20, 2003 (gmt 0)

I don't assume that the synonym tool has an all-inclusive dataset of tokens shared by Google's search results tokens... though there are probably shared tokens between the two.

I do think that Florida works a lot like this tool though.


 4:35 pm on Dec 20, 2003 (gmt 0)

IMHO, it's a filter and has nothing to do with CIRCA. Why?

I operate travel websites. Go to any city and search a two-keyword term (cityname hotels) and you will find the top 50 results are pretty much in the same order no matter the city (most times). It's mostly the majors, exactly the opposite of what it was pre-Florida.

I can take just about any city that I operate in and re-keyword the page to (cityname resorts) or (cityname suites) or (cityname inns) and make a decent ranking. Cannot do it with hotels.

Now why is that?

CIRCA - no way, if you subscribe to its full and broad implementation.

Under CIRCA, if I am not an authority on hotels, then why is it OK for me to be an authority on resorts, suites or inns?

It's a filter, follow the money.

Example: same cityname hotels produced 50,000 searches last month.
same cityname resorts produced 2,500 searches last month.

It's a deliberate massaging of the results to provide the largest monetary gain to the search engine. It really is that simple and predetermined. That's why the most affected are the commercial sites and not very many informational sites.

Of course the algo has changed to kill the backlink spammers and other means, but the filter does the final sorting out, even for the non-spammers, so the monetary gains are maximized.

Fits the picture, doesn't it?


 4:47 pm on Dec 20, 2003 (gmt 0)

Since it is becoming clear we are into new update territory - 'tis time to bring a close to the Florida discussions and move onward.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved