Forum Moderators: open

Message Too Old, No Replies

Update Florida - Nov 2003 Google Update Part 4

         

Kackle

5:57 am on Nov 22, 2003 (gmt 0)



Continued from: [webmasterworld.com...]

Kackle - can you explain the "dictionary" for me? And how I might benefit from it - I'm reading your posts hard but don't see where you're coming from.

Sure. But you have to act quickly. Google will fix this one just like they fixed the hyphen.

1. Google is devaluing pages/sites that are over-optimized for certain keywords or keyword combinations. It does this by looking up search terms in a dictionary of target keywords or keyword pairs that it has compiled. This dictionary is Top Secret, because if you knew what was in the dictionary, you could avoid these words in your optimization efforts.

2. If the search term or terms hit on a dictionary entry, the search results for that user's search are flagged. This means that before the results are delivered, the order of the links, or even the inclusion of links, is adjusted so as to penalize pages that have over-optimized for those terms. Most likely the title, headlines, links and anchor text are examined. It's possible that external anchor text pointing to that page has also been pre-collected and is available for scanning, but this is much less likely. (Besides, external links are not something within your immediate control, so don't worry about it right now.)

3. You want to find out which keywords relevant to your site are in Google's dictionary. Compile as many relevant keywords as you can think of that searchers might use to find your site. Now take these words singly and in pairs, according to how users might search. Run two searches for each combination - the normal search, and the same search with a nonsense exclusion appended (e.g., -blahblah) to bypass the filter - and compare the results.

4. If the results are strikingly different for the pre-filter and the post-filter search on a particular term or combination of terms, it means that some variation of those terms has been flagged because something was found in Google's dictionary.

5. Do lots of searches and you can come up with a list of "sensitive" words that you'll want to avoid when you re-optimize your pages.

It's a nice weekend project.
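Steps 3-5 could be sketched like this. The SERP fetching is left out entirely; this just scores how different the two top-10 lists are. The function names and the 30% overlap cutoff are invented for illustration - "strikingly different" is fuzzy, and nothing here is anything Google has published:

```python
# Compare a normal ("post-filter") search against the same search with a
# nonsense exclusion appended, e.g. "keyword -blahblah" ("pre-filter").
# Each argument is a ranked list of result URLs.

def filter_score(normal_results, prefilter_results, top_n=10):
    """Fraction of the pre-filter top-N that survives in the normal top-N.

    A low score suggests the term tripped some filter."""
    normal = set(normal_results[:top_n])
    pre = prefilter_results[:top_n]
    if not pre:
        return 1.0
    kept = sum(1 for url in pre if url in normal)
    return kept / len(pre)

def looks_filtered(normal_results, prefilter_results, threshold=0.3):
    # Below 30% overlap we call the term "sensitive" -- an arbitrary cutoff.
    return filter_score(normal_results, prefilter_results) < threshold
```

Run it over every keyword combination from step 3 and collect the terms where `looks_filtered` comes back true.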

Kackle

10:08 pm on Nov 23, 2003 (gmt 0)



Could you point us to some of this evidence, please? Like everyone who is affected, I want to know what to do next. To test the idea of a dictionary, I wonder if others here could share what evidence they have found that all or some of this recent Google effect can't be explained by on-page text and backlink keywords?

There has yet to be a report that noncommercial sites are affected by this filter. And there has yet to be a report that non-English language sites are affected by this filter.

Both of these suggest that there is some initial threshold. You don't have to call it a dictionary. In fact, I'm not at all clear as to whether it's a dictionary of single words, or a dictionary of word pairs, or some combination of these. I don't think it's a dictionary of three-word combinations -- this would get too complex because there are too many possible combinations. However, there may be an initial parsing algorithm so that longer search terms get broken up and two passes are made at the dictionary.

"Dictionary" is just shorthand for the initial decision about whether to apply a filter to the searcher's results or not. Make up a new word if you don't like the word "dictionary." You'll still have to explain the evidence by coming up with some sort of concept to explain the initial decision.

But in fact, lookups in dictionaries are extremely fast for computers. With a dictionary of only 20,000 English keywords, I can do an amazing amount of screening at a speed that boggles the mind, using B-tree lookups. I can keep the entire dictionary in less than half a meg of memory, which is peanuts these days. I've programmed with custom dictionaries, and it's just amazing how few words are really needed to catch 99 percent of what you want to catch.
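The lookup claim is easy to demonstrate. Here a sorted list plus binary search stands in for the B-tree (same idea: logarithmic lookups against a small, fixed word list); the keywords are invented stand-ins. 20,000 short words at roughly 10-25 bytes each lands in the low hundreds of KB, consistent with the half-meg figure:

```python
import bisect

class KeywordDictionary:
    """A small in-memory dictionary of 'sensitive' keywords."""

    def __init__(self, words):
        # Store lowercased, deduplicated, sorted for binary search.
        self.words = sorted(set(w.lower() for w in words))

    def __contains__(self, word):
        i = bisect.bisect_left(self.words, word.lower())
        return i < len(self.words) and self.words[i] == word.lower()

    def hits(self, query):
        # Screen a whole query: return the terms found in the dictionary.
        return [w for w in query.split() if w in self]
```

A query gets split into terms and each term costs one O(log n) lookup, so screening stays cheap even at search-engine volume.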

There are always linguistics experts and artificial intelligence people who insist that Google would never do anything as crass as a dictionary lookup. They're locked into looking at things from the perspective of a particular profession. No engineer has ever gotten a promotion for coming up with the idea of a dictionary. It's too stupidly mundane.

Yet you have to wonder whether some sort of dictionary wouldn't be adequate for a filtering task such as the one we're seeing. And certainly, since the algo is implemented on the fly, Google would be very interested in CPU overhead. If you read their early papers, Google engineers have always been very conscious of CPU overhead and storage space -- down to the point where you save a single CPU cycle or save a single bit in a bit-masked byte of storage.

You don't scale the way Google has managed to scale over the years by doing things the most complicated way possible. PageRank was an exception -- it required too much overhead. I think that's the main reason it is on its way out. Besides, the early engineers couldn't criticize PageRank as being unnecessarily complex. In the early days, I'm sure no one dared tell Sergey and Larry that there were easier ways to accomplish the same thing without spending several days a month computing PageRank.

dazzlindonna

10:10 pm on Nov 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



powdork,

i can't even begin to imagine why either brett or gg would say that we would need a 301 redirect from www.somedomain.com to somedomain.com to avoid a duplicate content penalty. i could believe the sky is falling (literally) before i could believe that one. <shaking my head in utter disbelief>

deanril

10:13 pm on Nov 23, 2003 (gmt 0)

10+ Year Member




This is getting old. The reason I keep looking in this thread is to see if some changes are happening on the other 10 datacenters.

All your conspiracies and concoctions are null and void.

This is not the index; this is between the beginning and end of this update.

The only thing I can see is that it helps release some of people's anger towards the current serps; other than that, this is useless.

I too watch what people with over 1000 posts say, because the rest of this is pure Looney Tunes.

Crisco

10:15 pm on Nov 23, 2003 (gmt 0)



My colleagues and I operate in excess of 10,000 websites. We are currently in the process of implementing a redirect page/popup for all incoming hits from Google.

I'm certain you will see them very soon - if not from one of our properties, then something similar from elsewhere!

Crisco

10:19 pm on Nov 23, 2003 (gmt 0)



I too watch what people with over 1000 posts say, because the rest of this is pure Looney tunes.

Too easy for Google "moles" to jump in and say how much they love the current results. IMHO, # of posts alone doesn't mean increased credibility.

As for being over - it's 11.23.03 - it's been OVER two days!

allanp73

10:21 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



I made the sudden realization that Google is working perfectly. Best ever results, in fact. The problem we are experiencing is that we don't know how to use the new Google. The double-minus serps are excellent. All we have to do is explain to clients and surfers alike how to use this new Google. Either that or we tell them about the really great new search engines called Altavista or Inktomi.

europeforvisitors

10:31 pm on Nov 23, 2003 (gmt 0)



There has yet to be a report that noncommercial sites are affected by this filter.

My editorial content site has been affected. More specifically, several index.htm pages for my "sites within a site" have disappeared from their usual positions (#1, in one case) in the last 36 hours or so.

I posted a hypothesis on why this happened in the "How many front pages did you lose?" thread:

[webmasterworld.com...]

I seem to recall that this may have happened before, quite a while ago, and it got corrected in fairly short order. (And it isn't a disaster, since most of my Google referrals arrive on inside pages anyway).

nutsandbolts

10:31 pm on Nov 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



301 redirects - That's always been an issue for some sites, but that really doesn't account for my missing index pages and other Web sites I've looked at. It's an issue, yes. But there is FAR more going on behind the curtain than a simple 301 redirect problem.

There hasn't been any recent communication from GG about the current situation, unlike the last big update, when he said more data would be folded in over time - "less than months, more than weeks". Now that is worrying, because we all twiddled our thumbs and waited till they popped back last time. Now? Who knows. According to GG, the data is in there, and as with every update - some sites drop, some gain.

Why would sites MIA suddenly appear just by adding -blahblah after the keywords? Obviously some sort of Scroogle filter is in place.

Has anyone tried stripping out those "poison" keywords from the Title and Body of the index page? - Did it make any difference?

MelanieFL

10:33 pm on Nov 23, 2003 (gmt 0)



I work a lot with real estate agents. Look at this SERP "naples florida real estate". All the real estate companies are gone and are replaced by directories. I do similar searches like "boston real estate" "boca raton real estate" "miami real estate"...again...mostly directories. Digital City seems to be showing up fine in all of those SERPS...weird.

Well at least I can tell my clients that their competitors are gone too LOL.

merlin30

10:39 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



EFV,

I think Kackle's point more precisely is non-commercial search phrases as opposed to non-commercial sites. So the question is - are your index pages missing for search phrases that have commercial connotations?

deanril

10:45 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



Too easy for Google "moles" to jump in and say how much they love the current results. IMHO, # of posts alone doesn't mean increased credibility.

As for being over - it's 11.23.03 - it's been OVER two days!

What does 11-23-03 have to do with anything?

Yeah, # of posts does mean a lot of things - experienced members. You'll notice this board has a lot of experienced members, but how come they are not posting in here?

Because they know....

needhelp

10:47 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



not sure if this helps, but I know of a site that kept its #1 spot, while all other non-directory sites dropped. Also, another site that remained in top 10 while others dropped in place of directories/target/bizrate/etc. if anyone wants to do analysis, sticky mail me. i'd do it, but i don't have enough know-how to do it properly. how did these sites weather the storm?

flicker

10:47 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



>There has yet to be a report that noncommercial sites are affected by this filter.

That's not exactly true. Searches on non-commercial, non-competitive sites HAVE been affected; the result has just been less dramatic and less negative. The Florida update has caused a shift in rankings and a vanishing of irrelevant porn spam from educational searches I've looked at. I still say the effects are just less disastrous in less competitive searches because there aren't thousands of sites jockeying for position, so a change in the algorithm (or a bug for that matter) won't vault 250 sites over a previously high-ranking one, just two or three. Which, obviously, is an acceptable vagary of fate.

I wonder if this is just a simple miscalculation on Google's part--that is, if they rolled this update out with the successfully improved non-commercial results in mind, not quite realizing the chaos it would inflict on the 5% of most competitive searchterms.

Kackle

10:48 pm on Nov 23, 2003 (gmt 0)



My editorial content site has been affected. More specifically, several index.htm pages for my "sites within a site" have disappeared from their usual positions (#1, in one case) in the last 36 hours or so.

I did see one site that was a *.co.uk that got zapped for optimizing on the term "law essays." But both words are, perhaps independently, somewhat commercial terms. One because of lawyers who advertise, and the other because of college students looking to buy their homework. There's going to be a gray area between commercial and noncommercial. I still think it's generally true that this filter is not aimed at noncommercial sites.

Small Website Guy

10:49 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



There are some voices who say that we just need to wait for Google to finish the update. They are clearly wrong, at least about the waiting part.

There is a clear intent here to FILTER out pages based on the search term used. The same page could appear at the top of the SERPs under a different search term.

The filter definitely has something to do with the page being over-optimised. I thank people for the -fufuf trick, it helps to explain a lot.

It doesn't matter if it's a one word search term, or a four word phrase, the filter still works. I know this because I have a page being filtered out for a four word phrase, and at least two pages that I know of being filtered out for just a single word.

I have no idea if it's only "commercial" words being filtered, but if Google wanted a list of commercial words and phrases, the obvious source for that list is the AdWords database. Just add up all the money being bid for a particular word or phrase, and then you know how commercial it is. I don't see why Google would program some complicated algorithm that might not work in all circumstances when a very simple query on the AdWords database could figure it out with perfect accuracy.
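The "add up the bids" idea could be sketched like this. The bid table and the $1 threshold are entirely made up, since real AdWords bid data isn't public - this just shows how trivially a commerciality score falls out of such data:

```python
# Hypothetical per-term advertiser bids, in dollars. Purely illustrative.
bids = {
    "miami real estate": [4.50, 3.75, 2.10],   # heavily bid on
    "cepheid variable stars": [],              # nobody advertises here
}

def commerciality(term, bid_table):
    """Total money bid on a term; 0.0 if nobody is bidding."""
    return sum(bid_table.get(term, []))

def is_commercial(term, bid_table, threshold=1.0):
    # More than $1 of total bids counts as "commercial" in this sketch;
    # the cutoff is an assumption, not anything Google has stated.
    return commerciality(term, bid_table) > threshold
```

One aggregate query per term, no linguistics required - which is exactly the appeal of the idea.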

To the extent that commercial sites are filtered out, to be replaced by sites that have nothing to do with the keywords in question, Google will undoubtedly increase its Adwords revenue.

The danger of "unoptimizing" the site to get back into the free Google results is that you could lose your ranking in Yahoo and MSN and still not get back into Google. You'll be throwing out the baby with the bathwater.

gibbon

10:51 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



>>This is not the index, this is between beginning and end of this update.

deanril, your comments make me think of the words Churchill spoke during the darkest days of WW2:

"This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning"

Please note that I am not associating Google with Nazi Fascist Tyranny and abuse of power in any way :)

Johnny Foreigner

10:52 pm on Nov 23, 2003 (gmt 0)



Exactly our thoughts. It's a case of Google or all the others? If you go the Google-only route you will be tied down more than you are already.

deanril

11:01 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



Who says this is it?

This is the way the index will stay? Who?

Where's your proof?

I have proof that this is not the way the index will stay. Look at the last 4 major updates (1.5 years+).

Everyone needs to chill..... These searches suck; they are not the final index.

As for "Money Words", you will notice that with most of the bad searches (sites missing) there is a Google/Dmoz category at the top of the first page. Most of the unchanged, still-ranking pages have no Google/Dmoz category on top.

These pages are commonly updated, whereas the pages without a Google/Dmoz category are not commonly updated (once a month, typically). As far as I can tell AdWords has nothing to do with it; the pages without the Google/Dmoz listing that are unaffected have plenty of AdWords as well.

ronhollin

11:01 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



Shouldn't the Internet be free? Even for commercial sites? I think so. It started out that way and it should have stayed that way. Everyone knows that the Government is actively trying to tax emails. How would that affect people sending emails? How does this update/screw-up affect the "hardcore" Google users? Do they go to another SE or do they spend more of their time trying to find relevant websites?

pgkooijman

11:01 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



Where did all the Amazon.com listings come from? Suddenly they are everywhere. This is really madness: how is a link to purchasing a book at Amazon better for people searching on 'lord of the rings' than a website with more than 500 pages of Lord of the Rings content? If Amazon.com is not paying Google for this then I do not understand how that just happened. They are at place 3!

Stefan

11:04 pm on Nov 23, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It looks over to me, probably yesterday.

Our site was one that wasn't affected... it stayed the same, #1 on our important kw's and actually improved on certain more competitive ones. I've been watching googlebot closely in the logs through this. During the "update", while normally we have about 15-25% of our ~150 pages hit by googlebot per day, it dropped to only the index and 1 - 3 pages. Yesterday it kicked back up... we had about 40 pages taken by the bot. Freshtags are appearing on our main pages updated daily, Nov 23 right now.

Maybe they'll fine-tune things, as they always do on an irregular schedule post dom/esm, but we're likely back to the rolling update now. The big tweak is over.

Small Website Guy

11:12 pm on Nov 23, 2003 (gmt 0)

10+ Year Member




Shouldn't the Internet be free? Even for commercial sites? I think so. It started out that way and it should have stayed that way. Everyone knows that the Government is actively trying to tax emails. How would that affect people sending emails?

An ironic statement, considering that the purpose the majority in this forum have for getting to the top of the Google rankings is to make money. The internet ceased being about anything but money a long time ago.

Regarding the taxing of emails, it's an urban myth, but I think it would be a good idea because it would get rid of all the spam. If someone doesn't want to pay 5 cents to reach me, I don't want to read his email.

ronhollin

11:18 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



That's what Hotmail is for

Dave35London

11:23 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



When I search for my targeted term, which is extremely competitive, the results are fluctuating madly and different across data centers. The fuss revolves around these terms and Google hasn't got it straight yet. In many important respects these are not final results.

Dave35London

11:25 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



Radically different - and I mean all top ten results totally different - on my 500-800 visitor per day search term on google.com in the last two minutes.

How can the "big tweak" be over?

Dave35London

11:28 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



Many sites are unaffected; some of mine have been at 1000 visitors per day plus and stayed there. But what matters is what's going on with the affected sites and search terms. On these big-money terms everything is nowhere near settled and Google is in classic dance mode still. You're just making yourself look ignorant if you say otherwise.

Kackle

11:30 pm on Nov 23, 2003 (gmt 0)



I have no idea if it's only "commercial" words being filtered, but if Google wanted a list of commercial words and phrases, the obvious source for that list is the AdWords database. Just add up all the money being bid for a particular word or phrase, and then you know how commercial it is. I don't see why Google would program some complicated algorithm that might not work in all circumstances when a very simple query on the AdWords database could figure it out with perfect accuracy.

Some of us feel this in our gut, but few say it. Google will never admit to it. There are possible legal implications if Google were to admit this. There will be people lining up to denounce you, and you'll never know if they are "sent men."

Think of it this way: by saying it's a dictionary of commercial terms and that Google is cleaning up over-optimized spam in the organic listings, Google remains "cool." As the kiddies say, "Google rocks."

By saying that Google is surreptitiously forcing e-commerce to pay for listings, you are accusing Google of having already become the next Microsoft.

We are past the "buggy update" stage. Unless the entire Googleplex has spent the last week getting fat on gourmet lunches and working it off on the golf course, this thing would have been turned off by now. The "hyphen fix" that they pulled off within a few days of its discovery means that they could also turn off this entire filter just as quickly -- if they wanted to. It has to be deliberate. GoogleGuy speaks volumes by his absence.

If you concede that it's deliberate, and you still want to believe that Google is merely fighting spam, then consider:

Even if the dictionary was compiled entirely separately from the list of top Adword terms, what sort of overlap would you expect? 80 percent? 90 percent?

Does it matter to Google which list they use? Yes, in front of a judge or jury the question of motivation may matter a lot. Does it matter outside of a courtroom? No, apart from the need to keep unthinking journalists from asking too many questions, it doesn't matter much at all.

In the end, the consequences of using either list are approximately the same.

dreeve

11:30 pm on Nov 23, 2003 (gmt 0)



Hi Otnot
<I have noticed that I can search with words from my text that when strung together have nothing in common but will pull up my site #1>

I think you just hit the nail on the head.

I just experimented and found that it works. I lost my index page after being number 1 for a few months. I will now remove my keywords from titles and h tags and concentrate on the content.

Looks like google is looking more at content than anything else.

jbage007

11:33 pm on Nov 23, 2003 (gmt 0)

10+ Year Member



During the recent google shakeup, our site which was #1 in its category for MONTHS (almost a year) has been zapped. It's gone. Can't find it anywhere in the index. In fact, we had TWO sites on the top 10 page and they are BOTH gone.
But guess who else is also gone? All of our business-related competitors. Every one of them. Gone. Zapped.
Replaced by what?
Replaced by what?

Across the board, it seems to be Dot Gov and also Dot Org sites. Information related sites that searchers are NOT looking for (they want to BUY the insurance related products that the business oriented sites are actually SELLING).

We've been gone for about 5, maybe 6 days now. I'm beginning to think this "new and improved" algorithm thing is a major disaster.

Maybe we should start a Dot Org site?

Anyone else seeing same or similar results?

Anyone have any hope that google will come to its senses?

If I didn't know better, I'd swear google was trying to "force" business related sites to cough it up and start paying to play! ArG!

superscript

11:33 pm on Nov 23, 2003 (gmt 0)



Small Website Guy

You are ignoring the fact that the Internet is now used by millions of people to buy things - as well as to find information.

If I type in, for example, 'cepheid variable stars' - then I expect, and should receive an informational site about astrophysics.

But if I type in - I don't know - 'U bend for toilet', I expect to see plumbing sites that specialise in this. Not a PDF for a tribute site to a band called 'U bend', followed by two pages of sites about camper vans that feature a toilet.

Pretty poor hypothetical examples, I admit - but to deliberately favour 'de-optimisation' is so illogical and so crazy, that if Google has done this, it has finally lost it.

Unfortunately, despite being initially unconvinced, I now think this is what Google has done. The addition of a nonsense -anyolddrivel$ to a search term, which then shows up sensible results is potent evidence of this.

This discovery is an indication of the quality of the minds on this forum, and I am sure Google will have been surprised, even shocked that this has been noticed.

What we are seeing is a commercial filter. But only commercial sites are likely to spam, so it can easily be explained away as a spam filter. But the damage caused could be immense. My business is certainly in danger of folding.

This 626-message thread spans 21 pages.