First, love the site. It's one of my first stops every morning, afternoon, and evening.
Normally, The Register [theregister.co.uk] is great because it has tech reporters who actually have a clue about most of this net stuff. However, a couple of articles on Google that The Reg has run in the last week haven't entirely been on the mark.
So let's back up:
Last week there was the now reported and rereported article on a Google News phenomenon that allowed pure company press releases [theregister.co.uk] to slip into the mainstream news. We thought it was such a good story that we covered it ourselves [webmasterworld.com].
While we are at it, let's back up a couple of months to when Google launched the news feature. At the time, much noise was made about the fact that Google News is produced by machine algos alone. No human editors are involved in headline clippings. Obviously, there must be some editorial decisions involved in the process, or they would include WebmasterWorld headlines in the mix as well.
So when Google says press release inclusion into the database was a bug, I completely understand it as a valid explanation. Not only is it valid, it is quite common. Bug spotting is one of the prime motivations for Googlers to read here regularly. We have been here picking out Google bugs for many years. We just found another one yesterday [webmasterworld.com].
That is not to insinuate that I don't agree with you on some finer points:
Why secretive? The company refuses to publish its News Policy - and it maintains the fiction that the selection and composition of stories on its "News section" was "determined by a computer".
Derived queries, blind queries, or query-free content generation is a pretty sophisticated art. So sophisticated that it has landed a paper [cs.berkeley.edu] at this year's WWW12 conference, co-authored by Google's own Sergey Brin and Monika Henzinger.
That's as true as the assertion that the selection and composition of the story you're reading now was "determined by a computer", too.
I can understand how you would view headline generation as such magic that it would look like sleight of hand. Granted, there must be editorial decisions made, but even those editorial decisions are arrived at mathematically. This is the same reason why the algo interpreted a press release as a story. It is impossible at this point for a computer to determine the difference between a press release and a "news story".
Google has stated several times that the news option may be experimented with [webmasterworld.com] as a revenue option - this is not news - nor is the backhanded comment about payola.
Aside from that, the entire follow-up set of comments made on The Reg is without merit. I feel they are uninformed about how a complex system such as Google works day-to-day.
In Andrew's latest installment [theregister.co.uk] of his Google saga, he insists that somehow Google doodled with his search results on being googlewashed.
Google updates its full index once a month [webmasterworld.com]. During that time, it reworks part of its algo and puts a fresh database online that is built from the previous month's full web crawl. We have been waiting here [webmasterworld.com] with about 100,000 webmasters for the last week for this month's update, Cassandra [webmasterworld.com].
In between those big super updates, we have what we call FreshBot. Freshie is so named because Google puts FRESH! tags next to any listing indexed within the last 72 hours. FreshBot indexes newly discovered and newly updated pages almost continuously. It then adds them to the main Google index and gives each page a little boost [webmasterworld.com]. We call it the FreshBot Sweepstakes because FreshBot-indexed pages can jump to the top 5 of any keyword at any given time. That boost lasts about 72 hours.
So a story that coins a phrase (googlewashed), and that dozens of others link to, should be pretty near the top.
Not until Google performs a full update and the page is actually in the full index. Inbound links to it have to be accounted for before it can rise again in the rankings.
- You run a story on GoogleWashing. It gets picked up by FreshBot.
- It gets added to the index with a bit of a boost in the algo.
- 72 hours later, that page loses its "fresh" status and drops in the rankings. The theory is that pages that really are fresh or updated will have a higher value to visitors. That page will not rank well again until Google performs the full update. (e.g. googlewash became googlewashed in 72 hours and basically kicked itself out of the rankings)
Other pages, such as blogs, that use the term GoogleWash are boosted because they came after your story. They are now in the middle of their own "freshbot" cycle. Many of those pages have higher PR than The Register's. Since the term is new, and there were no other pages in the index with "googlewash" in them, I would expect the pages that rank high right now for GoogleWash to drop after their "Fresh Period" is over (same as The Reg's story dropped).
After that, I predict that The Reg's story on GoogleWashing will eventually rise back into a top position.
That's my interpretation of what Andrew was seeing.
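To make that interpretation concrete, here is a toy sketch of the cycle in Python. It is only an illustration of the behaviour described in this post, not Google's actual ranking formula: the `Page` class, `toy_score`, `rank`, the boost value, and the link scores are all hypothetical. The only figure taken from the discussion is the roughly 72-hour fresh window.

```python
from dataclasses import dataclass

FRESH_WINDOW_HOURS = 72  # from the post: the fresh boost lasts about 72 hours
FRESH_BOOST = 10.0       # hypothetical value, chosen only for illustration


@dataclass
class Page:
    name: str
    hours_since_fresh_crawl: float
    link_score: float            # hypothetical stand-in for inbound-link value
    in_full_index: bool = False  # True once a monthly full update has run


def toy_score(page: Page) -> float:
    """Toy ranking score; an assumption for illustration, not Google's formula."""
    score = 0.0
    # Inbound links only count once the full update has indexed the page.
    if page.in_full_index:
        score += page.link_score
    # Pages recently picked up by the fresh crawl get a temporary boost.
    if page.hours_since_fresh_crawl < FRESH_WINDOW_HOURS:
        score += FRESH_BOOST
    return score


def rank(pages):
    """Return page names ordered from highest to lowest toy score."""
    return [p.name for p in sorted(pages, key=toy_score, reverse=True)]


# The original story is past its fresh window; a later blog post is still fresh.
story = Page("reg-story", hours_since_fresh_crawl=96, link_score=50.0)
blog = Page("blog-post", hours_since_fresh_crawl=24, link_score=5.0)

before_update = rank([story, blog])  # the fresh blog post outranks the story

# The monthly full update runs and inbound links are finally counted.
story.in_full_index = True
blog.in_full_index = True
after_update = rank([story, blog])   # the heavily linked story rises again

print(before_update)
print(after_update)
```

With these made-up numbers the blog post leads while it is still inside its fresh window, and the story returns to the top once the full update counts its inbound links.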
I have to admire the way The Register was willing to point out what they thought was something dodgy about Google instead of the usual puff-pieces about Google-and-the-chef-and-the-food at the GooglePlex.
You put it that way, I agree with you. I hardly ever find myself arguing about a news report. I tend to argue with people's reactions to news reports. If I came across the articles in question by chance, I probably would have dismissed them, or been more willing to forgive the inaccuracies in them as a result of the specialist nature of Google knowledge.
I actually [i]prefer[/i] Google to list press releases and blogs. Why do I need journalists? I feel able to read the original (as far as possible) source of an article and make up my own mind. Try using Google News to find as much information from as wide a variety of sources as you can about a big story. When I think of Google, I think of search, and I use Google News to search news ;)
The ordering of articles on the frontpage is completely inconsequential to me.
"The victors invariably write the history to their own advantage."
Jean-Luc Picard, Star Trek, The Next Generation, Contagion
The concept is a little more subtle than that. Calling it propaganda implies that the victors intentionally want to distort the truth. However, the victors are usually victims of their own propaganda, and thus actually believe what they were doing was just. Put another way, the victors tend not to be objective reporters.
Me? I prefer alltheweb to google. At least I know what the hell's going on and all my sites get crawled instantly and stay in their index.
At the end of the day google are a private company with no real opposition to keep them in their place.
They need opposition, and if opposition makes google a better service, so be it.
Google need serious competition desperately.
Let's admit it, google was only fun when it was "geeky" and you could say to people "I use google" and they didn't know what you were talking about.
Now, google is just a pain frankly.
hehehehehe, I can assure you they take their business very seriously. They have a USP/Marketing Pitch and they work it very well.
Mr Orlowski seems to be set in his views; that's a shame when a little simple research could have helped him. Whenever I read such articles I just think of the "boy who cried wolf" story, and for those of us who are interested in holding the SEs to account, that is counterproductive.
A key exhibit in his case is the alleged "Googlewashing", or demotion down the Google results ranks, of one of his own stories. That story had - in turn - accused Google's search results of being heavily influenced by a tiny cabal of "big name" webloggers.
It was an intriguing claim. But it was quite quickly undermined by some search engine experts, who called into question Orlowski's understanding of Google's admittedly complex technology, which works out which pages should be ranked highest in search results.