Bourbon and Google's Recent Patent

         

Tom_Arah

11:34 pm on Jun 9, 2005 (gmt 0)

10+ Year Member



Like many others my site was hit hard in April and disappeared in late May (losing over 150,000 visitors a month). The various Bourbon threads have been very useful but so far mainly for ruling out any individual shared parameter apart from the presence of AdSense (which you’d assume wasn’t in Google’s interest). This seems to suggest that something new is at work.

Surely the most obvious candidate must be Google’s patent application 20050071741 which was passed on March 30th and which represents some very rare hard(ish) facts direct from Google about how it plans to improve its search quality. With 60 interacting parameters based primarily on historical document and link analysis it’s difficult to pull out what the effects would be – especially as these would vary across different sectors with, for example, old content sometimes seen as stale and sometimes definitive - but the most important point is that there’s no way that such new factors could be introduced to the mix without resulting in major winners and losers – just what we’ve seen recently. For my software reviews/articles site I can certainly see how links from years ago could now be heavily discounted (though I’d argue the content is still very useful).

Having said that, while I can understand dropping due to the changes, I’ve definitely been blacklisted/sandboxed which presumably means that Google has decided I - and everyone else who’s been affected so severely - is up to blackhat tricks which is definitely not the case. With the patent’s focus on link analysis that brings me back to the idea of too many recent links triggering a spam filter (perhaps with the use of AdSense being seen as a secondary spam indicator as the monitoring of ads is also mentioned in the patent). It’s certainly a possible explanation for what’s happened to my site as Google reports 3500 links many of which I would guess are from recent scraper sites picking up from directory listings and particularly Google’s own SERPS (Google’s previous patent “based on the interconnectivity of the documents in the set” would presumably explain why Google is so susceptible to scrapers and as my site was popular across a broad range of software-based keywords it was a natural scraper target).

I can see the benefit for google and the majority of searchers in clearing out the spam but, if this is what has happened, it means that honest and useful sites are being penalized simply for being popular on Google. And possibly for signing up for Adsense too!

Of course anti-spam false positives (whether as described above or not) are inevitable, but there needs to be some workable appeals procedure for removing undeserved blacklisting based on manual checking. OK it’s not algorithmic/scalable but with $50 billion in the bank I think something could and should be done for those who’ve lost out – the phrase “Don’t Be Evil” springs to mind

And as this forum seems to be the centre of Bourbon discussion/disgruntlement and I’m sure folk at Google are monitoring it, can we not do something about it? For example is there somewhere we can post our actual website addresses in the hope that they’ll fast-track us back into the SERPs if only to stop us moaning?

This is my first posting - sorry it’s so long.

reseller

2:21 pm on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Tom_Arah

Thanks for the interesting post.

<I can see the benefit for google and the majority of searchers in clearing out the spam but, if this is what has happened, it means that honest and useful sites are being penalized simply for being popular on Google. And possibly for signing up for Adsense too!<

Actually we have read on several threads fellow members mentioning severe drops in their sites' rankings on the SERPs or the disappearance of their sites from the index. It was also mentioned that these sites ranked highly on the SERPs for their relevant competitive keyphrases. Those sites are/were not spam sites or scrapers. So what's the reason behind their "sufferings"?

Through discussion in the "Dealing with the consequences of Bourbon Update" thread, several reasons were mentioned for sites being penalized and disappearing. The most important is the www vs. non-www canonical issue, which is fixed with a 301 redirect (www.yoursite.com vs. yoursite.com).

[webmasterworld.com...]

Guesses and assumptions are all we can offer. However I don't think that AdSense is a reason for penalizing a site, because fellow members have reported that there are still "AdSense Scrapers" on top of the SERPs.

walkman

3:50 pm on Jun 27, 2005 (gmt 0)



I've been saying for a while (even though my site finally came back) that Google has cast too wide a net trying to clean spam. Many innocent sites have been caught in it.

theBear

4:36 pm on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Tom_Arah

There is a site that links to you that appears to have uncovered the Google cache entry for one of your pages.

That cache entry is findable in an allinurl:domain search.

In addition you have some 72 pages duplicated between the www form and the non www form of your domain.

Please compare your site's tanking with MikeNoLastName's site's tank.
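For anyone wanting to run the same www vs. non-www check on their own list of indexed URLs, here is a rough sketch. It assumes you already have the URLs in a list (from a site: search or your logs); the function name and the example.com addresses are made up for illustration, and it only catches pages that differ by a leading "www.".

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_www_duplicates(urls):
    """Group URLs that differ only by a leading 'www.' on the host."""
    hosts_by_page = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        # Strip the 'www.' prefix so both forms map to the same key.
        bare_host = host[4:] if host.startswith("www.") else host
        path = parts.path or "/"
        hosts_by_page[(bare_host, path, parts.query)].add(host)
    # Keep only pages that were seen under more than one host form.
    return {
        key: sorted(hosts)
        for key, hosts in hosts_by_page.items()
        if len(hosts) > 1
    }


# Hypothetical sample of indexed URLs.
indexed = [
    "http://www.example.com/reviews/a.html",
    "http://example.com/reviews/a.html",
    "http://www.example.com/reviews/b.html",
]
duplicates = find_www_duplicates(indexed)
```

Each key in the result is a page that shows up under both host forms, which is the kind of split a 301 redirect is meant to close.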

joeking

7:37 pm on Jun 27, 2005 (gmt 0)

10+ Year Member



Great first post Tom!

And may I say you don't look a day older on your website from when you first set up your design business way back when in Auld Reekie!

Hope you're well and busy.

Joe King

sailorjwd

9:10 pm on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Tom,

Greetings.

My site parallels yours to a large degree.

Same industry
Same # of visitors lost, coincidentally
Wide number of topics within the industry
Incredibly good rankings (probably too high for me)
My decline began on Feb 2nd and finished May 5th.

I haven't come back yet except for a handful of phrases.

Had many dup pages via content theft - whole pages, whole multi-page groups, etc.

Had a possible issue with spam-like internal linking with 20-30 keyword-rich links on most pages on left and right nav area. Result was pages with VERY high keyword density if you count link tags.

Interesting footnote. One 3 word search has near zero keyword density for one of the words but word is in 25 link tags pointing to page. This search ranks #1 - even beats mr softy.

I'm in the process of killing the on page keyword density for the link tag keywords and I'm back to position 30-50 on many searches now. (still more work to do)
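To put a number on "keyword density if you count link tags", something like the sketch below could be used. It is only a crude measure under my own assumptions: density is the share of words equal to the keyword, words inside <a> tags are counted separately from the rest, and the class and function names are invented for the example. Nobody outside Google knows what it actually measures.

```python
import re
from html.parser import HTMLParser


class TextSplitter(HTMLParser):
    """Collect words inside <a> tags separately from all other words."""

    def __init__(self):
        super().__init__()
        self.link_depth = 0
        self.link_words = []
        self.body_words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.link_depth += 1

    def handle_endtag(self, tag):
        if tag == "a" and self.link_depth > 0:
            self.link_depth -= 1

    def handle_data(self, data):
        words = re.findall(r"[a-z0-9']+", data.lower())
        target = self.link_words if self.link_depth else self.body_words
        target.extend(words)


def keyword_density(html, keyword):
    """Return (density without link text, density with link text)."""
    parser = TextSplitter()
    parser.feed(html)
    parser.close()
    keyword = keyword.lower()

    def share(words):
        return words.count(keyword) / len(words) if words else 0.0

    body = parser.body_words
    return share(body), share(body + parser.link_words)


# Hypothetical page: one sentence of copy plus two keyword-rich nav links.
page = (
    "<p>Our widget review covers setup and pricing.</p>"
    '<a href="/w1">widget review</a> <a href="/w2">widget tutorial</a>'
)
body_only, with_links = keyword_density(page, "widget")
```

On a page with 20-30 keyword-rich nav links the second figure can end up far higher than the first, which is the effect described above.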

Joe

MikeNoLastName

9:52 pm on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This cache issue looks interesting. I tried to start a new thread on it, but it's in moderator review limbo for who knows how long.

It's easy to check for; just do:
"allinurl:yourdomain.com cache"

It will show up something like:

[66.102.7.104...]

Always with an IP address.

You can also view the 60,000+ in the database by typing "allinurl: search?q=cache"

Clicking on it will USUALLY show a Google cache of your pages which is obviously duplicated content. It SHOWS a PR of 0 but then it IS a Google domain, so who becomes the authority....?

It would be interesting to hear if anyone else who has been dumped is seeing this or not. Also if anyone who has NOT been dumped has one, just to determine if it could be a factor.

joeking

10:14 pm on Jun 27, 2005 (gmt 0)

10+ Year Member



Quick thought - are your reviews, etc., unique to your site Tom - you say they are mainly culled from magazines (I know you wrote them, I don't mean to imply you are stealing content!).

Do these magazines also run the same articles? If so a duplicate content problem could be part of the problem.

Tom_Arah

11:10 pm on Jun 27, 2005 (gmt 0)

10+ Year Member



Thanks to everyone for picking up on this thread - I thought it had disappeared without comment.

In response to Reseller I picked up on the canonical index issue thanks to the forum and put in the necessary 301 redirect but I can't believe it is the cause of such a massive drop.

Also I definitely get the feeling that things are being done on a keyword/sector basis - I know it's not just sailorjwd who is in a similar boat to me - which would explain reports of Adsense scrapers surviving. I haven't comprehensively looked into it but FWIW I think technology review searches are better than they were. I certainly hope the underlying reason is tackling spam, there's no other justification for putting us through this.

Thanks to the Bear for pointing out the probable hijacks (and the non www issue). These are largely inadvertent redirect links from very respectable sites ie about.com and creativepro.com and have been around for years so I can't believe they are suddenly draining pagerank/traffic on such a scale.

>>Please compare your sites tanking with MikeNoLastName sites tank.

How can I do this - I don't see any URLs in member profiles?

To sailorjwd commiserations and I've wondered myself whether my internal links might count against me - 250 keyword-heavy review links in my archive section could look spammy. Don't want to change it though as it's very useful to visitors.

To MikeNoLastName I did a check and one page did appear as you describe. Again, as with regular hijacking, I can't believe that this alone would suddenly explain a 75% drop (not that that means it doesn't)

And to Joe King - it is possible that I'm suffering duplicate penalty from the magazine sites (basically we both have copyright and both post) though in the past these were data-driven and never figured on Google.

And you're right it's time I updated my photo :)

Tom_Arah

11:31 pm on Jun 27, 2005 (gmt 0)

10+ Year Member



Again thanks for everyone's comments but they still don't pick up on my main point which is: why is no-one talking about Google's recent patent.

This makes it clear that Google is going to take historical document and link information into account when it works out rankings which changes the landscape entirely - especially for us old time content-focused publishers doing reviews etc who seem to have been hit hard.

Looking at my referral site entry pages and search engine phrases in my stats it certainly seems a reasonable explanation of what's happened with older pages and their keywords doing less well than more recent pages.

More to the point, on reflection it seems a justifiable explanation as, all things being equal, most searchers will prefer recent reviews and tutorials and content generally. Obviously this is not always the case - some of my articles now have minimal traffic but are still the last word on their subject :) - but on balance I think Google is right to try and take time and historical factors into account. Especially as the stress on organic content and link growth should help kill spam sites.

In retrospect, despite the topic's title, I don't think the patent has much to do with Bourbon which seems to have been a short-term blanket near-100% penalty that has largely been removed since I made the original post (maybe an anti-spam measure). However I still need to explain my earlier 75% drop and the patent seems to me to be the most likely suspect.

If those who have been similarly affected look at your current referral entry pages and search engine phrases compared to the good old days, do you see a similar pattern emerge?

theBear

11:43 pm on Jun 27, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Tom,

I think that MikeNoLastName has weighed in on the thread.

Happy to point out that which I find.

When did you add your 301s? But what is more important (are you listening Clint) is what, if any, actions did you take to clean up the mess (g1smd has been through this part as have I). I haven't seen much that actually helps in doing this.

johnhh

12:06 am on Jun 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Tom,

"If those who have been similarly affected look at your current referral entry pages and search engine phrases compared to the good old days, do you see a similar pattern emerge?"

As a member of the 75% down club...

On age of page it seems to have no effect for us. At subdirectory ( topic) level the same topics are popular as before - just the traffic is drastically reduced.

We have highly similar pages in terms of navigation/links in and out/design - just different content - yet some rank #1 and some #100+.

On search phrases the difference is a bit more marked - but this reflects the position of the pages rather than the topics themselves. So pages on red widgets may be more popular than blue widgets - but widgets as a topic stays constant.

The patent may be a factor in this - I haven't read it- in relation to "links in" age. But patents are often generalised to "catch-all" possibilities.

As an experiment, and to prove demand still exists, our new adwords campaign is getting a very high Click through %.

P.S. why is it always widgets - when I did accountancy exams it was always "XYZ company makes blue widgets, make up a balance sheet" - does anyone actually sell widgets?

helleborine

12:11 am on Jun 28, 2005 (gmt 0)

10+ Year Member



Ah! The Patent.

I'm highly skeptical about the patent. I don't trust it.

It might simply be that Google has found ways to apply the ideas within the patent application, and they want to make sure that the competition is kept at arm's length by patent protection for the next 100 years. As an added bonus, webmasters are now in awe AND confusion about Google's all-knowing powers.

Many of these ideas might even worsen the SERPs. Google can't apply them without surveying and thorough evaluation.

All those ideas will require boffo computing power. Maybe one day. But now?

I bet many of them are way, way down the road, if at all.

2by4

12:53 am on Jun 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Tom, lots of people here have talked about google's recent patent [webmasterworld.com].

Or just run this search of WebmasterWorld threads [google.com].

It's an interesting question though, sort of. But bourbon kind of caught people's attention in the last 2 months, since it affected them directly.

reseller

5:50 am on Jun 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Tom!

"If those who have been similarly affected look at your current referral entry pages and search engine phrases compared to the good old days, do you see a similar pattern emerge?"

My site lost 75% of its Google referrals on 3rd Feb 2005 (Allegra). So I might be considered the co-founder of the 75% club ;-)

As to entry pages and search phrases, I can say that it's around 40% similar to those of the "good old days".

Tom_Arah

10:26 am on Jun 28, 2005 (gmt 0)

10+ Year Member



Thanks to all for the latest comments.

2by4 I did see the patent topic, that's what alerted me to it, but bizarrely all comment on it seemed to fizzle out just before we were hit with the major algorithm and traffic changes that are exactly what you'd expect if it was implemented.

helleborine I agree that the patent can't have been implemented in full and across the board as the resulting index would bear absolutely no correlation to the old one and everyone would be up in arms. However there's nothing to stop them cherry picking features and implementing them in certain keyword sectors. In my area of computer software it would certainly make sense to assume that if someone searches for "xyz review" or "xyz tutorial" they are more likely to want to see one posted in the last year or so (though of course not always).

Not sure what field johnhh is in but maybe it's not so time-sensitive as mine and reseller's.

I also think that it would make sense to screen out the huge recent rise in junk links from scraper directories that have probably been over-inflating the ranking/traffic of sites that previously did well in SERPs and dmoz style directories which would create a general fall in traffic to all pages which I'm certainly seeing as well.

And yes many of the patent's ideas would falsely lower the rankings for particular pages eg some of my articles aren't stale they are definitive :), but so long as the overall effect was positive, Google would be right to implement them.

When we're looking for new factors to account for across the board (or sector) changes it seems perverse to ignore Google's stated aims that it plans to implement such changes based on historical data just because it's difficult to see exactly how they would do it or how it would work in practice. Especially as you are right that this would be a huge appeal to Google in itself. And should be to legitimate publishers too if it succeeds in shaking off spam sites that are currently manipulating the existing PageRank system.

Basically I think the signal to noise ratio has become too much for the current PageRank citation approach and they need to use historical data to screen out the junk to find the organically added human content and especially backlinks that PageRank needs to work its magic.

If that's what's happening then I can at least understand the reasons for my drastic fall, or rather the underlying purpose, which makes me feel a bit better.

And just following up on johnhh's observation regarding his high click through rate on AdWords, I think another very important factor that hasn't been discussed AFAIK is AdSense clickthru and eCPM rates. Mine have risen dramatically recently and during June I'd say are more than double what they were in "the good old days". It's a welcome softening of the financial blow but, more importantly, it seems to suggest that Google is succeeding in providing more targeted traffic. Anyone else noticing a similar effect?

johnhh

10:59 am on Jun 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Tom_Arah: "click through rate on AdWords"

Just like to add this was from "search" only not "content" implying users can't see what they want and go for the ads.

IMHO any patent regarding software has to be suspect on a prior existence basis. Perhaps I could have one on "produce listings of extracts of existing pages on the internet in response to a user entered enquiry" rather like Amazon's "one click to purchase" one.

I don't think all of the patent may have been introduced in Bourbon due to results seen in my topic area.

However I do understand the massive problems involved in applying multi-factor filters - i.e. it's really easy to get it wrong, or to get the right results in one area and the wrong results in another.

helleborine

12:36 pm on Jun 28, 2005 (gmt 0)

10+ Year Member



Of course, only a few, if any, parts of this patent application might have been implemented in Bourbon.

I have a sinking feeling that whatever they applied was deceptively simple. We just don't have a handle on it.

Marval

1:18 pm on Jun 28, 2005 (gmt 0)

10+ Year Member



Tom - I can tell you from experience that the splitting of the site (the www vs no-www and any vanity domains) can definitely drop the site no matter how old it is - I started experiencing it back in September - figured it out in November (my site was split into 5 domains and between the www and non-www) - fixed it in December and am finally back in with smaller pieces appearing in February and finally with Bourbon - completely back. The only changes I made were to get the 301s up and delete all content/change relative URLs to static - no other changes whatsoever. I was going to attempt the dupe URL removals, but decided to wait it out and seems the changes I made were the ticket.

Added - this is a site that has been around for years with plenty of backlinks, no underhanded type stuff, and has always ranked extremely well on very competitive words and phrases.
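For those asking what "getting the 301s up" amounts to, here is a minimal sketch of the mapping logic only. The host names are hypothetical, and on a real site this would normally live in the web server's redirect rules rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hosts: the one preferred form, plus the non-www and
# vanity-domain forms that should all collapse onto it.
CANONICAL_HOST = "www.example.com"
ALIAS_HOSTS = {"example.com", "example.net", "www.example.net"}


def canonical_redirect(url):
    """Return (301, target_url) if url is on an alias host, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == CANONICAL_HOST or host not in ALIAS_HOSTS:
        return None
    # Keep scheme, path and query; swap only the host and drop fragments.
    target = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, "")
    )
    return 301, target
```

For example, canonical_redirect("http://example.com/reviews/a.html") gives (301, "http://www.example.com/reviews/a.html"), while a request already on the canonical host returns None and is served as normal. The point of the permanent (301) status is that every page ends up reachable under exactly one host name.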

[edited by: Marval at 1:20 pm (utc) on June 28, 2005]

Tom_Arah

1:19 pm on Jun 28, 2005 (gmt 0)

10+ Year Member



In retrospect I think Bourbon is another issue. There can be no improve-the-SERPs based justification for 99% bans/penalties as they patently and inherently lower search quality. Unless it was a deliberate filter to take out as many spam sites as possible and then some work was done to restore the legitimate false positives - and if so they should have done it on one datacenter and saved us all a whole load of grief.

And I think you're right they would only need to implement a couple of simple features from the patent to have a major effect. As I said I think that discounting the recent scraper link bonanza and valuing pages that weren't created years ago more highly for technology-based keywords would go a long way to explain my general fall and its particular pattern. And if that's what was done and searchers were happier with the SERPs as a result then I'd just learn to accept it.

This is where the lack of any information from Google is criminal. If they are tweaking the algorithm on historical grounds or whatever they should say so. They should be able to justify what they are doing or not do it. The fact that we are all hanging on googleguy's postings for the slightest insight into what the googleplex is thinking is ridiculous from what is now the world's biggest media company. Especially when what they are selling is ultimately our media!

theBear

1:32 pm on Jun 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



While discussing the patent implications of bourbon, one should also consider the site EFV has, the sector it is in, and the fact that several others here are active in the same sector.

This sector is one that could be classified as evergreen in that the content is never really out of date.

His site got hit in March and has since recovered.

Our site also in a sector that should be considered evergreen is on its third trip to the 75% club (actually worse than that). We thought that we had our duplicate content problem cleaned up when we watched a pile of stuff reappear on our ip address for the site we had just cleaned up.

We have a number of questions into Google at the moment, doubt if we will get much more than a form email back. But we did provide plenty of information using their own search engine to point out errors in their system.

Drawing conclusions about SERPs from AdWords is interesting, as is concluding that a site running AdSense is in danger of being depressed by Google.

A more likely play would be that some folks are targeting keywords to produce income and are using known holes in the system to knock out the sites with high SERP placements if possible.

This would explain why sites that don't have adsense also would get hit as well.

We had a Google cache of one of our site pages exposed as did MikeNoLastName and as did Tom and some 67,097 other pages. Would these be duplicate content problems? Who knows.

In short it could be just about anything, including the implementation of parts of the patent to simple errors in the system to folks taking advantage of holes in the system.

Google has stated that the reason for Bourbon was to implement signals of quality.

Error free large software systems are something that have yet to be produced.

Multiple control variables and filters in a feedback based system are a process control nightmare.

So everyone please get your bets down the wheel she be a spinning.

reseller

2:19 pm on Jun 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



theBear

>In short it could be just about anything, including the implementation of parts of the patent to simple errors in the system to folks taking advantage of holes in the system.<

Exactly. And your guess is as good as mine ;-)

>So everyone please get your bets down the wheel she be a spinning.<

She? From the day I heard of that famous The Fat Lady whom took her a month to sing for us a simple song, I don't like SHEs anymore ;-)

johnhh

2:44 pm on Jun 28, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



reseller:"The Fat Lady whom took her a month to sing for us "

I have a funny feeling she may stop singing - assuming she has actually sung!

theBear: "A more likely play would be that some folks are targeting keywords to produce income and are using known holes in the system "

Currently I can spot a number of sites above us that appear to have "won" using techniques that "lost" before. Hence the comment above, or perhaps it is in the part of the patent that has yet to be introduced as Bourbon appears to have allowed these sites through.

Although "Signals of Quality" may be in the eye of the beholder.