Forum Moderators: Robert Charlton & goodroi
I did the rounds to check on the state of various data updates. I'd estimate that the "0.5" (not algorithmic changes, but rather responses to various spam/porn complaints + processing reinclusion requests) should go out this weekend sometime or possibly Monday. There should be a binary push this week to improve a corner-case of CJK-related search, and that new binary should have the hooks to turn on the third set of data. Regarding finishing up the second piece of data, there are still two data centers with older data. Those data centers will probably be switched over by Monday. By Monday, 2.5 of the 3.5 things will probably be on.
Clint
Just wondering what the www was resolving to - can't work out why your link count appears to have moved to the non-www figure for the www. PS. As you have a "Clint Type" site - seen any improvements yet?
Dayo
When all this started (for me May 21st), for the dozens and dozens of searches where I was FIRST; I was GONE, NON-EXISTENT for everything I searched for, even my own company name. All that has changed since then is I'm back on the 1st page #7 for my biz name. Whoopty-do. Yeah, that's ok, but who that searches for [products] is going to search for my biz name? No one. 1st - 6th is still those that link to me (still link exchange pages).
Every one of my former search phrases still show non-relevant sites and mostly from India! (I'm in the USA). If it's not India, it's "Asianet" or something like that. These are still sites with hidden text, link farms, "yellow pages" for specific cities only (which have nothing to do with the search phrases), on and on and on. None of these sites even SELL the products.
In some cases when it's not India/Asia, it's sites that link TO ME that are now showing up INSTEAD OF me (and one of those sites as I mentioned yesterday is 302'ing me). Again, None of these sites even SELL the products. Even searches where I was the ONLY HIT in G, I'm still GONE, and have been replaced again by sites in India (.in sites and .za, .cz, .ru sites, etc).***
For one of the phrases I've been monitoring where I was 1st/1st before the 21st, several days after that for a day or so I got back on the FOURTH page, then totally and completely removed from the G index for that search phrase...along with dozens of others. And I don't mean just put back several hundred places, I mean TOTALLY G-O-N-E from the index, erased.
***I wonder if the fact my website hosts are from India (company is in the USA though) has anything to do with this? All of these sites with other than .com extensions in other countries never even showed in the results prior to the 21st. And again, these sites don't even sell the products, some are exporters, and many of them are scammers, i.e. they rip people off, similar to the Nigerian 419 type sites but not exactly like them. The more I think about this the more it at least appears to me from my standpoint, that G was "hacked" or maybe "compromised" whatever you want to call it, by someone or a group. We all know (I guess?) that this is a G update, but is it REALLY? I'm rather delirious from all this, and again, THIS IS ONLY FROM MY STANDPOINT, but it's ALMOST like G may be trying to "repair" some damage caused by "some group" for which they have been unsuccessful and this "damage" is continuing. Again, before anyone starts bashing me, I'M JUST THINKING OUT-LOUD here. It's like a really bad "B" movie:
"'Invasion of the Non-Relevant Websites-[Subtitle] The Attack on Google', Sneak-Preview Premiered May 21st, World-Wide release scheduled for sometime around June 30th. Synopsis: Everyone's favorite search engine Google is unexplainably infiltrated by one or more malicious codes which suddenly causes once top-ranking relevant websites to dive into cyberspace oblivion--only to be replaced by confusing and alarming results. The damage is speculated to have been introduced by low-ranking questionable websites mainly but not limited to Southeast Asia, but still has Google engineers baffled. Can Google get it fixed before thousands of decent legitimate website owners are ruined? FIND OUT JUNE 30th!"
Since I'm on the verge of losing just about everything, at least I haven't lost my sense of humor. An idea like that is personally easier for me to swallow and accept than Google causing this themselves. I just have a hard time believing they did all this on purpose. It's like "no, not my Google, no, say it ain't so".
Here's my somewhat simplistic explanation for what's happened to my site at least:
If you were ranked in the top 5, maybe top 10, for certain keyword strings - say, 3,4, or more KWs - before Bourbon, you were scraped (faux directory sites taking your title, meta tag description and KWs, and probably some top text) and linked off these "directories."
Because of changes to Google's algo - see my quoted box on previous post - these links were scored as poor quality links and your site was penalized by the algo.
I made a new posting yesterday, "Bourbon and the Recent Google Patent", that has only just been released and so has got rather lost. It is along very similar lines and so it's probably more appropriate here now...
Like many others my site was hit hard in April and disappeared in late May (losing over 150,000 visitors a month). The various Bourbon threads have been very useful but so far mainly for ruling out any individual shared parameter apart from the presence of Adsense (which you’d assume wasn’t in Google’s interest). This seems to suggest that something new is at work.
Surely the most obvious candidate must be Google’s patent application 0050071741 which was passed on March 30th and which represents some very rare hard(ish) facts direct from Google about how it plans to improve its search quality. With 60 interacting parameters based primarily on historical document and link analysis it’s difficult to pull out what the effects would be – especially as these would vary across different sectors with, for example, old content sometimes seen as stale and sometimes definitive - but the most important point is that there’s no way that such new factors could be introduced to the mix without resulting in major winners and losers – just what we’ve seen recently. For my software reviews/articles site I can certainly see how links from years ago could now be heavily discounted (though I’d argue the content is still very useful)
Having said that, while I can understand dropping due to the changes, I’ve definitely been blacklisted/sandboxed which presumably means that Google has decided I - and everyone else who’s been affected so severely - is up to blackhat tricks which is definitely not the case. With the patent’s focus on link analysis that brings me back to the idea of too many recent links triggering a spam filter (perhaps with the use of AdSense being seen as a secondary spam indicator as the monitoring of ads is also mentioned in the patent). It’s certainly a possible explanation for what’s happened to my site as Google reports 3500 links many of which I would guess are from recent scraper sites picking up from directory listings and particularly Google’s own SERPS (Google’s previous patent “based on the interconnectivity of the documents in the set” would presumably explain why Google is so susceptible to scrapers and as my site was popular across a broad range of software-based keywords it was a natural scraper target).
I can see the benefit for google and the majority of searchers in clearing out the spam but, if this is what has happened, it means that honest and useful sites are being penalized simply for being popular on Google. And possibly for signing up for Adsense too!
Of course anti-spam false positives (whether as described above or not) are inevitable, but there needs to be some workable appeals procedure for removing undeserved blacklisting based on manual checking. OK it’s not algorithmic/scalable but with $50 billion in the bank I think something could and should be done for those who’ve lost out – the phrase “Don’t Be Evil” springs to mind
And as this forum seems to be the centre of Bourbon discussion/disgruntlement and I’m sure folk at Google are monitoring it, can we not do something about it? For example is there somewhere we can post our actual website addresses in the hope that they’ll fast-track us back into the SERPs if only to stop us moaning?
From Novice:
They would never publicly admit that webmasters that lived by that creed got penalised for something outside of their power.
From Danny:
There's always going to be some "collateral damage" from anti-spam measures.
And there's probably a bias in the feedback and reporting systems towards stringency on spam at the expense of increased collateral damage - far more people will report spam in the search results than will report a web site they expected to find but didn't. Most searchers don't already know what they're looking for, after all.
I've been "wearing out" the area at G to complain about SERP's SPAM, and it has done no good. I don't even think they are read any longer.
Discusses G stock price bubble and the SE wars.
"In Google's case, its share of online searches is already falling (although, to be fair, it is still more than 50%) and margins seem to be contracting (though, again, from high levels) - and all before the competition reacts properly."
[edited by: Atticus at 3:12 pm (utc) on June 10, 2005]
you have the www.domain.com & domain.com problems, right? Maybe they'll fine tune the algo and choose one. If you think about it, since all pages are 100% identical, it isn't that hard (compared to the other things) for Google to choose one set with most weight and ignore the other.
>> There's always going to be some "collateral damage" from anti-spam measures.
sure, but how much? I think Google has gone way overboard lately. Let's hope that the new tuning will reverse that. I know, and I have repeated this, that at the end Google doesn't owe us anything, but with power comes responsibility (how original of me to use this line :)). If you control well over 50% of the search, you have a sense of responsibility to try to be fair. Personally, I'm encouraged by the two DCs that GG said to look for changes. More work, since I have to go back to updating that site, but work with reward is good in my book.
[edited by: walkman at 3:22 pm (utc) on June 10, 2005]
But what is a "Clint-style" site? I must have missed seeing Clint's site.
Well, yeah, that's the question isn't it? I'm a bit embarrassed to admit that yesterday I went through many of the posts that mentioned "clint" in the Bourbon threads. (Using the printable view helps. ;) ) I came to some nebulous conclusions that made me think that my site might be in the same category. Maybe wishful thinking...
I suppose googleguy may have considered the characteristics of Clint's site (as described by him) and the symptoms that he's having and recognized what the issue was - without knowing the exact site involved.
But a bit risky - say the improvement goes out and Clint's site is not how GG diagnosed (sp?) it.
Talk about kicking the guy in the teeth - when he is down.
>> Clint-type sites
you have the www.domain.com & domain.com problems, right? Maybe they'll fine tune the algo and choose one. If you think about it, since all pages are 100% identical, it isn't that hard (compared to the other things) for Google to choose one set with most weight and ignore the other.
Walkman, I don't know if that's addressed to me or not. I don't know if anyone has ever been able to definitively say that was ever a problem. However I did change mine about a week or so ago for the non-www to 301 to my www.
a week is not enough. Unless Google speeds things up, it will take at least a good month.
I received great news from G support this morning. There is no penalty on my site! yippie.
Sailor, I think you mean there is no manual penalty on your site. It obviously *WAS* penalized if your site got trashed. That proves you are another victim of the seriously flawed "algo changes" or whatever we want to call it. You are "another one of the babies that was tossed out with the bath water". A "peanut that got trashed with the peanut shells", an "ear of corn that went out with the husks", "a good chunk 'o meat that got trimmed along with the fat"....ok, you get my drift.
Exactly and *if* GG means that by Clint type sites in general (and the whole problem is fixed) then on his scale of 3.5 so far then that fix would be another 96.5
I really hope it is a non-www/www fix that we see.
[edited by: Dayo_UK at 3:56 pm (utc) on June 10, 2005]
I've pretty much given up hope.
Danny, I lost 70-75% of my Google referrals on March 23, and they came back in a matter of hours when the Bourbon update hit (except for a few pages that are MIA for unknown reasons).
So don't give up hope--and don't invest too much time on solutions that may not be needed, that may not work, and that conceivably might backfire later on.
Your site has the type of content that Google wants to index, so I'd expect it to recover sooner or later (though possibly not until the next mini-update or full update, if my recent experience is any guide).
I've pretty much given up hope.
I know what you mean. I'm still hovering between #94-#96, and then I see on one DC today I'm at #176 - oh joy.
The site's been 301'd for over a year, so non-www/www is not it. I've looked at 302's to my site, compare and contrast, but none seem to be the culprit. It must be all the lovely scraper directories and their "discounted" links - and there isn't a darn thing I can do about them.
EFV, you may be right - we may just have to wait for the next mini-update or update - who knows when that will be.
On the other hand diversification is working for me, offline.
LisaB
EFV - I just wonder if what happened to your site was separate from the update - it just happened at the same time.
I'm sure it was because of the update, since the SERPS were completely reshuffled for many of the keyphrases that I track. The reshuffling happened quickly in late March (when template-based "broad but shallow" review sites seemed to be favored) and again in late May (when those template-based sites slipped in the SERPs that I watch).
subdomain,mydomain,com
These can be indexed, active links and I think they have the potential to contain an identical copy of your website, which might trigger the dupe filter, sometimes not in favor of the legitimate site.
It's hard to be sure without seeing the particulars.
Hope the 'Plex knows what it's in for :)
I can see the benefit for google and the majority of searchers in clearing out the spam but, if this is what has happened, it means that honest and useful sites are being penalized simply for being popular on Google. And possibly for signing up for Adsense too!
Of course anti-spam false positives (whether as described above or not) are inevitable, but there needs to be some workable appeals procedure for removing undeserved blacklisting based on manual checking. OK it’s not algorithmic/scalable but with $50 billion in the bank I think something could and should be done for those who’ve lost out – the phrase “Don’t Be Evil” springs to mind
I'm starting to worry that with the "scrapers outranking original" problem there is nothing they currently can do because it really is all automated. "no penalty" as used by support does not include the type of filtering sites like Danny's and ours have experienced which effectively kills all Google referrals but leaves most pages in the index.
it is...but humans still control the automation (I hope so anyway ;).
They can loosen and tighten the filters as they please. Manual intervention might happen for hugely popular sites, but for an average site, I wouldn't hold my breath.
GG - this update for "Clint-Type" sites - are we on course for tonight - or should I call it a day here in the UK.
Hope factor at about 6% - Thoughts that I have a canonical url problem up to about 25% :) - alcohol level increasing ('tis Friday)
To be honest with you I did not read all posts here but I'd have some comments:
- did all of you check your allinanchor results? Competition grows and being on top of SE may not be forever.
- a business plan relying only on search engines seems weak strategy for me
- If the "danny" here is the one i'm thinking, may be selling text links to totally unrelated websites is not the best (talking about internet.commerce on left navigation of your site) + widely advertising textlink based ads on a search engine marketing authority site would not please GG
- as for scrapers this is really a problem on all SE, still Google is better than MSN and i'm not even talking about yahoo's (ridiculous when no human review)
Just my 2 cents...
how about for your business' name, which many folks are reporting they've lost ranking on. Or your home page's unique title? Shouldn't that be forever?
< a business plan relying only on search engines seems weak strategy for me
your point is?
< may be selling text links to totally unrelated websites
i think you have the wrong site. i also think it's lame to post those sorts of things here.
< Google is better than MSN
ok
That was me (it is .com, not ,com). In this case clicking on it takes you to a "page cannot be found". The subdomain was one which was used by an advertiser with us about 3-5 years ago and the subdomain has since been absorbed and redirected to another domain owned by the people who bought out the old company. As far as I know, there was never any JavaScript on the pages of that site and we never linked to the new site and vice versa. I also got an e-mail this morning from G-support finally that they were escalating the issue to engineering.
I find the fact that G is also indexing links to G-cached pages (as I mentioned previously) even scarier since by definition these would contain an exact copy of our page (plus a little extra) with a different URL, presumably on an "authoritative" IP, which could lead to duplication penalties with THEM being the primary source.
edit: Oh, also forgot to mention. I noticed some interesting changes on our local datacenter (64.233.187.104). We previously had some pages indexed as non-www. and overnight they all changed to www. Could be I was looking at a different DC last night, but this one definitely has them all www'd. We added our non-www to www redirect code over a month ago.
[edited by: MikeNoLastName at 8:09 pm (utc) on June 10, 2005]
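For anyone unsure what the "non-www to www redirect" mentioned in several posts above amounts to, here is a minimal sketch of the decision logic in Python. It is only an illustration, not anyone's actual server config: the host name `www.example.com` and the function name `canonical_redirect` are made up for the example, and a real site would normally do this in the web server's own rewrite rules.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical canonical host, used for illustration only.
CANONICAL_HOST = "www.example.com"


def canonical_redirect(url, canonical_host=CANONICAL_HOST):
    """Return (301, target_url) when url is on the bare (non-www) host.

    Returns None when no redirect is needed, including for unrelated
    hosts or other subdomains. Any explicit port in the URL is dropped,
    which is a simplification made for this sketch.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Derive the bare host by stripping a leading "www." from the canonical one.
    if canonical_host.startswith("www."):
        bare_host = canonical_host[4:]
    else:
        bare_host = canonical_host

    if host == bare_host and host != canonical_host:
        target = urlunsplit(
            (parts.scheme, canonical_host, parts.path or "/",
             parts.query, parts.fragment)
        )
        # 301 = permanent redirect, the status code discussed in the thread.
        return 301, target
    return None


if __name__ == "__main__":
    print(canonical_redirect("http://example.com/page?x=1"))
    print(canonical_redirect("http://www.example.com/page"))
```

The point of the permanent (301) status is that both host names end up pointing at a single set of URLs, so there is only one copy of each page to be indexed. How quickly a search engine picks the change up is a separate question, as the posts above note.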