JuniorOptimizer, thanks for the suggestion - I'll contact Google if it's not sorted out over the next couple of days - I'll probably just go for the [google.com...] option.
I tried to call the number listed in their WHOIS to give them a friendly warning, but of course, it didn't work.
I've still got my fingers crossed that not all the data for this update is folded in, and it'll work out OK.
In my case, as stated earlier, it strongly appears to be copycat publishers (I found 2 major violators) using my content to cloak and redirect visitors to doorway pages. These doorway pages have no editorial value other than to showcase the Yahoo Publisher Network ads at the top.
I've tried contacting YPN directly but can't find a contact form or email address. Since it is still in beta, they haven't thought to put up a form for those wanting to report TOS violators.
Either way, Google Engineers are too incompetent and too high from smoking $100 bills to develop an algo that can tell the simple difference between those that copy websites and the original.
I'm sure Cutts and the gang will tell you, "no, our duplicate content algo is fine, nothing wrong with it. We easily can tell the difference."
<sarcasm> I'm contemplating copying his blog and seeing if I can't prove my point that way. One of my 10,000 doorway pages will become #1 for "Matt Cutts Blog" and I'll laugh when his gets slapped with an auto penalty.</sarcasm>
How does this near duplicate filter penalty work? My pages have unique content and they're not near duplicates. Some of my competitors have 100K pages that are 95% similar and they're fine. So what is it exactly that the filter looks for?
What about subdomains? If mysite.com gets filtered will it affect old or new subdomains?
It could be the duplicate filter that is causing your problems, but it could also be something else. Have you ever thought about that?
May I suggest the following:
The four of you exchange your URLs (by stickies) and take a good, deep look at each other's sites. Try to see whether the four sites have something in common which might trigger a red flag. You don't need to post your findings on this thread if you don't wish to do so. We shall understand and respect that.
For me it just seems that there are a lot of copies of my content around the web, and the original has been flagged as the duplicate. I think I might start sending emails on Monday, then C&Ds, and go through the DMCA process - a bit of a pain, but it's costing me $$$/day, so it's worth it - and in the future I'll keep an eye on who's copying me.
&filter=0 brings my site up to where it was before the "update" also.
I thought this duplicate content thing was solved already. My main site was hijacked earlier this year, and now it looks like it is all happening again.
I think I am ready to be done with Google. For me, this means replacing all my AdSense ads with YPN ads or MSN ads when they get a contextual ad setup running, and paying more attention to optimizing for Yahoo and MSN. I think AdSense is the choice of spammers and sitejackers anyway, and Google doesn't seem to mind. In fact, I think that is the crowd they prefer - possibly because they think that is who "produces" for them.
And yes, if you haven't picked up on it yet, I'm extremely frustrated with Google right now!
I am brand new here, and am glad I just found this forum, so I want to first say 'hello' to everyone.
I have had my site online since 2001. It was all hand-coded HTML done in WordPad, and I have worked on and tweaked it constantly, almost every day, from 2001 to the present.
Anyway, I also ranked #1 for many of the search terms related to my site for a long time, and google was my #1 source of traffic. Around the end of May of this year (2005) my site disappeared from google; I searched and found a few of my search phrases back around page 50 or worse lol.
Well, here it is Sep 24, 2005, and I am still in the same spot. Yesterday my search engine traffic was:
5% ask jeeves
When will the site come back to google?
Who knows why google did this to a site designed and worked on by hand for years?
Bluegill catcher--Google does it to drive your little business to advertise with them! They are doing this to all the high-ranking old sites--thinking that our sites make big bucks, even though the majority of us started as small mom-and-pop operations ahead of everyone else--that's our only fault--that we were visionaries who were here first--so we get dropped off of Google. JUST THE WAY THAT GOOGLE KEEPS IT ALL A SECRET IS THE PROOF THAT THEY DROP HUNDREDS OF THOUSANDS OF SITES TO DRIVE THEM TO ADVERTISE--IT ALL STARTED WHEN GOOGLE WENT PUBLIC!--DUH..INCREASE PROFITS THROUGH ADVERTISING!
They don't care if some (most) of us are just regular tiny businesses trying to support families and kids--even evil ebay--when they knock you out they tell you why--google doesn't even bother to do that. CREEPS!
I keep seeing many posters saying that a search with &filter=0 will show the pre-update results.
|&filter=0 brings my site up to where it was before the "update" also. |
|&filter=0 works like a charm. My site is completely back when applying this parameter. |
|can confirm "&filter=0" does bring back my site too |
I don't think it is true. When you do a normal search (without &filter=0), Google will show only the most relevant results and omit some entries that are very similar, if any, to those already displayed. And
|If you like, you can repeat the search with the omitted results included. |
If you click on "repeat the search ...", you get a search with &filter=0. So a &filter=0 search has nothing to do with before or after the update, if there is an update.
Am I wrong?
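For anyone comparing the two result sets, the parameter is simply appended to the normal search URL's query string. A minimal sketch of how the two URLs differ (the helper name and example query are illustrative, not anything from Google's documentation):

```python
from urllib.parse import urlencode

def google_search_url(query, unfiltered=False):
    """Build a Google web-search URL.

    Adding filter=0 disables the duplicate/omitted-results
    filtering discussed in this thread.
    """
    params = {"q": query}
    if unfiltered:
        params["filter"] = "0"
    return "http://www.google.com/search?" + urlencode(params)

print(google_search_url("blue widgets"))
# -> http://www.google.com/search?q=blue+widgets
print(google_search_url("blue widgets", unfiltered=True))
# -> http://www.google.com/search?q=blue+widgets&filter=0
```

Comparing the two result pages side by side is the quickest way to see whether a site is being suppressed by the filter or is genuinely gone from the index.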
Bluegill, try this:
Select some short snippets of text from your best pages, just a few words each.
Google for those, and see what shows up. Hopefully your pages are in there somewhere.
I'm not talking about keywords here, but unique phrases from the body of your text.
See if somebody else has copied (scraped) your content.
It is entirely possible that other sites were given credit for your work.
If OTOH you have been borrowing text from other sites, and only you can judge that,
then you may have gotten 'found out'.
Whenever a site suddenly drops in the SERPS, goes 'supplemental', or vanishes entirely,
the first thing I think of is duplicate content. -Larry
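Larry's phrase check above is easy to script: pick a distinctive snippet of body text, wrap it in quotes, and search for the exact phrase. A minimal sketch that just builds the exact-match query URL (the helper name and the sample snippet are illustrative):

```python
from urllib.parse import quote_plus

def phrase_check_url(snippet):
    """Exact-phrase Google query for a distinctive snippet of body text.

    Quoting the snippet forces an exact-phrase match, so any other
    domain in the results has likely copied (scraped) that text.
    """
    return "http://www.google.com/search?q=" + quote_plus('"%s"' % snippet)

print(phrase_check_url("hand-poured bluegill jigs tempered twice"))
```

Run a handful of these for your best pages; if unfamiliar domains show up above yours for your own sentences, that is a strong hint the duplicate filter has credited the wrong site.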
|Perhaps the "not Google anymore" meme is approaching a tipping point. |
My site was just reincluded after a 54-day ban, and I'm really hesitant to bite the hand that has resumed feeding me, but honestly, folks, this company just has too much power.
They must know that they are affecting ordinary people with hopes and dreams who are not aware of breaking the rules.
By now they MUST know that, and to continue with this slash-and-burn policy, attributing it to "automation", is contrary to their "Don't be evil" policy.
No this is not a conspiracy rant, it is reality as I perceive it.
This one is a little bit more than dupe (at least in my case). Pages added in the last 2 days have been dumped, and all the pages that have been dropped have been checked with Copyscape for any dupe... they're written by freelancers, and we insist on original content.
I should have more about this on Monday... experimenting with one of the sites that has dropped.
For all we know... google lost a subset of their data and it's affected a number of sites -- but I doubt it.
Sites not scraped but still dropped from google the end of May and still gone as of Sep 24.
I searched google for lots of my site's content (sections of actual text, etc.) and only found my own site. So I don't believe my site was duplicated; it's a site that would be a kind of poor target for that anyway, since each page sells one of my own products, which I make myself.
It just seems that Google did something major at the end of May that made me lose my ranking, and it hasn't come back as of today (Sep 24).
The filter=0 does bring the site back on page two (normally number 1 for that particular keyword) and after five years on the Internet, Google now thinks it's taken content from sites that just popped up a few months ago.
I would contact Google about this - but it's taken four months just to get a response to a basic question...and that ended up being a cut-and-paste reply. Sigh.
[edited by: nutsandbolts at 4:23 pm (utc) on Sep. 24, 2005]
I think I want to back off slightly from my previous statement:
|I think I am ready to be done with Google. For me, this means replacing all my AdSense ads with YPN ads or MSN ads when they get a contextual ad setup running, and paying more attention to optimizing for Yahoo and MSN. I think AdSense is the choice of spammers and sitejackers anyway, and Google doesn't seem to mind. In fact, I think that is the crowd they prefer - possibly because they think that is who "produces" for them. |
The reason I'm frustrated about all of this is because all of the sites that have scraped my content and now rank higher than me have large adsense blocks prominently displayed at the top of their pages. I've made a good income from adsense, and it is extremely irritating to see how some can take it away without expending any real effort, and by using my content to do it.
Yahoo dump also....... Has anyone else noticed a HUGE drop in Yahoo traffic as of yesterday? I am now getting 85% of my traffic just from MSN; Yahoo has plummeted, and I was told by someone else, just now, that Yahoo started a major update yesterday! May be as bad as Google's.
In my case, I think I have found the source of my problem.
After Allegra I used robots.txt and the URL removal console to remove duplicate content. This was in March. After that I continuously had a robots.txt with
Google states that the content removed by the console will stay removed for six months.
My site came back with Bourbon in May. After that I made a mistake. I added two lines
These two lines were a time bomb.
As far as I know now, an entry "User-agent: Googlebot" stops Googlebot from reading the lines below "User-agent: *".
Google states: "When deciding which pages to crawl on a particular host, Googlebot will obey the first record in the robots.txt file with a User-agent starting with "Googlebot." If no such entry exists, it will obey the first entry with a User-agent of "*"."
To say it another way: if there is an entry "User-agent: Googlebot", Googlebot will never read the "User-agent: *" section.
And thus my duplicate files (for printing and mailing articles) were no longer excluded from being read by Googlebot.
Now I have copied the complete "User-agent: *" section into the "User-agent: Googlebot" section, and I hope my site will return soon.
I encourage everyone to check their robots.txt for this same possible problem. I had to learn it the hard way.
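The precedence rule quoted above can be demonstrated with Python's standard-library robots.txt parser, which applies the same first-matching-record logic. This is a minimal sketch; the file contents and URLs are illustrative, not the poster's actual robots.txt:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: a Googlebot-specific record that allows
# everything, followed by a "*" record that blocks /print/ duplicates.
robots_txt = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /print/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Googlebot matches its own record and never consults the "*" record,
# so the duplicate /print/ pages become crawlable for Googlebot:
print(rp.can_fetch("Googlebot", "http://www.example.com/print/article.html"))
# Other crawlers fall through to the "*" record and stay blocked:
print(rp.can_fetch("OtherBot", "http://www.example.com/print/article.html"))
```

The fix described above follows directly: any Disallow rules you want Googlebot to honor must be repeated inside the "User-agent: Googlebot" record itself.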
I notice the same things regards the filter. But not for duplicate content, it just shows our newer sub domains. Whereas the standard search shows our results as they were 6 months ago with the main sub domains we had then showing.
We do have tons of duplicate content, since we are a news site, so use agencies like Reuters, AP etc. But we run a huge amount of unique content as well. I don't think this is to do with duplicate content...very odd.
I have never used any robots.txt files or robots tags in any of my HTML from 2001 to the present.
I just use standard meta tags for description, keywords, and title.
I agree with Shri. The dup issue that steveb outlined so well explains some of what I see, but not all of it.
There are cases where established site homepages and subpages are holding their ranking for one phrase, but dropping out of the SERP's for another closely related phrase (when the site previously ranked for both) ... and where there is no evidence of dup content filters playing a role where pages dropped out.
They've tweaked something else IMO. Possibly related to linking/anchor text/kw patterns.
almost looks like a template filter....
not the content that's filtered but repeated templates/navigation..... almost as though it's NOT bothered about the content that goes along with it...
Strange, my main travel site has been hijacked by a religious site. The religious site is an old site from 1998, no PageRank, and "powered by Jesus Christ" according to their logo. :) They copied the whole Bible and my site. No AdSense or anything, so somehow I think they copied the whole DMOZ for no reason. I can't find my SiteMeter on the pages either, but the rest of my site is under their URL, including my affiliate links and logos. I have been getting 20% of my usual traffic since 22 September. I wonder if some of my sales are from them as well.
I am not religious myself, so I can laugh about it when a thief writes that his site is powered by Jesus Christ. But now I want to be back in the SERPs. I looked up their address in WHOIS and mailed them. Maybe they are just ignorant.
I also confirm that adding &filter=0 brings back my website to its original ranking.
IMHO, it's related to one of 2 things:
2. Duplicate content
For me, I have a few duplicators, but they are of such low quality that it's unbelievable to me that Google can't construct an algo that recognizes who is legit and who is not.
The other might be related to links. However, I build theme-related links very, very slowly. Fewer than 5 a month.
Although it might be one of those 2, neither one is really a "glaringly obvious" problem.
Whatever it is, it better get rolled back.
[edited by: Freedom at 6:49 pm (utc) on Sep. 24, 2005]
Is it possible those duplicators use old sites to do their work? My duplicator has a very old site (1998) and no PR on the pages with dupe content, and still it pushes my PR5 site out of the SERPs with my own content. &filter=0 brings back my site. Even when I try a whole phrase from my homepage, only with &filter=0 can I be found; otherwise the duplicator shows up. I have the suspicion it has something to do with the age of the site, not the age of the page.
> almost looks like a template filter....
soapystar, that looks to be the case with this site of mine... Same template throughout the site. It's possible.
GG - why not request examples from webmasters.. just mention a code to add to those feedback forms!
[edited by: nutsandbolts at 6:52 pm (utc) on Sep. 24, 2005]
Well, all I know is traffic is down from a few hundred thousand uniques a day to 20-40 or so... who are probably the ones who visit every day.
We do have thousands of links, since we often break news or media so sites link to us in the hundreds each week...often in a very short space of time. Plus as I said we do have thousands of pages of duplicate stories, but that is the only way you can cover certain world events. Plus although we run a lot of original content as well we sometimes license that out too...
I do hope it changes, though, or we will be in some trouble. You just don't realise how dependent you are on one company. Guess this is a wait and see.
I also had a look on Alexa (I know it's flaky, but it gives a rough idea). I noted all our peers and similar sites have followed us in a big drop in traffic the last few days.
[edited by: FattyB at 6:54 pm (utc) on Sep. 24, 2005]
I always thought a duplicate content penalty would apply to a certain page, and not the entire site, which doesn't explain a site-wide drop in rankings. However, "link spam" could explain a site-wide drop in rankings better than a duplicate content penalty does.
Does Google think I am a Link spammer from scrapers?
> Does Google think I am a Link spammer from scrapers?
Maybe when the domain name of the scraper was registered a long time ago? It can't be the quality of the scrapers.
"Am I wrong?"
I don't think links, or templates or anything has the slightest to do with this.
As mentioned above, sites seem to manage to hold onto (or at least not drop much for) some searches, while being dropped hundreds of spots for most things (and seldom ever gone completely out of the top 1000). Also, pages on a domain that have not been copied in any way (like those built a few days ago) also have a mega-drop in rankings, from #1 when not filtered to down hundreds in the regular search.
This is domain related. Specific pages don't have to be copied to be filtered. At the same time, the ridiculously inflated page counts seem to always exist, and it appears (I'd like to hear of any exceptions) that you always have to be over 1000 pages, meaning you can never check to see what any of these phantom pages are supposed to be.
It seems awfully advanced for Google to recognize that a domain has some high threshold of copying by other domains, and thus gets filtered for almost all searches -- although this could be the same sort of ill-conceived notion as the establishment of the Supplemental index.
In any case, I don't think people should go too far afield with this, or read too much into it in tin-hat ways. &filter=0 corrects the problem... in my experience, it *always* corrects it. That one bit of information should tell Google how they massively screwed up, and tell them what they need to do to fix it. If it is an overall domain level of content theft that triggers it, it is doubtful that we as webmasters can do much of anything about it, since by definition the content theft will be widespread, and more importantly, in most cases HAVING THE STOLEN CONTENT REMOVED WILL HAVE NO EFFECT, because it is in the supplemental index (in most cases) and deleting supplemental pages does not get them deleted from the supplemental index.
Google Guy(s) and Google Gal(s), you know what you did. Stop doing it. It accomplished nothing positive. The results are virtually unchanged... except you are filtering out many of the most respected (and stolen from) domains in every niche.
Date of registry has nothing to do with it IMO.
My site, scraped and now missing, was registered by me in '98. Hard to believe the 250 sites that are listed instead of me were registered before then. I'll bet not one of them was.
Can it be so difficult to sort this out?
What does google expect us to do - rewrite the site for every update?
So, when these types of things happen, does Google "always" correct them/roll them back later?