I tried to call the number listed in their WHOIS to give them a friendly warning, but of course, it didn't work.
I've still got my fingers crossed that not all the data for this update is folded in, and it'll work out OK.
I've tried contacting YPN directly but can't find a contact form or email address. Since it's still in beta, they haven't thought to put up a contact form for those wanting to report TOS violators.
Either way, Google engineers are too incompetent, and too high from smoking $100 bills, to develop an algo that can tell the simple difference between sites that copy and the originals they copied from.
I'm sure Cutts and the gang will tell you, "no, our duplicate content algo is fine, nothing wrong with it. We easily can tell the difference."
<sarcasm>I'm contemplating copying his blog to see if I can't prove my point that way. One of my 10,000 doorway pages will become #1 for "Matt Cutts Blog" and I'll laugh when his gets slapped with an auto penalty.</sarcasm>
What about subdomains? If mysite.com gets filtered will it affect old or new subdomains?
Friends!
It could be the duplicate filter that is causing your problems, but it could also be something else. Have you ever thought about that?
May I suggest the following:
The four of you exchange your URLs (by stickies) and take a good, deep look at each other's sites. Try to see whether your four sites have something in common that might trigger a red flag. You don't need to post your findings on this thread if you don't wish to do so. We shall understand and respect that.
I thought this duplicate content thing was solved already. My main site was hijacked earlier this year, and now it looks like it is all happening again.
I think I am ready to be done with Google. For me, this means replacing all my AdSense ads with YPN ads or MSN ads when they get a contextual ad setup running, and paying more attention to optimizing for Yahoo and MSN. I think AdSense is the choice of spammers and sitejackers anyway, and Google doesn't seem to mind. In fact, I think that is the crowd they prefer - possibly because they think that is who "produces" for them.
And yes, if you haven't picked up on it yet, I'm extremely frustrated with Google right now!
I have had my site online since 2001. It was all hand-coded in HTML with WordPad, and I have worked on and tweaked it constantly, almost every day, from 2001 to the present.
Anyway, I also ranked #1 for many of the search terms related to my site for a long time, and google was my #1 source of traffic. Around the end of May of this year (2005) my site disappeared from google; I searched and found a few of my search phrases back around page 50 or worse, lol.
Well, here it is, Sep 24, 2005, and I am still in the same spot. Yesterday my search engine traffic was:
83% MSN
10% Yahoo
5% Ask Jeeves
2% Google
When will the site come back to google?
Who knows why google did this to a site designed and worked on by hand for years?
&filter=0 brings my site up to where it was before the "update" also.
&filter=0 works like a charm. My site is completely back when applying this parameter.
can confirm "&filter=0" does bring back my site too
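For anyone who hasn't tried it yet: just append the parameter to an ordinary results URL. A sketch (the query is a placeholder):

http://www.google.com/search?q=%22a+unique+phrase+from+your+page%22&filter=0

If your pages come back with filter=0 but not without it, that points at the duplicate-results filter rather than a penalty on the pages themselves.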
If you like, you can repeat the search with the omitted results included.
Am I wrong?
Select some short snippets of text from your best pages, just a few words each.
Google for those, and see what shows up. Hopefully your pages are in there somewhere.
I'm not talking about keywords here, but unique phrases from the body of your text.
See if somebody else has copied (scraped) your content.
It is entirely possible that other sites were given credit for your work.
If OTOH you have been borrowing text from other sites (and only you can judge that), then you may have gotten 'found out'.
Whenever a site suddenly drops in the SERPs, goes 'supplemental', or vanishes entirely, the first thing I think of is duplicate content. -Larry
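To make that checking less tedious, here is a rough script in the same spirit. To be clear, this is my own sketch, not something Larry posted, and "index.html" is just a placeholder for one of your own pages. It pulls a few random 8-word snippets out of the visible text so you can paste them into Google in quotes:

import random
import re
from html.parser import HTMLParser
from pathlib import Path

class TextExtractor(HTMLParser):
    # Collects visible text, skipping script/style blocks.
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)

def probe_phrases(html_file, n=3, words=8):
    # Pick up to n random snippets of `words` consecutive words each.
    parser = TextExtractor()
    parser.feed(Path(html_file).read_text(errors="ignore"))
    tokens = re.findall(r"[A-Za-z']+", " ".join(parser.chunks))
    phrases = set()
    for _ in range(n * 10):  # bounded attempts in case the page is tiny
        if len(tokens) <= words or len(phrases) >= n:
            break
        start = random.randrange(len(tokens) - words)
        phrases.add('"' + " ".join(tokens[start:start + words]) + '"')
    return phrases

for phrase in probe_phrases("index.html"):  # placeholder file name
    print(phrase)  # search Google for each, quotes included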
Perhaps the "not Google anymore" meme is approaching a tipping point.
My site was just reincluded after a 54-day ban, and I'm really hesitant to bite the hand that has resumed feeding me, but honestly, folks, this company just has too much power.
They must know that they are affecting ordinary people with hopes and dreams who are not aware of breaking the rules.
By now they MUST know that, and to continue with this slash and burn policy, attributing it to "automation", is contrary to their "Don't be evil" motto.
No this is not a conspiracy rant, it is reality as I perceive it.
This one is a little bit more than dupe (at least in my case). Pages added in the last 2 days have been dumped, and all the pages that were dropped had been checked with Copyscape for any duplication... they're written by freelancers and we insist on original content.
I should have more about this on Monday... experimenting with one of the sites that has dropped.
For all we know ... google lost a subset of their data and it's affected a number of sites -- but I doubt it.
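Since the thread keeps coming back to how duplicate detection might work: here is a naive sketch of the kind of overlap check the Copyscape pass mentioned above performs. This is my own illustration, not Copyscape's actual method (which is proprietary), and the file names are placeholders. It compares two texts by the Jaccard overlap of their 5-word shingles:

def shingles(text, k=5):
    # Break text into overlapping k-word "shingles".
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def similarity(a, b, k=5):
    # Jaccard overlap of shingle sets: 1.0 = identical, 0.0 = no overlap.
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

# Placeholder file names - compare your article against a suspected copy.
ours = open("our_article.txt").read()
theirs = open("suspect_copy.txt").read()
print("shingle overlap: %.2f" % similarity(ours, theirs))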
I searched google for lots of my site's content (sections of actual text, etc.) and only found my own site. So I don't believe my site was duplicated; it's a site that would be kinda poor for that anyway, since each page sells one of my own products that I make.
It just seems that Google did something major at the end of May that made me lose my ranking, and it hasn't come back yet as of today (Sep 24).
I would contact Google about this - but it's taken four months just to get a response to a basic question...and that ended up being a cut-and-paste reply. Sigh.
The reason I'm frustrated about all of this is because all of the sites that have scraped my content and now rank higher than me have large adsense blocks prominently displayed at the top of their pages. I've made a good income from adsense, and it is extremely irritating to see how some can take it away without expending any real effort, and by using my content to do it.
After Allegra I used robots.txt and the URL removal console to remove duplicate content. This was in March. After that I continuously had a robots.txt with:
User-agent: *
Disallow: /dup1.php
...
Google states that the content removed by the console will stay removed for six months.
My site came back with Bourbon in May. After that I made a mistake: I added two lines
User-agent: Googlebot
Disallow: /someotherpage.html
These two lines were a time bomb.
As far as I know now, this entry "User-agent: Googlebot" stops Googlebot from reading the lines below "User-agent: *".
Google states: "When deciding which pages to crawl on a particular host, Googlebot will obey the first record in the robots.txt file with a User-agent starting with "Googlebot." If no such entry exists, it will obey the first entry with a User-agent of "*"."
To say it another way: if there is an entry "User-agent: Googlebot", Googlebot will never read the "User-agent: *" section.
And thus my duplicate files (for printing and mailing articles) were no longer excluded from being read by Googlebot.
Now I have copied the complete "User-agent: *" section into the "User-agent: Googlebot" record. And I hope my site will return soon.
I'd encourage anyone to check their robots.txt for the same possible problem. I had to learn it the hard way.
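If you want to verify this behavior yourself, here is a minimal sketch using Python's standard-library robotparser. The file contents mirror the example above; the URLs are placeholders:

from urllib.robotparser import RobotFileParser

# The broken robots.txt: a Googlebot record exists, so Googlebot
# ignores the "User-agent: *" record entirely.
robots_txt = """\
User-agent: *
Disallow: /dup1.php

User-agent: Googlebot
Disallow: /someotherpage.html
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# True: the duplicate page is crawlable by Googlebot again!
print(rp.can_fetch("Googlebot", "http://example.com/dup1.php"))
# False: Googlebot's own record still blocks this page.
print(rp.can_fetch("Googlebot", "http://example.com/someotherpage.html"))
# False: other bots still obey the wildcard record.
print(rp.can_fetch("SomeOtherBot", "http://example.com/dup1.php"))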
We do have tons of duplicate content, since we are a news site and use agencies like Reuters, AP, etc. But we run a huge amount of unique content as well. I don't think this is to do with duplicate content... very odd.
There are cases where established site homepages and subpages are holding their ranking for one phrase but dropping out of the SERPs for another closely related phrase (when the site previously ranked for both)... and where there is no evidence of dup content filters playing a role where pages dropped out.
They've tweaked something else IMO. Possibly related to linking/anchor text/kw patterns.
IMHO, it's related to one of 2 things:
1. Links
2. Duplicate content
For me, I have a few duplicators, but they are of such low quality that it's unbelievable to me that Google can't construct an algo that recognizes who is legit and who is not.
The other might be related to links. However, I build theme-related links very, very slowly. Less than 5 a month.
Although it might be one of those 2, neither one is really a "glaringly obvious" problem.
Whatever it is, it better get rolled back.
soapystar, that looks to be the case with this site of mine... same template throughout the site. It's possible.
GG - why not request examples from webmasters... just mention a code to add to those feedback forms!
We do have thousands of links, since we often break news or media, so sites link to us in the hundreds each week... often in a very short space of time. Plus, as I said, we do have thousands of pages of duplicate stories, but that is the only way you can cover certain world events. And although we run a lot of original content as well, we sometimes license that out too...
I do hope it changes though, or we will be in some trouble. You just don't realise how dependent you are on one company. Guess this is a sit and see.
I also had a look on Alexa (I know, flaky, but it gives a rough idea). I noted that all our peers and similar sites have followed us in a big drop in traffic over the last few days.
Does Google think I am a link spammer because of scrapers?
Um, yeah.
I don't think links, or templates or anything has the slightest to do with this.
As mentioned above, sites seem to manage to hold onto (or at least not drop much for) some searches, while being dropped hundreds of spots for most things (and seldom ever gone completely out of the top 1000). Pages on a domain that have not been copied in any way (like those built a few days ago) also take a mega-drop in rankings, from #1 when not filtered to down hundreds in the regular search.
This is domain related. Specific pages don't have to be copied to be filtered. At the same time, the ridiculously inflated page counts always seem to exist, and it appears (I'd like to hear of any exceptions) that you always have to be over 1000 pages, meaning you can never check what any of these phantom pages are supposed to be.
It seems awfully advanced for Google to recognize that a domain has some high threshold of copying by other domains, and thus gets filtered for almost all searches -- although this could be the same sort of ill-conceived notion as the establishment of the Supplemental index.
In any case, I don't think people should go too far afield with this, or read too much into it in tin hat ways. &filter=0 corrects the problem... in my experience, it *always* corrects it. That one bit of information should tell Google how they massively screwed up, and tell them what they need to do to fix it. If it is an overall domain level of content theft that triggers it, it is doubtful that we as webmasters can do much of anything about it, since by definition the content theft will be widespread. More importantly, in most cases HAVING THE STOLEN CONTENT REMOVED WILL HAVE NO EFFECT, because it is in the supplemental index (in most cases) and deleting supplemental pages does not get them deleted from the supplemental index.
Google Guy(s) and Google Gal(s), you know what you did. Stop doing it. It accomplished nothing positive. The results are virtually unchanged... except you are filtering out many of the most respected (and stolen from) domains in every niche.
My site, scraped and now missing, was registered by me in '98. Hard to believe the 250 sites that are listed instead of me were registered before then. I'll bet not one of them was.
Can it be so difficult to sort this out?
What does google expect us to do, rewrite the site for every update?