I checked some of my competition listed in the top 10 and did find one site that scraped some text from me.
What percentage is considered duplicate content? What percentage would trip this filter and why would my site be the one considered the duplicate content?
After investigating I found a website that has copied my content almost word for word, including my home page title, links and page filenames.
This really sucks! I have changed the title of my homepage in hopes that fresh bot will soon see the difference and put me back in the serps.
How can this happen? This filter needs to be engineered a little more carefully.
It is now easy for a dishonest person to destroy their competition by creating a site with an exact duplicate of a competitor's content for the sole purpose of gaining the attention of this filter.
GoogleGuy ... if you're out there, please shed some light.
Thanks, both of you. My site still isn't on it. For some reason my front page has been removed from that particular keyword, but my side pages still hold a rank of 280 for the keyword.
I just don't understand. For the past 3 months both my sites have ranked 3 and 5. I have built up enough links to outweigh a few of the sites in the top ten.
What a nice update for me
If it is your homepage that is being filtered, then do a search for your keyword and follow it with your complete homepage title in quotes.
sample query: widgets "my homepage title"
This will show a list of other sites where you can easily identify a content stealing spammer.
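For anyone who wants to run that check across several pages, here is a minimal Python sketch of building the query. The function name `build_title_query` is made up for illustration; it only assembles the keyword plus the quoted title and shows the URL-encoded form, it does not contact any search engine.

```python
from urllib.parse import quote_plus


def build_title_query(keyword: str, homepage_title: str) -> str:
    """Return the keyword followed by the exact homepage title in quotes."""
    # Drop any double quotes inside the title so the phrase stays one quoted unit.
    cleaned = homepage_title.replace('"', "").strip()
    return f'{keyword.strip()} "{cleaned}"'


query = build_title_query("widgets", "my homepage title")
print(query)              # widgets "my homepage title"
print(quote_plus(query))  # widgets+%22my+homepage+title%22
```

Paste the first form into the search box, or use the encoded form as the query value if you are building the search URL yourself.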
Google: This filter is too harsh and is punishing innocent people.
But sadly the duplicate content is on my own site!?!?
Confused? Our homepage (www.indexpage.com) disappeared recently, but I wasn't worried because it was also there as a separate result as indexpage.com, although the short url was an older version.
Now I'm wondering if that was considered duplicate content, therefore the (possible) penalty? Seems unlikely, no?
I too would like to know what to do, if in fact I am being penalized here...
[edited by: mipapage at 10:14 pm (utc) on June 16, 2003]
You can also send a C&D (cease and desist) letter to the infringer to get the content removed from the site. There are many sample C&Ds on the internet you can use to write one. They contain all the scary legal lingo about monetary damages etc. If you send it via email (you can usually get the email from the site and/or whois info), put a deadline of 48-72 hours by which the infringing content must be removed. If you decide to snail mail a copy, be sure to send it registered so that you have proof of receipt if you need to take further action.
You can also contact the infringer's host. Some will shut down sites that have infringing copy on them, as hosts usually don't want to find themselves on the receiving end of a lawsuit for something one of their hosting clients has done.
Here is another recent thread that discusses it, and what the .htaccess modification should be.
[webmasterworld.com...]
I was doing some research work for a prospective client, and found a situation very similar to the one you describe, where the copy on the client's page was subverted and used to take users elsewhere. This was on www2 just as Dominic was being baked.
I just checked again, and the offending site is gone.
Another indexed version is mysite.org, no www. It's a new link from an associate... I emailed him to change it, though www.mysite is #1.
By the way, I presume that for the duplicate content filter to be activated the extent of the copying must be quite high. I mean, if somebody copied one of your paragraphs verbatim, I would assume that would not be enough for the filter to kick in.
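Nobody in this thread knows how the filter actually measures copying or what threshold it uses, so the following is not Google's method. It is just a minimal Python sketch of one common way to put a number on "extent of copying": compare overlapping word sequences (shingles) from two pages and take the Jaccard ratio. The function names and the shingle size of 5 are assumptions chosen for illustration.

```python
def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of length `size` in text."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_ratio(page_a: str, page_b: str, size: int = 5) -> float:
    """Jaccard similarity of the two pages' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(page_a, size), shingles(page_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


mine = "one two three four five six"
theirs = "one two three four five seven"
print(overlap_ratio(mine, mine))    # identical pages score 1.0
print(overlap_ratio(mine, theirs))  # one shared shingle out of three distinct ones
```

On a measure like this, a single copied paragraph inside a long page gives a low ratio, while a word-for-word copy of the whole page scores close to 1.0, which fits the guess above that the copying has to be extensive.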
Please reply if your site(s) appear with the &filter=0 on and you cannot find any other sites with your content on them.
That code didn't work for me. I kept getting a 500 error until I rewrote it to look like this:
RewriteEngine on
RewriteCond %{HTTP_HOST} !^example.com/*$
RewriteRule ^.*$ [example.com%{REQUEST_URI}...] [R=301,L]
I have Apache 1.3.27, running on Linux.
My main page has fallen in the serps big time (since May). The ONLY thing that I can attribute it to is that my index page shows up 4 different times, each version with its own PR and own rankings, even one listing where it has:
Title of Site: Main Theme
Description....
www.domain.com/
Title of Site: Main Theme
Description......
www.domainname.com/index.shtml
I also see the same pages listed in the serps two more times without the www. The page used to be top 5 for major KWs but is now either gone or buried (any one of the four versions of it). In its place is an internal page: some search results that should show the main theme list a page like "Warranty Information" 20-30 spots below where the main page was, and some results show the main page as an indented listing below some inconsequential interior page.
I see so many others reporting the same type of thing since Dominic, and it makes me wonder why Google could tell before that this was the same page and why it's now deciding to index all possible ways to reach a page file.
To me the problem is clear: quadruple listings of the same page in the index. I know they can solve it, because it was not a widespread problem before Dominic.
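To make the "four ways to reach one page" point concrete, here is a minimal Python sketch that collapses such variants to a single address. It is only an illustration of the idea, not how Google handles it: the function name `canonical_url`, the list of index filenames, and the `example.com` hosts are all assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default document names; adjust for the server in question.
INDEX_FILES = {"index.html", "index.htm", "index.shtml", "index.php"}


def strip_www(host: str) -> str:
    """Lowercase the host and drop a leading 'www.' if present."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def canonical_url(url: str, preferred_host: str) -> str:
    """Map www/non-www and index-file variants of a URL to one form."""
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.netloc.lower()
    # Same site with or without www: use the preferred spelling of the host.
    if strip_www(host) == strip_www(preferred_host):
        host = preferred_host.lower()
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    # A trailing index file is the same page as its directory.
    if tail.lower() in INDEX_FILES:
        path = head + "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


variants = [
    "www.example.com/",
    "www.example.com/index.shtml",
    "example.com/",
    "example.com/index.shtml",
]
for v in variants:
    print(v, "->", canonical_url(v, "www.example.com"))
```

All four variants come out as the same address, which is the same effect a 301 redirect to one hostname is meant to have on the live site.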
I'm going to sit tight and not change a thing until I see how they sort this out.... jeeesh I hope they sort this out. I am happy for everyone who is doing business as usual; I'm doing my best to survive while laying off as few people as possible.
mipapage, you are welcome!