
Google News Archive Forum

This 114 message thread spans 4 pages.
duplicate content?

 5:30 pm on Jun 16, 2003 (gmt 0)

My site is missing for some of my terms in this update.
So I tried adding "&filter=0" to the end of my search strings and, lo and behold, my site is listed #1 for my targeted terms.

I checked some of my competition listed in the top 10 and did find one site that scraped some text from me.
What percentage is considered duplicate content? What percentage would trip this filter and why would my site be the one considered the duplicate content?



 7:00 pm on Jun 16, 2003 (gmt 0)

I have a similar problem. When I search without the filter, my site does not appear on the top page at all; with &filter=0 added, my page appears on the top page.
Can anybody shed any light on this?


 7:02 pm on Jun 16, 2003 (gmt 0)

Check this out:


 7:15 pm on Jun 16, 2003 (gmt 0)

Yeah, I have seen this post. I started a new thread to try and get some answers to more specifics, like how much content is considered duplicate and why is my site the one that is considered duplicate.

Any ideas or input is appreciated.


 7:36 pm on Jun 16, 2003 (gmt 0)

When I use &filter=0, I see one site that figures 3 times on the first page:

And once on Page 2

And I see it again on Page 4 for a dummy page that is a redirect to www.domain1.com. The index page is 100% flash.

These are not indented results.


 7:37 pm on Jun 16, 2003 (gmt 0)

I have fallen victim to this same filter!

After investigating I found a website that has copied my content almost word for word, including my home page title, links and page filenames.

This really sucks! I have changed the title of my homepage in hopes that fresh bot will soon see the difference and put me back in the serps.

How can this happen? This filter needs to be engineered a little more carefully.

It is now easy for a dishonest person to destroy their competition by creating a site that exactly duplicates a competitor's content for the sole purpose of gaining the attention of this filter.

GoogleGuy ... if you're out there, please shed some light.


 7:41 pm on Jun 16, 2003 (gmt 0)

For others who suspect this might have happened to them, try this to find the culprit who copied your content.

Do this search on www2:

keyword "your homepage title here"

This is how I found the site that copied my content.


 8:08 pm on Jun 16, 2003 (gmt 0)

Fwiw - I found my missing index page by using &filter=0. It showed up indented to another listing (internal page) that turns up for the same SERP... In fact four other of our pages show up this way!

The result used to show the index page with the aforementioned internal page indented.


 8:13 pm on Jun 16, 2003 (gmt 0)

Can someone tell me how to use this &filter=0 thing? How do you put it into Google?


 8:15 pm on Jun 16, 2003 (gmt 0)

Do a normal Google search. Once you display the serps, go to the address bar, and at the end of the URL, type '&filter=0'. Then click GO (or hit ENTER) to display the SERPS for the same query without the duplicate content filter being applied.


 8:16 pm on Jun 16, 2003 (gmt 0)

Do your normal search using your keyword(s) then append the &filter=0 to the end of the URL in the address pane and hit return.
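For anyone who would rather script this than edit the address bar by hand, here is a minimal sketch of the same step in Python. The function name `add_filter_param` and the sample search URL are made up for illustration; the only thing taken from the thread is the `filter=0` parameter itself.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_filter_param(search_url: str) -> str:
    """Return search_url with filter=0 set in its query string."""
    parts = urlsplit(search_url)
    # Keep the existing parameters, dropping any earlier 'filter' value
    # so the parameter is not repeated.
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "filter"]
    params.append(("filter", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical example URL, used only to show the shape of the result.
print(add_filter_param("http://www.google.com/search?q=cheap+widgets"))
```

Comparing the results for the original URL and the rewritten one side by side shows which listings only appear once the filter is switched off.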


 8:19 pm on Jun 16, 2003 (gmt 0)


Thanks, both of you. My site still isn't on it. For some reason my front page has been removed for that particular keyword, but my side pages still hold a rank of 280 for the keyword.

I just don't understand. For the past 3 months both my sites have ranked 3 and 5. I have built up enough links to outweigh a few of the sites in the top ten.

What a nice update for me


 8:52 pm on Jun 16, 2003 (gmt 0)

OK, what does it mean if your site does NOT appear when you search for "cheap widgets" and then DOES appear when you add &filter=0 at the end? And the same site appears fine for "cheap widgets online" without the filter? ... Anyone?



 8:55 pm on Jun 16, 2003 (gmt 0)

For one term my page seems to have fallen victim to this filter, but I can't find any duplicate content anywhere. Any ideas?


 9:04 pm on Jun 16, 2003 (gmt 0)

Same here. I've just sent you an email with a few questions about any possible similarities between our sites, as we seem to have the same problems.


 9:07 pm on Jun 16, 2003 (gmt 0)

My freaking problem is some other guy in the industry copied my home page text word for word, not once, but TWICE onto two different domains and I am the one who got penalized? Spammers, you want your competitors banned while they sit with their hands under their rears? Here ya go. Sigh.


 9:23 pm on Jun 16, 2003 (gmt 0)

I'm not too keen on these dupe filters either. I've had a couple of sites steal some of our content, yet we are the ones who get penalised. How on earth can Google possibly tell which is the original?

Hmmmmnnn :¦


 9:27 pm on Jun 16, 2003 (gmt 0)

How did you find this out?


 9:42 pm on Jun 16, 2003 (gmt 0)

This is how I found the spammer that copied my site and got me filtered out of Google:

If it is your homepage that is being filtered, then do a search for your keyword and follow it with your complete homepage title in quotes.

sample query: widgets "my homepage title"

This will show a list of other sites where you can easily identify a content stealing spammer.

Google: This filter is too harsh and is punishing innocent people.
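If you have several keywords or pages to check, the query described above is easy to generate. This is only a sketch: `build_title_query` is a hypothetical helper name, and it simply assembles the same `keyword "homepage title"` string shown in the sample query.

```python
def build_title_query(keyword: str, homepage_title: str) -> str:
    """Build a query of the form: keyword "exact homepage title"."""
    # Remove double quotes from the title so the phrase stays one quoted unit.
    clean_title = homepage_title.replace('"', "").strip()
    return f'{keyword.strip()} "{clean_title}"'


print(build_title_query("widgets", "my homepage title"))
```

Paste the resulting string into the search box as is; the quotes are what force the exact-title match.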


 9:44 pm on Jun 16, 2003 (gmt 0)

I just highlighted the first paragraph of my site and pasted it into the search box. Lo and behold, a site came up with my same title. I clicked on the site and it took me to amazon.com. When I click on the cache, it's my site!

What should I do?


 10:11 pm on Jun 16, 2003 (gmt 0)

Woah! Me too!

But sadly the duplicate content is on my own site!?!?

Confused? Our homepage (www.indexpage.com) disappeared recently, but I wasn't worried because it was also there as a separate result as indexpage.com, although the short URL was an older version.

Now I'm wondering if that was considered duplicate content, therefore the (possible) penalty? Seems unlikely, no?

I too would like to know what to do, if in fact I am being penalized here...

[edited by: mipapage at 10:14 pm (utc) on June 16, 2003]


 10:12 pm on Jun 16, 2003 (gmt 0)

If someone has taken parts of your content and put it on another site, and that site is showing up in the Google serps, you can file an infringement notification with Google. All the information can be found here:

You can also send a C&D (cease and desist) letter to the infringer to get the content removed from the site. There are many sample C&Ds on the internet you can use to write one; they have all the scary legal lingo about monetary damages etc. If you send it via email (you can usually get the address from the site and/or whois info), set a deadline of 48-72 hours by which the infringing content must be removed. If you decide to snail mail a copy, be sure to send it registered so that you have proof of receipt if you need to take further action.

You can also contact the infringer's host. Some will shut down sites that have infringing copy on them, as hosts usually don't want to find themselves on the receiving end of a lawsuit for something one of their hosting clients has done.


 10:19 pm on Jun 16, 2003 (gmt 0)

mipapage - you can edit your .htaccess file to automatically redirect all yourdomain.com requests to www.yourdomain.com. This should eliminate the problem with the duplicate content, as well as any problems if the two are showing different PageRanks.

Here is another recent thread that discusses it, and what the .htaccess modification should be.


 10:24 pm on Jun 16, 2003 (gmt 0)

TexTex, I would also file a spam report with Google, since the site that took your content is showing one thing to GoogleBot and another to users.

I was doing some research work for a prospective client, and found a very similar situation as you describe, where the copy on the client's page was subverted, and used to take users elsewhere. This was on www2 just as Dominic was being baked.

I just checked again, and the offending site is gone.


 10:45 pm on Jun 16, 2003 (gmt 0)

Looking at things with &filter=0, I realized that a directory I'm in has, in addition to their link to me, an actual snapshot of my index with the URL "www.mysite.org/?theirsite.com". That page has a PR4 versus my index's PR5 and doesn't show in the obvious -fi serps, where my site is #1. It has been like this for a few months, but I'm getting paranoid about this dupe content stuff. Should I ask them to change it or remove me or something?

Another index is mysite.org, no www. It's a new link from an associate... I emailed him to change it, though www.mysite is #1.


 11:04 pm on Jun 16, 2003 (gmt 0)

Is there anybody else out there whose site(s) appear when &filter=0 is on and who cannot find any other sites that seem to have copied their content?

By the way, I presume that for the duplicate content filter to be activated the extent of the copying must be quite high. I mean, if somebody copied one of your paragraphs verbatim, I would assume this would not be enough for the filter to be applied.

Please reply if your site(s) appear with the &filter=0 on and you cannot find any other sites with your content on them.
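Nobody in this thread knows how Google actually measures "how much" copying counts, so the following is not a description of the filter. It is only a rough sketch of one generic way to put a number on text overlap, using word shingles and a Jaccard ratio, that you could use to compare your page text with a suspected copy. The function names and the shingle size of 4 are assumptions chosen for illustration.

```python
def shingles(text: str, size: int = 4) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(text_a: str, text_b: str, size: int = 4) -> float:
    """Jaccard overlap of the two shingle sets, from 0.0 to 1.0."""
    set_a, set_b = shingles(text_a, size), shingles(text_b, size)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)
```

A single copied paragraph inside an otherwise different page gives a low score with this measure, while a word-for-word copy of the whole page scores 1.0. Whether either case trips the filter is exactly the open question here.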


 11:05 pm on Jun 16, 2003 (gmt 0)


That code didn't work for me. I kept getting a 500 error until I rewrote it to look like this:

RewriteEngine on
RewriteCond %{HTTP_HOST} !^example.com/*$
RewriteRule ^.*$ [example.com%{REQUEST_URI}...] [R=301,L]

I have Apache 1.3.27, running on Linux.
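For readers who don't speak mod_rewrite, the decision those rules make can be sketched in a few lines of Python. This is not a replacement for the .htaccess rules and not anyone's production code: `canonical_redirect` is a hypothetical name, and `www.example.com` as the canonical host is an assumption based on the advice earlier in the thread to send yourdomain.com to www.yourdomain.com.

```python
from typing import Optional, Tuple


def canonical_redirect(host: str, request_uri: str,
                       canonical_host: str = "www.example.com"
                       ) -> Optional[Tuple[int, str]]:
    """Return (301, location) if host is not the canonical one, else None."""
    # Host names are case-insensitive and may carry a port; compare the name only.
    name = host.lower().split(":", 1)[0]
    if name == canonical_host:
        return None  # already on the canonical host, serve the page normally
    # Permanent redirect to the same path on the canonical host.
    return 301, f"http://{canonical_host}{request_uri}"
```

The point is that every path keeps working and only the host name changes, so the www and non-www versions of a page collapse into a single URL.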


 12:23 am on Jun 17, 2003 (gmt 0)


Thanks! Wouldn't ya know that I'm just putting together our .htaccess file for our redesign, as our site's finally going dynamic. Thanks for the link and the help!


 12:39 am on Jun 17, 2003 (gmt 0)

Before I make any changes, especially to .htaccess, I have to ask myself: why is this happening now when it was not happening before Dominic?

My main page has fallen in the serps big time (since May). The ONLY thing that I can attribute it to is that my index page shows up 4 different times, each version with its own PR and own rankings, even one listing where it has:

Title of Site: Main Theme

Title of Site: Main Theme

I also see the same pages listed in the serps two more times without the www. The page used to be top 5 for major KWs but is now either gone or buried (any one of the four versions of it), and in its place is an internal page. Some search results that should show the main theme list a page like "Warranty Information" 20-30 spots below where the main page was, and some results show the main page as an indented listing below some inconsequential interior page.

I see so many others reporting the same type of thing since Dominic, and it makes me wonder why Google could tell before that this was the same page and why it's now deciding to index all possible ways to reach a page file.

To me the problem is clear: quadruple listings of the same page in the index. And I know they can solve it, because it was not a widespread problem before Dominic.

I'm going to sit tight and not change a thing until I see how they sort this out.... jeeesh, I hope they sort this out. I am happy for everyone who is doing business as usual; I'm doing my best to survive while laying off as few people as possible.


 12:46 am on Jun 17, 2003 (gmt 0)

The domain.com vs www.domain.com issue happened to me well before Dominic, so in that respect, it is not a new issue that was only discovered recently, and not a situation unique to Dominic. I think maybe people are just noticing it more with the Dominic and Esmeralda updates because they are also noticing vast differences in their results, and are looking for the reasons why.

mipapage, you are welcome!


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved