&filter=0

duplicate content?

         

textex

5:30 pm on Jun 16, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My site is missing for some of my terms in this update.
So, I tried adding "&filter=0" to the end of search strings and, lo and behold, my site is listed #1 for my targeted terms.

I checked some of my competition listed in the top 10 and did find one site that scraped some text from me.
What percentage is considered duplicate content? What percentage would trip this filter and why would my site be the one considered the duplicate content?

my3cents

12:52 am on Jun 17, 2003 (gmt 0)

10+ Year Member



I have seen the problem before Dominic, but not on such a widespread scale (maybe because it didn't affect me?). I still have to ask: if nothing on my site changed, why all of a sudden would there be four listings for the same page?

Even when I do link:www.domain.com

it shows results for:

domain.com
domain.com/index.shtml
www.domain.com/index.shtml

why would they show the same page as being a backlink to itself?

also, why is G not seeing that index.shtml IS the same as domain.com? It knows it's duplicate content, even grabbed the exact same description for all four listings.

I'm going to guess that this is something google will fix.

I am also seeing instances where the same page is indexed more than four times, including tracking URLs from paid ads. This really does not seem like something they would want to happen, and it seems like something that would be simple to fix.

WebMistress

1:15 am on Jun 17, 2003 (gmt 0)

10+ Year Member



In the same boat... when I do &filter=0, I get tons of internal pages first in the SERPs, then further down mydomain.com, then even further down www.mydomain.com (which has all the inbound links).

To confuse matters, I did find a site that copied verbatim a paragraph, my title, and my description.

I also had my site on two servers at the beginning of May to help google find me during the transition from one server to the next.

So, I'm baffled as to which of these, if any, are the culprit of this duplication filter problem.

What a MESS!

WebMistress

1:35 am on Jun 17, 2003 (gmt 0)

10+ Year Member



More info: mydomain.com is PR0, www.mydomain.com is PR6, all internal pages showing are PR5

yet, here is the order in which they appear in the SERPs with &filter=0:

internal PR5 pages
mydomain.com PR0
www.mydomain.com PR6

Anything similar out there for any of you?

textex

1:58 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My site is new, so I can't tell the PageRank variations between www.mysite.com and mysite.com. I have since seen both in the index, but I have also seen the cloaked site using my content. So I don't know which one of these issues is causing my site not to be listed unless I use &filter=0 in the search query.

Should I wait for the index to settle?

Should I fill out a spam report?

Should I be concerned about having www.mysite.com and mysite.com both listed? Note that www.mysite.com and mysite.com only are returned in results when I search for the content of my site.

Thanks!

madweb

2:56 pm on Jun 17, 2003 (gmt 0)

10+ Year Member



I just highlighted the first paragraph of my site and pasted it into the search box. Lo and behold, a site came up with my same title. I clicked on the site and it took me to amazon.com. When I click on the cache, it's my site!
What should I do?

What you should do:

1. Your competitor is using cloaking. Fill in a spam report.
www.google.com/contact/spamreport.html

2. Your competitor is stealing your copyrighted content. Fill in a copyright report for Google. I can't find the URL right now.

Jenstar

3:53 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The infringement notice for Google can be found at [google.com...]

If you read message 23 of this thread, it will give you details on what to do when someone copies your content.

ogletree

4:16 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



All this &filter=0 thing is doing is the same thing as the link that you get at the end of a SERP.

"In order to show you the most relevant results, we have omitted some entries very similar to the 54 already displayed.
If you like, you can <link>repeat the search with the omitted results included.</link>"

GoogleGuy

4:35 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The "&filter=0" just shows all pages, even pages that appear to be duplicates. Normally if there are pages that appear to be duplicates, we try to pick the best one to show (but no heuristic can be perfect--it's like having two people claiming to own the same essay).

If you're seeing one of your own pages listed as a duplicate (e.g. domain.com/index.html shows up instead of www.domain.com/index.html), I wouldn't worry very much. That page should still appear in searches, and probably we can make it into the canonical version over time.

If someone else has copied your page, then you should try to solve the issue with them. In the worst case, you could escalate it to assert your ownership of your page.

This is not really a Google issue, other than the fact that "&filter=0" might allow you to find someone who copied your page. Even if we removed the other person's page (which we wouldn't do without a correct DMCA request, because we can't tell who really owns the page), you could still have the same problem with the other person's page showing up in other search engines.

So: Google tries to be the best reflection of the web that we can, but if someone has copied your page, that's a copyright issue, and not a spam issue. I encourage you to work with the other person to resolve the issue, because even if you did a DMCA request with Google, you'd still have to worry about those pages showing up on other search engines.

textex

5:01 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks GoogleGuy...
However, I really don't think this page is what is keeping my site from being listed. It is a cloaked page that is not being returned in the SERPs for keyword searches.

Is it possible that there is some sort of glitch?

FleaPit

5:08 pm on Jun 17, 2003 (gmt 0)

10+ Year Member



Geez, this is quite a hard one, GG. I know what you are saying is technically true, but it can leave a lot of people up the creek without a you-know-what! It also sounds like a relatively easy way to sideline a competitor, although of course you could never tell which site was going to get dropped from the SERPs for being the duplicate.

One of my index pages (it was there last month during Dominic :) ) has also been caught out by this filter, but I am struggling to find a site who has ripped me off. It is a competitive keyword with 3,000,000 results, but thankfully it is not an important keyword for me, as it is a generic, relatively non-commercial term. That said, it was ranked no. 8 last month (and for the last year or so), but in this update, with &filter=0 applied, it has gone up to no. 5. It seems strange that Google finds my index page more relevant this update but chooses not to display it!

Are there any other causes GG or should we sit tight and wait until the dust settles?

needinfo

5:33 pm on Jun 17, 2003 (gmt 0)

10+ Year Member



Does anybody know to what extent a page has to be copied to be tagged by the duplicate content filter?
I mean are we talking about the whole page... I really hope so because I've just found out that plenty of paragraphs from my sites have been copied.

Jenstar

5:40 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



but I am struggling to find a site who has ripped me off.

Go to your site and pick out a few unique phrases that are 5-10 words in length (but don't choose a phrase that contains your site or business name, as infringers will usually change that to their own site or business name). Then head over to the fi google server, and type in your unique phrase with " " around it. The " " means it will look for that exact phrase, with the words in that specific order. Be sure to add the &filter=0 on the end, or click the "repeat the search with the omitted results included."

You can try a few of your different unique phrases, and if someone has copied your content, chances are it will show up this way. I find many copyright infringers this way.
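
For anyone who wants to script the phrase check above, here is a minimal Python sketch. It is only an illustration: the function names are made up for this example, and the base search URL and the "q" parameter are assumptions, since the thread only mentions the quotes and &filter=0. It pulls 5-10 word fragments from your page text, skips any that mention your site or business name, and builds an exact-phrase search URL with the filter switched off.

```python
import re
from urllib.parse import urlencode


def candidate_phrases(text, site_names, min_words=5, max_words=10):
    """Yield fragments of 5-10 words that do not mention the site or business name."""
    lowered_names = [name.lower() for name in site_names]
    # Split on sentence-ending punctuation so each phrase stays inside one sentence.
    for sentence in re.split(r"[.!?]+", text):
        words = sentence.split()
        if len(words) < min_words:
            continue  # too short to be distinctive
        phrase = " ".join(words[:max_words])
        if any(name in phrase.lower() for name in lowered_names):
            continue  # copiers usually swap the name for their own
        yield phrase


def exact_phrase_url(phrase, base="http://www.google.com/search"):
    """Build a search URL for the quoted phrase with filter=0 appended.

    The base URL and the "q" parameter are assumptions for illustration.
    """
    # The surrounding quotes ask for the words in that exact order.
    return base + "?" + urlencode({"q": f'"{phrase}"', "filter": "0"})
```

Paste each resulting URL into the browser and look for results that are not your own site.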

textex

6:07 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I am searching high and low to find out what is going on.

When I search for 5-10 word phrases from the content of my site, the results are www.mysite.com, mysite.com and the cloaked spammer (I just sent a cease and desist to them).

Can anyone shed some light for me?

Help me out GoogleGuy!

Jenstar

6:21 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



textex - in the search without the filter, are either www.yoursite.com or yoursite.com coming up, or just the cloaker with your content in the cache? Or do they both only come up with the &filter=0 added to the results?

ogletree

6:27 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It's funny: on fi, when I do a site:domain.com -asdf search, my www.domain.com result does not show up unless I add &filter=0, but if I do searches where I know my www.domain.com comes up on the first page or so, it shows up no problem.

textex

6:32 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



mysite.com is coming up, buried in results. But with &filter=0 www.mysite.com comes up in top placements.

What should I do?

WebMistress

6:33 pm on Jun 17, 2003 (gmt 0)

10+ Year Member



Thanks for the input, GoogleGuy. Always appreciated! However, we do worry when the page that is being thrown out is www.mydomain.com with its high PR and instead we are listed far down in the SERPs with mydomain.com with PR0. It affects our income, so it's hard not to worry. More than worrying, I think we are trying to find out what we can do, based on what we know about our individual situations, about whatever may be causing the problem, such as my move to a new server perhaps being the culprit. I added the code in .htaccess to redirect mydomain.com to www.mydomain.com. Had I not worried and come here to investigate my options, I would not have tried this and would simply have waited for Google to work it out over time. Any other suggestions would be very helpful to us, if you have any, GoogleGuy. I know your head must spin reading all of our desperate posts begging for your enlightenment, and we all know you can't possibly answer specifics for general problems. So, thanks for just checking in to see our concerns and posting what you can.

WebMistress

6:39 pm on Jun 17, 2003 (gmt 0)

10+ Year Member



One more interesting thing I see now is that for some KW phrases www.mydomain.com comes up, whereas with others, mydomain.com or an internal page shows up. All KW phrases were previously #1 and #2 with www.mydomain.com before Dominic. I can't think of why if www.mydomain.com is indexed for one KW phrase, it is not indexed for an equal KW phrase, but mydomain.com which is optimized exactly the same as www.mydomain.com is indexed for that KW phrase.

?

Jenstar

6:40 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What should I do?

Change your .htaccess file to redirect yourdomain.com to www.yourdomain.com (or vice versa) so Google can sort it out for the next update.

I think maybe some of the confusion has come about since people can now do directories as directory.yourdomain.com instead of just the traditional www.yourdomain.com/directory/

Because one site is coming up without the filter, I think it is judging the second as a duplicate. The change to .htaccess should correct it, and hopefully your rankings will change accordingly with the next update.
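
To make the idea concrete, here is a small Python sketch of the decision such a redirect makes. This is not .htaccess syntax, and the thread does not give the actual rule, so treat it as an illustration only: every request for the bare hostname is answered with a permanent redirect to the same path on the www hostname, which leaves only one address per page. The hostnames are the placeholders from this post and the function name is made up.

```python
from urllib.parse import urlsplit, urlunsplit

# Placeholder hostnames taken from the post; swap the two to go the other way.
CANONICAL_HOST = "www.yourdomain.com"
ALIAS_HOSTS = {"yourdomain.com"}


def redirect_target(url):
    """Return the URL to send a permanent (301) redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALIAS_HOSTS:
        # Keep scheme, path and query; only the hostname changes.
        return urlunsplit(parts._replace(netloc=CANONICAL_HOST))
    return None
```

Whatever rule goes into .htaccess should behave the same way: the path and query string are preserved and only the hostname changes.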

textex

6:40 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Same thing here webmistress...

polarmate

6:45 pm on Jun 17, 2003 (gmt 0)

10+ Year Member



My index page shows up when &filter=0 is used. However, I cannot find another page that is a duplicate of my index page.

Dominic gave us a bad affiliate URL. Esmeralda has ensured that this bad URL is history. But my index page does not show for my main two-keyword phrase; it only shows for phrases of three or more keywords. For the main keyword phrase, Google shows the index page of a sub-directory, and that result is buried on the 5th or 6th page.

Unlike textex, my index page is *not* cloaked. I do not use cloaking anywhere on my site. I do not have hidden links, nor do I have links from guestbooks. Hard work did pay off and Google is showing the number of true backlinks to my site: 490 now, up from the 73 that Dominic slashed them to.

Can anyone please shed some light on why this is happening?

textex

6:49 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Polarmate, my site does not use cloaking at all. I found a site that copied my content. This site is cloaked but the cache is enabled, so I can see my content on the pre-cloaked page.

polarmate

6:52 pm on Jun 17, 2003 (gmt 0)

10+ Year Member



Oops! Sorry, textex, I read that wrong. My apologies!

rustybrick

7:26 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Can someone explain what they are typing into the google search bar to check if someone has duplicate content?

where does &filter=0 come into play?

FleaPit

7:35 pm on Jun 17, 2003 (gmt 0)

10+ Year Member



Do a normal Google search in fi and then tack &filter=0 on the end of the URL in the address bar and hit GO!

It works with any google string although fi is where most people are seeing the missing index page/duplicate content scenario.
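
If you would rather not edit the address bar by hand, the same step can be sketched in a few lines of Python. This is just an illustration: the function name is made up, and the example search URL in the usage note is an assumption, not something given in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_filter_off(search_url):
    """Return the search URL with filter=0 set in its query string."""
    parts = urlsplit(search_url)
    # Keep every existing parameter except an earlier "filter" value.
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "filter"]
    params.append(("filter", "0"))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

For example, with_filter_off("http://www.google.com/search?q=blue+widgets") gives the same URL with &filter=0 on the end.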

[edited by: FleaPit at 7:39 pm (utc) on June 17, 2003]

mipapage

7:35 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Just add '&filter=0' to the end of the link for a SERP in Google to make it work. As to what it does, check GG's post earlier in this thread.


(BTW - it's not specific to -fi)

rustybrick

8:07 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks - see what you mean, it simply removes any filters.

What do you think of a scenario where you have one site but Google has it under multiple domain names?

For example:

I have a site that is listed under www.domain.com/

but I also have a secure domain name that is used for the parts of the site that require SSL (so that I can use one shared cert).

so the same content is found at https://domain.domainnameofsecureurl.com/

What are the implications of this scenario?

Thanks!

[edited by: rustybrick at 10:25 pm (utc) on June 17, 2003]

Jenstar

8:15 pm on Jun 17, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Put a robots.txt file on your secure site that bans Googlebot, so you don't have to worry about the duplicate content.
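
As a rough sketch of what that could look like, the two-line robots.txt below tells Googlebot to stay out of everything on the host that serves it, and the Python around it uses the standard library's robots.txt parser to check that the rules do what you expect. The hostname secure.example.com and the function name are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Contents of the robots.txt to serve from the secure hostname only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if the given robots.txt rules forbid user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# secure.example.com is a hypothetical stand-in for the secure hostname.
print(is_blocked(ROBOTS_TXT, "Googlebot", "https://secure.example.com/cart"))
```

Note that robots.txt applies per hostname, so this file must be served only from the secure hostname. If the same file also ends up on www.domain.com, it will ban Googlebot from the main site too.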

BTW, you will need to remove your last Google link, as linking to specifics goes against the TOS. Most questions and comments can be answered without knowing your specific site name or URL.

theangler

12:23 pm on Jun 18, 2003 (gmt 0)

10+ Year Member



Hello,
I am new to this forum. I have read these posts and I seem to have the same problem with my index page. I was ranked #2 on my best keywords, but starting last month my index page would drop while the Google Dance was going on. It came back after the Dance was over. Now it is gone again and several of my other pages are showing up instead. The best one is at #9. The page that is showing has no meta tags, no description text and a low PR. This would not be much of a problem, except that the page that is showing up does not have a description of our store. It is like opening a book and starting on chapter 3 instead of page one. From what I understand, or think I understand, I have duplicate content with another URL that I previously used but have since changed to a new URL. If this is what is happening, how do I correct the problem?
Thanks,

chrisnrae

1:22 pm on Jun 18, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"I encourage you to work with the other person to resolve the issue, because even if you did a DMCA request with Google, you'd still have to worry about those pages showing up on other search engines."

Good theory, but not quite the solution. Google has filtered one of my main site pages as a duplicate. I did some checking and found some moron who copied my main page and posted it on four different domains. My site has been at the top of the SERPs for a year; this guy's domains are six months old. My site has 100 backlinks (some unreciprocated PR8s); this guy's domains had 10 backlinks. Google chose to think MY site was the duplicate. LOL, the computer doesn't seem to be able to make a SMART decision re: who owns the duplicate and who owns the original. FWIW

Anyway, sent a C&D, guy removed my stolen content - but, since the content is no longer there, a DMCA won't help because the page no longer has my content on it. I have gotten it taken care of, but MY site is now being penalized. It should have a top 8 listing on the #1 keyword, instead has none on that keyword.

So, while you can take care of the issue CAUSING the problem, the problem itself (the dup penalty) does not have a solution, other than to wait until the next update, by which time my site may hit the same issue again. Sigh.

This 114 message thread spans 4 pages.