Different kinds of duplicate content on Google

Google shows different kinds of duplicate content

         

zgb999

5:16 pm on Jan 9, 2004 (gmt 0)

10+ Year Member



For some time we have been tracking, for one site, which pages Google treats as duplicate content, meaning pages that only appear with filter=0 (when you repeat the search with the omitted results included).
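For anyone who wants to repeat this kind of check, the two query variants can be built as URLs. This is only a sketch: filter=0 is the parameter mentioned above, while the base URL, the helper name build_query_url and the placeholder host www.domain.com are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumed base URL for illustration; it is not taken from this thread.
BASE_URL = "https://www.google.com/search"


def build_query_url(query, include_omitted=False):
    """Return a search URL; include_omitted=True appends filter=0."""
    params = {"q": query}
    if include_omitted:
        # filter=0 repeats the search with the omitted results included.
        params["filter"] = "0"
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # www.domain.com is a placeholder host.
    for operator in ("site:", "link:"):
        query = operator + "www.domain.com"
        print(build_query_url(query))
        print(build_query_url(query, include_omitted=True))
```

Comparing the result lists of each pair of URLs by hand gives the raw data for the kind of tracking described here.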

So far we had the following categories for pages:
- pages displayed with link: -> Category 1, Category 2
- pages displayed with link: and filter=0 -> Category 7
- pages displayed with site: -> Category 1, Category 3, Category 4
- pages displayed with site: and filter=0 -> Category 2, Category 4, Category 7
- pages displayed with site: without description -> Category 5
- pages displayed with site: and filter=0 without description -> Category 6
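A rough way to do this bookkeeping is to collect the URLs returned by each of the four queries and group pages by how they show up. The sketch below is hypothetical: the function names and the sample URLs are made up, it does not assign the category numbers from the list above, and it ignores the "without description" cases.

```python
from collections import defaultdict


def visibility(url, filtered, unfiltered):
    """Describe how a URL shows up for one operator (link: or site:)."""
    if url in filtered:
        return "shown"
    if url in unfiltered:
        return "only with filter=0"
    return "not shown"


def group_by_visibility(link, link_f0, site, site_f0):
    """Group URLs by their (link:, site:) visibility pair.

    Each argument is a set of URLs collected from one query;
    the *_f0 sets come from the same query repeated with filter=0.
    """
    groups = defaultdict(list)
    for url in sorted(link | link_f0 | site | site_f0):
        key = (visibility(url, link, link_f0),
               visibility(url, site, site_f0))
        groups[key].append(url)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical data, not real results.
    link = {"/a.htm", "/b.htm"}
    link_f0 = {"/a.htm", "/b.htm", "/c.htm"}
    site = {"/a.htm", "/d.htm"}
    site_f0 = {"/a.htm", "/b.htm", "/c.htm", "/d.htm"}
    groups = group_by_visibility(link, link_f0, site, site_f0)
    for key, urls in groups.items():
        print(key, urls)
```

Each distinct pair that turns up in the output is a candidate category; whether Google treats the pairs as separate classes is exactly the open question.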

I fully understand that only pages with enough PageRank are displayed with link:. But among pages that are seen as duplicate content (only shown with site: if filter=0), what is the difference between a page that is
- shown in link: (Category 2)
- shown in link: only when filter=0 (Category 7)?

Does a page of Category 7 contain more duplicate content than a page of Category 2 (at least in the eyes of Google)?

What is the pattern and what does it mean when a page is in one of those categories?

Marketing_C

5:30 pm on Jan 11, 2004 (gmt 0)

10+ Year Member



We have many internal pages that show up with link:domain.com but appear in site:domain.com +"www.domain.com" only when the "duplicate filter" is off. So for most pages that show with link: (without turning the "duplicate filter" off), the link: command does not treat them as duplicates.

3 additional pages are shown with link:domain.com when "duplicate filter" is off.

So it really looks like Google is putting the pages in different duplicate-categories.

zgb999

10:10 am on Jan 12, 2004 (gmt 0)

10+ Year Member



In light of the new Google patent for duplicate content [patft.uspto.gov], it makes sense that they are seeing a lot of duplicate content that has to be sorted somehow. But it is still difficult to see what kind of pattern they are using for that.

takagi

10:59 am on Jan 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



On this forum, the term 'duplicate content' is generally used for (almost) identical pages on the same or different sites.

Google will filter pages in the SERPs if the snippet (the 2 or 3 lines between the title and the URL) is (almost) identical. If every page on a site has the same footer, such as a copyright message, and you search for all pages on the site containing keywords that can only be found in the footer, then only one or a few pages will appear in the filtered SERPs, regardless of differences in title and file size. These few pages will also appear at the top when the filter is turned off. This means that the order in the non-filtered SERP determines which pages are shown and which are not. Normal SERPs are ordered by relevance to the keyword (in combination with PageRank, link text etc). The SERP of a 'link:' query is semi-random (maybe ordered by some algo using the DocumentID). Of course, that different order will leave other pages filtered out.
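To make the point about ordering concrete, here is a toy sketch of a snippet filter. It is not Google's algorithm: the function names and sample data are hypothetical, and exact matching after normalization stands in for "(almost) identical".

```python
def normalize(snippet):
    """Lowercase and collapse whitespace so trivial differences match."""
    return " ".join(snippet.lower().split())


def filter_by_snippet(ranked_results):
    """Keep the first result for each distinct snippet, in ranked order.

    ranked_results is a list of (url, snippet) pairs, best first.
    Returns (shown, omitted) lists of URLs.
    """
    seen = set()
    shown, omitted = [], []
    for url, snippet in ranked_results:
        key = normalize(snippet)
        if key in seen:
            omitted.append(url)   # only visible with the filter off
        else:
            seen.add(key)
            shown.append(url)
    return shown, omitted


if __name__ == "__main__":
    # Hypothetical results whose snippets come from a shared footer.
    results = [
        ("/page1.htm", "Copyright 2004 Example Ltd."),
        ("/page2.htm", "copyright  2004 example ltd."),
        ("/page3.htm", "A different snippet"),
    ]
    print(filter_by_snippet(results))
    # Same pages, different order: a different page gets omitted.
    print(filter_by_snippet(list(reversed(results))))
```

In this toy model, a page is omitted only because a page with the same snippet ranked above it for that particular query, so the same page can be shown for one query and omitted for another.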

doc_z

11:24 am on Jan 12, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Even with 'filter=0', pages are sometimes filtered out, e.g. a page (xyz.htm) isn't shown for 'site:www.domain.com domain' (using 'filter=0') but it appears for 'site:www.domain.com xyz'.

zgb999

1:08 pm on Jan 12, 2004 (gmt 0)

10+ Year Member



Thank you takagi for the info about the use of Snippets.

So what you are basically saying is that a page that is only shown when filter=0 has the same chance of getting a good ranking as any other page.

If page 1 has more content about keyword1, then page 1 will be shown in the SERP. If page 2 has more content about keyword2, then page 2 is shown. All this holds despite the fact that only page 1 is shown with site: and page 2 is only shown when filter=0. Page 2 has no disadvantage in getting a good SERP position compared with pages on other sites that are never filtered out by Google, i.e. that are seen without filter=0.

Is that correct?