Is that really supposed to be labelled as 'duplicates'?

Forum Moderators: Robert Charlton & goodroi


shaunm

4:20 pm on May 5, 2016 (gmt 0)

Can you help me understand why the following practice is not considered duplicate content? Well-known news sites use infinite scrolling, where the URL in the browser's address bar changes to the corresponding page as people scroll down. In those cases, you don't see the content from the other URLs when you view the source code or disable JavaScript in your browser. I suppose that's a fair use of modern technology that doesn't get you into trouble.

But when a site has an overview-style home page that shows the recent articles in full, one after another, why isn't that considered duplicate content?

For example,
[webmasters.googleblog.com...]
This page has the exact same content as the latest blog post page [webmasters.googleblog.com...]

What am I missing here? Thanks for enlightening me!

Andy Langton

12:58 pm on May 6, 2016 (gmt 0)

When you say "labelled" as duplicates, what do you mean? If you mean, for instance, Google's message at the end of search results saying that some were omitted, then this is keyword-specific. The blog homepage is not a good match for the specific article content, but it is a good match for "Webmaster blog" etc. In general, this is how duplication works in Google. They have no desire to show the same content over and over in results. I'm not aware of any specific labelling of pages as being duplicates - or why it would matter in the example you gave.

shaunm

2:27 pm on May 6, 2016 (gmt 0)

I'm sorry if I wasn't clear. My question is simply this: are the examples duplicates or not?

Google says on this page [support.google.com...]: "Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar." The Webmaster blog home page has the same content as the other page I mentioned. How is this not considered duplicate content in Google's eyes? What am I missing here?
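To make "appreciably similar" a little more concrete, here is a rough sketch that scores two blocks of text by the overlap of their three-word sequences (shingles). This is only an illustration: nothing in this thread says how Google measures similarity, and the function names, the shingle size, and the sample strings are all assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Too short for a full shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)


# Hypothetical sample: a post, and a home page that repeats it in full.
post = "Duplicate content generally refers to substantive blocks of content"
home = "Latest post: " + post

print(jaccard_similarity(post, post))  # identical text
print(jaccard_similarity(post, home))  # same text plus a short prefix
```

A score near 1.0 means the two blocks share almost all of their word sequences. Where the line for "appreciably similar" falls is a judgment call that this sketch does not settle.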

Also, what is the benefit of showing the complete content on the home page, where a simple summary would be a lot better when it comes to user experience?

Andy Langton

2:51 pm on May 6, 2016 (gmt 0)

I understand that this is an example of duplication - I don't understand why it matters. See Google's quote on the page you linked:

"Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results."

"a simple summary would be a lot better when it comes to user experience?"

I don't think that's clear at all. Having a summary won't work well if the majority of their users go to the blog homepage for the latest entry, for instance. It depends entirely on their audience's behaviour.

Would it be better from an SEO point of view not to have the whole article on the front page? Theoretically, but what damage can this do in this example?

shaunm

3:39 pm on May 6, 2016 (gmt 0)

"I understand that this is an example of duplication - I don't understand why it matters"

"unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results"

I don't think I'm overcomplicating this. As far as I know, no block of content should be available on more than one URL. If it is, we have the option of using rel=canonical to indicate the preferred URL for indexing. I don't want to wait for Google to decide whether the duplicate content is manipulative or not.

As for the examples I provided, when I take a snippet from either page and search for it on Google, both pages show up in positions #1 and #2. If it's my site that's showing up two different URLs for the same snippet then I would be freaked. I would find out why they have the same content and, if necessary, go with rel=canonical.
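For anyone who wants to check which preferred URL a page actually declares, here is a minimal sketch using Python's standard html.parser module. It simply collects the href of any link element whose rel attribute includes "canonical". The class name, the helper function, and the example.com markup are hypothetical and only there for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return the canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical page source, standing in for a blog home page.
page = """
<html><head>
<title>Blog home</title>
<link rel="canonical" href="https://example.com/blog/">
</head><body><p>Latest post in full...</p></body></html>
"""

print(find_canonicals(page))
```

An empty list means the page declares no canonical URL, and more than one entry means the page is sending conflicting signals.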

[screencast.com...]

I understand that's a blog home page without manipulative intentions. What I wonder is, how does Google know that it's not manipulative? Would I be penalised for duplicate content if I created such a blog home page? Thanks!

not2easy

4:06 pm on May 6, 2016 (gmt 0)

I would not rely on what Google does to look for good examples of best practices. Google is not concerned about its own rankings or being at the top of the SERPs. Their own sites do not always follow their recommendations.

Andy Langton

4:28 pm on May 6, 2016 (gmt 0)

"how does Google know that it's not manipulative?"

Google are talking about extreme cases - for instance stealing content, using the same content but mixing up the order, replacing words here and there to try to rank someone else's content. You would know if your duplicates were manipulative - it would be obvious.

"If it's my site that's showing up two different URLs for the same snippet then I would be freaked"

Why? There's no punishment for having even exact duplicates. There are, however, two potential downsides:

- Google chooses which page to rank (which may not be in your favour)
- You might lose link value if Google decides that a duplicate with links is a 'worthless' duplicate, or you might dilute internal link value by having links to multiple variations

Neither applies to the example you've given. In addition, practically every site that displays a list of content - be that blog posts or ecommerce categories - will have duplicate snippets of text. On a large scale, duplicates can be a serious issue, and on smaller scales they can be a problem, but having something duplicated is not a negative factor in itself at all.

Of course, you're well advised to avoid letting Google "do a good job of choosing a version" and make that choice yourself, but in the example you've given there's nothing scary about Google's approach. It would be near the bottom of my list if I was optimising their blog for them, and it would only make the list at all if their stats did not show that their current format worked best for visitors.

"I would not rely on what Google does to look for good examples of best practices."

I completely agree. They don't even rank for "search engine" ;)

tangor

6:20 pm on May 6, 2016 (gmt 0)

If your blog homepage only adds entries and never drops older ones, you might have a problem ... then again, who wants a 10,000-snippet home page? Put a different way, the blog homepage usually shows the most current info/interest and the older stuff moves to archives, so that page would be fresh on a routine basis. I think the search engines can figure out the difference/intent.