Forum Moderators: Robert Charlton & goodroi

Solving duplicate content issues

     
12:41 pm on Apr 19, 2018 (gmt 0)

Full Member

10+ Year Member

joined:July 26, 2006
posts:323
votes: 11


Google Search Console is showing: "Duplicate title tags" and "Duplicate meta descriptions"

The problem seems to be capitals versus non-capitals, for example:

/Pages/widgets.htm
/pages/widgets.htm

and in some cases

/Pages/Widgets-Blue-Big.htm
/pages/widgets-blue-big.htm

Would the best solution be 301 redirects from all /pages/ to /Pages/?
1:22 pm on Apr 19, 2018 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 25, 2018
posts:500
votes: 92


Would the best solution be 301 Redirects from all /pages/ to /Pages/

Yes.

You might also want to investigate why you have upper- and lower-case URLs for the same content.
10:36 pm on Apr 19, 2018 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator robert_charlton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2000
posts:12075
votes: 334


/Pages/Widgets-Blue-Big.htm
/pages/widgets-blue-big.htm

Would the best solution be 301 Redirects from all /pages/ to /Pages/

As Travis suggests, "Pages" isn't the only word in the above url that uses mixed upper and lower case. I always recommend using all lower case in urls.

Otherwise, inevitably, someone is going to do something different... be it a coder or a designer or someone linking to you.... Or you yourself are going to forget what you did. It always happens. I'd suggest all lower-case. And yes... 301s are the way to handle that.

Assuming Apache, use mod_rewrite, and be careful to avoid chained redirects. I'm not an Apache programmer, but I think it's also easier to force everything to lower case than to selectively program it or to redirect your urls individually.
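To make the "force everything to lower case" idea concrete, here is a minimal sketch in Python rather than Apache config. It only illustrates the rule a server-side rewrite would have to apply: lowercase the path in one step and send a single 301 to the result. The function name lowercase_redirect_target is made up for this example, and leaving the query string untouched is an assumption, since parameter values can be case sensitive.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect_target(url):
    """Return the all-lowercase URL to 301 to, or None if no redirect is needed.

    Hypothetical helper for illustration; not part of any server or library.
    """
    parts = urlsplit(url)
    lowered_path = parts.path.lower()
    if lowered_path == parts.path:
        # The path is already lowercase, so it can be served as-is.
        return None
    # Only the path is lowercased. The query string is kept unchanged
    # (assumption: parameter values may be case sensitive).
    return urlunsplit(
        (parts.scheme, parts.netloc, lowered_path, parts.query, parts.fragment)
    )
```

Because the target is computed in one go, /Pages/Widgets-Blue-Big.htm goes straight to /pages/widgets-blue-big.htm instead of hopping through an intermediate URL such as /pages/Widgets-Blue-Big.htm.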

You may have something else going on with the way things are set up as well... but this does look like a human choice. Perhaps internal nav related, in which case you should not rely on 301s alone to fix this. You should fix the internal inconsistencies... then apply the 301s to redirect references that might persist for a long while in caches, logs, and Google's data.
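One way to hunt down internal inconsistencies is to scan your own templates or rendered pages for links whose path contains capitals. The following is a rough sketch using only Python's standard library. The class and function names are invented for this example, and it only looks at site-relative href values on <a> tags, which is an assumption about how the internal nav is written.

```python
from html.parser import HTMLParser


class MixedCaseLinkFinder(HTMLParser):
    """Collect site-relative <a href> values whose path contains capitals."""

    def __init__(self):
        super().__init__()
        self.mixed_case_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name != "href" or not value:
                continue
            # Assumption: internal links start with a single "/".
            # Protocol-relative ("//host/...") and absolute links are skipped.
            if not value.startswith("/") or value.startswith("//"):
                continue
            # Ignore the query string and fragment when checking the case.
            path = value.split("?", 1)[0].split("#", 1)[0]
            if path != path.lower():
                self.mixed_case_hrefs.append(value)


def find_mixed_case_links(html):
    """Return the internal hrefs in html that are not all lowercase."""
    finder = MixedCaseLinkFinder()
    finder.feed(html)
    return finder.mixed_case_hrefs
```

Run it over each page's HTML and fix whatever it reports at the source, so the 301s only have to catch old external references.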

This is one of many things that's helpful to figure out in advance, rather than encounter it later and have to fix it.

9:25 am on Apr 20, 2018 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Jan 11, 2015
posts:67
votes: 1


index.php
https://www.example.com

To remove the duplication between the domain name and the index.php file, I inserted this tag:

<link rel="canonical" href="https://www.example.com" />

Is that correct?


Mod's note: Use example.com... it can never be owned, and it doesn't autolink in forum software.


[edited by: Robert_Charlton at 10:05 am (utc) on Apr 20, 2018]
[edit reason] exemplified sample site name [/edit]

10:07 am on Apr 20, 2018 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator robert_charlton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2000
posts:12075
votes: 334


I would not depend on the canonical link tag to fix bad coding. Among other things, given certain site situations, it will not scale.

10:21 am on Apr 20, 2018 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 25, 2018
posts:500
votes: 92


canonical link

Serving the same content from different URLs points to a coding or organization problem, and that is what needs your attention.

Canonical links should be avoided as much as possible, because they unnecessarily exhaust your "crawl budget".
11:24 am on Apr 20, 2018 (gmt 0)

Junior Member

Top Contributors Of The Month

joined:Jan 11, 2015
posts:67
votes: 1


Remove it?

<link rel="canonical" href="https://www.example.com" />
6:31 pm on Apr 20, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14710
votes: 614


Assuming Apache
My first thought was Uh-oh, they’ve got mod_speling enabled. One of its functions is to flatten casing, so PAGE.html, page.html and PaGe.html all serve the same content--not by redirect, but at the originally requested URL. If this is the explanation, and it is in your power to do so, get rid of it.

A "canonical" tag may make search engines stop complaining about duplicate content, but by itself it won’t stop the requests.

This is the google subforum, but it should be noted that bing is especially fond of requesting wrongly cased URLs. Or perhaps “conventionally cased” is a better term; I see them requesting pagename.html for pages that (for historical reasons) are actually called PageName.html. Fortunately I don’t have mod_speling--I think my host used to use it, but wisely abandoned it--so requests get the 404 they deserve.
8:47 am on Apr 21, 2018 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Mar 25, 2018
posts:500
votes: 92


Remove it? <link rel="canonical" href="https://www.example.com" />

Not necessarily. "canonical" can be used to fix problems with the same content being served under different URLs, but again it's better to find the cause of the problem and fix it at the source. Once you succeed in serving the content at just one URL, you set up 301 redirects from the other, previous URLs.
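When you build that list of 301s, it is worth checking that no old URL points at another old URL, which is the chained-redirect problem mentioned earlier in the thread. A minimal sketch, assuming you keep the redirects as a plain old-URL to new-URL mapping (the function name and the sample paths are only illustrative):

```python
def flatten_redirects(redirects):
    """Map every old URL straight to its final destination.

    redirects is a dict of old URL -> target URL. Raises ValueError
    if following the targets ever loops back on itself.
    """
    flattened = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Keep following the chain until the target is not redirected itself.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        flattened[source] = target
    return flattened
```

For example, a map holding /Pages/Widgets.htm -> /pages/Widgets.htm and /pages/Widgets.htm -> /pages/widgets.htm comes back with both old URLs pointing directly at /pages/widgets.htm, so each request needs only one hop.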
10:32 pm on Apr 26, 2018 (gmt 0)

Moderator This Forum from US 

WebmasterWorld Administrator robert_charlton is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Nov 11, 2000
posts:12075
votes: 334


"Duplicate title tags" and "Duplicate meta descriptions"

Rlilly, it occurs to me that there may be another issue entirely... As you've framed your question, I think you've gotten good answers on this thread, but you may not have described the problem fully.

What are the circumstances of the change? You need to find the origins of the problem.

Eg, if you've made any changes to the site recently, then these might be...

a) simply changes in url naming conventions...

b) or, they might be old versions and new versions of pages built on the same page framework, and also with slightly different naming rules.

If the latter, the content will not be the same... ie, it's not safe to assume the only change in these pages is a difference in title case. Because Google is case sensitive with regard to urls... it would report the different capitalization/lc combinations as entirely different pages, and just report top level duplications of titles and meta descriptions.

If these pages have different content, then the 301s might create even more confusion, as they could overwrite changes. Before doing anything, you should look at some of the content of these pages as well, and compare... and track back to the origination of the problem. If you had version control, and this is the issue, that would be immensely helpful.

Otherwise, to help track which pages overlap, use a spidering tool... or do site searches for unique phrases on the page (ie, search for them in quotes). There's a huge variety of possibilities, depending on what your work flow was.
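If you can get the page bodies out of a crawl, a quick way to compare is to group URLs that differ only in case and check whether their content is byte-for-byte identical. This is just a sketch under that assumption: the input format (a dict of path to page body) and the function name are made up for the example, and real pages may differ in trivial ways such as timestamps that an exact hash would flag as different.

```python
import hashlib
from collections import defaultdict


def group_case_variants(pages):
    """Group paths that differ only by case and compare their bodies.

    pages is a dict of URL path -> page body (str). Returns a dict keyed
    by the lowercased path, listing the variants and whether all of
    their bodies are identical.
    """
    groups = defaultdict(list)
    for path in pages:
        groups[path.lower()].append(path)

    report = {}
    for key, paths in groups.items():
        if len(paths) < 2:
            continue  # no case variants for this page
        digests = {
            hashlib.sha256(pages[p].encode("utf-8")).hexdigest() for p in paths
        }
        report[key] = {"paths": sorted(paths), "identical": len(digests) == 1}
    return report
```

Groups reported as identical are the safe candidates for a straight 301. Anything that is not identical should be looked at by hand before redirecting.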

I hope this makes sense. It's sometimes hard to deal with hypothetical situations which haven't been described in sufficient detail.

11:11 am on Apr 27, 2018 (gmt 0)

New User

5+ Year Member

joined:Feb 17, 2012
posts: 9
votes: 1


Hello Rlilly,
In one of the hangouts, John Mueller once said capitals in URLs don't matter. Google handles this efficiently.

Also, if only the URL differs but the content remains the same, a 301 is the only option. A canonical can't be recommended here, as it should only be used when you want to keep both pages live and available to users for specific reasons. But in this case, it seems that these URLs come up due to some error or mistake in coding or something like that. Thanks.
7:52 pm on May 16, 2018 (gmt 0)

Full Member

10+ Year Member

joined:July 26, 2006
posts:323
votes: 11


Thank you for the replies. Both pages had the same information. We did 301 redirects. We are still exploring why the error occurred.