
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

Duplicate content issue
wondering how to get around this problem

 6:00 pm on Jun 2, 2009 (gmt 0)

I'm launching a new site with articles that can be accessed from different sections, which gives them different urls. Say, for example, an article about German Shepherds might be:


It's a database-driven site and the "large-dogs" or "friendly-dogs" is used as a link on the article page to link back to a section on the site where the visitor was just browsing, so I need to pass that info.

I'm guessing Google may see these as different URLs with the same content and apply a penalty. Does anyone have advice on how I can get around this issue? Or is Google smart enough, and I'm worrying about nothing?



 6:18 pm on Jun 2, 2009 (gmt 0)

If it's truly exact content, then you should just use one or the other, not both.

However, using your example, not all large dogs are friendly dogs (and vice versa), so the pages are probably not going to be exact duplicates; there will just be some overlap between them. Depending on the extent of non-duplicated data, Google is probably smart enough to figure things out.


 6:34 pm on Jun 2, 2009 (gmt 0)

Don't include the search route the user took to get to the content in the final URL.

That is, by all means have index pages for:

which list various dogs, but make the final dogs page something like:

The dog page itself can then have links that point back to "find other large dogs" and "find other friendly dogs".
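
A minimal .htaccess sketch of that layout, assuming a database-driven site where two hypothetical scripts, section.php and article.php, render the index and detail pages (the script names, parameter names, and section list are illustrative assumptions, not taken from this thread). Each article gets exactly one URL, with no section in the path:

```apache
# Assumes this .htaccess sits in the document root and mod_rewrite is enabled.
RewriteEngine On

# Section index pages, e.g. /large-dogs and /friendly-dogs
# (section.php and its "name" parameter are hypothetical)
RewriteRule ^(large-dogs|friendly-dogs)$ /section.php?name=$1 [L]

# One URL per article, e.g. /german-shepherds, whichever index linked to it
# (article.php and its "slug" parameter are hypothetical)
RewriteRule ^([a-z0-9-]+)$ /article.php?slug=$1 [L]
```

The section rule comes first so that /large-dogs is not treated as an article slug. The article pattern deliberately excludes dots and slashes, so the rewritten script paths do not match it again.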


 6:57 pm on Jun 2, 2009 (gmt 0)

I understand what you're saying, Jim. The problem is that each article has a link back to the section the user was just browsing, and that link is built from a URL variable, i.e. friendly-dogs. If I make the search route static for each article, how can I pass the name of the section they were just browsing?

My fallback is to follow your advice and just grab the primary section for the link out of the database, which would give the user the right section MOST of the time, but it would be much nicer if the link were right every time.


 7:03 pm on Jun 2, 2009 (gmt 0)

No. Link back to *all* the section index pages that the page can be found under. The one they were previously on will be highlighted by the browser, as a visited link.

Look at how some of the best e-comm sites do this:
- find more of this colour
- find more of this size
- find more of this type
- find more of this brand

It is the ultimate cross-sell cross-linking. You don't know where the person on that page wants to go next. Make it easy to go wherever they want. Don't make them go all the way back to the home page and start a new category drill-down process.

[edited by: g1smd at 7:09 pm (utc) on June 2, 2009]


 7:08 pm on Jun 2, 2009 (gmt 0)

I could do that, but I was also using the section name as part of a more keyword-targeted URL, i.e. www.dogs.com/large-dogs/german-shepherds as opposed to just www.dogs.com/german-shepherds.

It doesn't seem that there's much I can do about it, though. I'm definitely more worried about SE penalties for duplicate content than about the extra benefit I'd get from a better URL.

Thanks again for the help, I'm learning a ton from your posts here.


 7:10 pm on Jun 2, 2009 (gmt 0)

You'll get more benefit from all the different incoming anchor-text links pointing to the page than you will from a single keyword in the URL.


 7:11 pm on Jun 2, 2009 (gmt 0)

Oh, and about the cross-linking comment: it's not really necessary, since I actually have a big nav menu on the left side of each page. The link at the top of the article is just a keyword-targeted link to pass juice up to my main sections (i.e. large dogs, friendly dogs), which are the primary site keywords.

Ok, that didn't come out right, what I meant was it's kind of a breadcrumb and also kicks link juice back to specific section pages.

[edited by: sdguy at 7:14 pm (utc) on June 2, 2009]


 7:12 pm on Jun 2, 2009 (gmt 0)

You're right, I was just trying to cover all the bases. Thanks again!


 7:22 pm on Jun 2, 2009 (gmt 0)

Sure, you can have a breadcrumb trail showing the route the user took this time, but don't stint yourself here. Link to the other indexes where this detail page would be listed. The breadcrumb trail doesn't have to "build a longer URL" as you go deeper into the site.


 10:13 pm on Jun 2, 2009 (gmt 0)

Jim, I have one more mod_rewrite issue that I can't figure out. I'd like to send users to a specific place on a page, i.e. someone adds a comment on German Shepherds, and when they submit, the page refreshes and they're brought to their comment. The problem is that # is used in mod_rewrite to comment things out, and I've tried to escape it with a \ but to no avail. I did some searching online but can't find an answer. Is it possible?


 10:41 pm on Jun 2, 2009 (gmt 0)

Where's Jim?

You add the #part in the link on the page that the user sees and clicks, or as a part of a meta refresh (perhaps on an interstitial page), as it is the browser that resolves that. There's nothing to be done in .htaccess for this.


 10:58 pm on Jun 2, 2009 (gmt 0)

You're not Jim? I could've sworn someone referred to you as Jim in that last thread, sorry!

Ok, I'm confused. Normally I would direct a user to an anchor point on a page with something like www.dogs.com/breeds.htm#german-shepherd, but with mod_rewrite a pattern is used and matched (i.e. /breeds) and then mod_rewrite loads the preset page (breeds.htm). I tried to get it to load breeds.htm#german-shepherd, but it ignores everything after the #. Is there any way to escape the # in mod_rewrite?


 2:35 pm on Jun 3, 2009 (gmt 0)

See the [NE] flag for mod_rewrite's RewriteRule. That, plus possibly escaping the # (a la "\#") in the substitution URL, might work.
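
A hedged sketch of what such a rule could look like, reusing the breeds.htm example from earlier in the thread. The incoming URL pattern /breeds/<name> is an assumption made up for illustration. The rule has to be an external redirect ([R]), because the fragment is only meaningful to the browser; an internal rewrite never shows the browser the rewritten path, so it cannot carry a fragment to it.

```apache
# Assumes this .htaccess sits in the document root and mod_rewrite is enabled.
RewriteEngine On

# Redirect /breeds/german-shepherd to /breeds.htm#german-shepherd.
# NE (noescape) keeps mod_rewrite from turning "#" into "%23" in the redirect.
# The "#" is literal here: Apache only treats "#" as a comment marker
# at the start of a line, not in the middle of a directive.
RewriteRule ^breeds/([a-z0-9-]+)$ "/breeds.htm#$1" [R=302,NE,L]
```

R=302 is used here only as a cautious default while testing; whether a permanent redirect is appropriate depends on the site.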

Two notes:

1) Of all the browsers in existence, only Apple's Safari sends URL-fragment identifiers to the server when it makes a request. So rewriting from a URL with a specific fragment identifier only works with Safari. Rewriting to a specific on-page named anchor can work, but...

2) The 'urgent' need to use and to redirect to named anchors on a page is a strong indicator that the page needs to be broken up into two or more pages, each focused on a more specific subject. This benefits both site usability and SEO factors, as each page is then more tightly focused.



 5:18 pm on Jun 3, 2009 (gmt 0)

Thanks for the reply, Jim. In this case it's not an urgent need to use anchors, and the page doesn't need to be broken up. You see, I'm adding a thumbs-up/down to suggestions that users can make about articles, and when someone gives one a thumbs up or down I'd like to return that user to the suggestion they thumbed. I know this is a good use of Ajax, but I don't have a clue about it. :)

I've tried using \# and it didn't work. I'll look into the [NE] flag.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved