Forum Moderators: phranque


Duplicate content issue with dynamic sites

         

Marcia

10:41 am on Aug 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This seems to be a recurring problem, and it's been getting more serious lately, causing Google indexing problems for a lot of sites. Here's one currently being discussed in the Google forum, more than likely due to this phenomenon:

[webmasterworld.com...]

Here's a thread I started a couple of weeks ago regarding an ASP site:

[webmasterworld.com...]

And it's come up again, this time with two different Content Management Systems, both written in PHP and both using search-engine-friendly URLs via mod_rewrite:

[webmasterworld.com...]

It does not seem to be restricted to one particular platform or language, and it will get more and more dynamic sites into deep trouble, with owners not even knowing until it's too late, unless there's some insight into a solution. Are there any arrows pointing to a fix? Or is the only solution to avoid dynamic sites altogether, or to exclude databases from being indexed and laboriously hand-roll content pages?

Birdman

11:56 am on Aug 22, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>>>is the only solution to avoid dynamic sites altogether

I don't think so. The problems lie in the coding of the content management systems, not in the idea of dynamically generated websites.

I think the key is to 'roll your own' CMS or have a good programmer do it for you and always carefully check all the links on the pages before calling it a website and going live.

Yes, I know that sounds cumbersome for some of these large sites with thousands of pages, but it really isn't. You only need to test a percentage of the pages for broken/duplicate links. I usually dig down into the (fake) directory structure, checking all the links along the way. Do this a few times, and if you haven't found any problems you should be fine.
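The "dig down and spot-check" idea above can be sketched with a small stdlib-only crawler. This is just an illustration, not code from any of the CMSes under discussion: it follows internal links a few levels deep and flags URLs that differ only in case or in underscore-vs-dash style.

```python
# Rough sketch: follow internal links a few levels deep and flag URLs
# that normalize to the same key (case or separator variants).
# The depth and normalization rules are illustrative assumptions.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkParser(HTMLParser):
    """Collect href values from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def spot_check(url, seen, depth=3):
    """Recursively check links, reporting suspected duplicate URLs."""
    if depth == 0:
        return
    parser = LinkParser()
    parser.feed(urlopen(url).read().decode("utf-8", "replace"))
    host = urlparse(url).netloc
    for href in parser.links:
        absolute = urljoin(url, href)
        if urlparse(absolute).netloc != host:
            continue  # stay on our own site
        key = urlparse(absolute).path.lower().replace("_", "-")
        if key in seen and seen[key] != absolute:
            print("Possible duplicate:", absolute, "vs", seen[key])
        elif key not in seen:
            seen[key] = absolute
            spot_check(absolute, seen, depth - 1)
```

Run it against a few starting pages scattered around the site rather than just the home page, so the sample covers different branches of the fake directory tree.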

Some of the common errors are:

1) upper/lower case versions of the same link
/San-Antonio/
/san-antonio/

2) mixing underscores and dashes
/San-Antonio/
/San_Antonio/

3) relative paths that get stacked
Imagine you're at /San-Antonio/widgets.php and you have a link on the page like <a href="Washington/widgets.php">

You end up at /San-Antonio/Washington/widgets.php

Not good!
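The three errors above are easy to demonstrate. A minimal sketch (hostnames and paths are just examples): the standard URL-joining rules produce exactly the stacked path in error 3, and a tiny normalizer makes the case and separator variants from errors 1 and 2 collide so they can be detected.

```python
from urllib.parse import urljoin

# Error 3: a relative link on /San-Antonio/widgets.php resolves against
# the current directory, not the site root, so the paths "stack".
base = "http://example.com/San-Antonio/widgets.php"
stacked = urljoin(base, "Washington/widgets.php")
# -> http://example.com/San-Antonio/Washington/widgets.php

def canonical_key(path: str) -> str:
    """Lowercase and unify separators so the variants from
    errors 1 and 2 map to the same key."""
    return path.lower().replace("_", "-")

# Error 1: upper/lower case variants collide under the key.
assert canonical_key("/San-Antonio/") == canonical_key("/san-antonio/")
# Error 2: underscore/dash variants collide too.
assert canonical_key("/San-Antonio/") == canonical_key("/San_Antonio/")
```

The fix for error 3 is root-relative links (`<a href="/Washington/widgets.php">`), which resolve the same way no matter which page they appear on.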

Either way, it's the programmer's fault so avoiding dynamic websites altogether is not the best solution. Just use some of the time you save with dynamic sites to peruse your links for errors and dupes.

Another thought is to use a link indexer on your site and get a total count of unique URLs. Then query your database to get a count of how many there SHOULD be. If the numbers are way off then you know you have a problem.
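That count comparison could look something like the following sketch. The `cities` table name and the 5% tolerance are made-up assumptions for illustration; substitute your own schema and threshold.

```python
# Hypothetical sketch: compare the number of unique URLs a crawl found
# against how many pages the database says should exist.
import sqlite3

def expected_page_count(conn: sqlite3.Connection) -> int:
    """One page per city row; add any static pages you know about."""
    (rows,) = conn.execute("SELECT COUNT(*) FROM cities").fetchone()
    return rows

def compare_counts(crawled_urls, conn, tolerance=0.05):
    """Warn when crawled and expected counts diverge by more than the tolerance."""
    expected = expected_page_count(conn)
    found = len(set(crawled_urls))
    if abs(found - expected) > expected * tolerance:
        print(f"Mismatch: crawl found {found} URLs, database expects {expected}")
    return found, expected
```

A count that is far too high usually means duplicate URLs are being generated; far too low usually means broken links are hiding pages from the crawler.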

Marty

Reflect

3:08 pm on Aug 23, 2004 (gmt 0)

10+ Year Member



avoid dynamic sites altogether

Loooong day here already, so excuse me if I misunderstand.

I have seen a lot of posts lately about duplicate content caused by affiliate data feeds. It seems that if you manipulate the feed so the content is a tad more unique, it helps boost the site back up in the SERPs.

I have not verified or done this myself, as my affiliate sites don't use feeds, but I have read it several times over.

Take care,

Brian

RonPK

12:19 pm on Aug 24, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is this about /news.php?id=123 and /news/123.html pointing to the same content?

I'm working on a couple of CMS-ish projects that should only use SE- and user-friendly URLs. It really isn't that hard to do: all the programmer has to do is always run local hyperlinks through a conversion function before sending them to output.
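One way such a conversion function might look, as a hedged sketch (the function name, URL scheme, and slug rules here are my assumptions, not from any real CMS): every internal link is generated through one helper, so only the friendly form ever appears in the output HTML.

```python
import re

def friendly_url(article_id: int, title: str) -> str:
    """Turn an id and title into one canonical SE-friendly URL,
    e.g. (123, 'News Title!') -> '/news/123-news-title.html'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"/news/{article_id}-{slug}.html"
```

A mod_rewrite rule can then map `/news/123-news-title.html` back to `/news.php?id=123` internally, so the query-string form is never linked anywhere and never gets indexed alongside the friendly one.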

So IMHO it's all about keeping a clear head...