Google SEO News and Discussion Forum

How to deal with the Duplicate Content Factory?

Msg#: 31337 posted 4:19 pm on Sep 23, 2005 (gmt 0)

The duplicate content factory (previously known as Google) has been working overtime on my most important site.

In the past few days I have encountered a wide range of directory and filename modifications: changing to all lowercase, changing to all uppercase, and selectively capitalizing the first letter (thought you were safe with all lowercase, huh!). I've also found directories that sit at the same level on my server shown nested one within the other in the DCF index! What a mess!

With a 3-tier directory structure, this adds up to a lot of duplicate content given the range of variations used. If the DCF thinks this requires a penalty, and I certainly do, it should start beating itself with a very big stick right now!

If you're going to claim to be a search engine and take content for free from other companies' websites, then the very least you can do is ensure that you represent those companies' websites accurately in terms of content and structure.

Although time consuming (and at the expense of improving the site for visitors), the DCF's creative input to my directory and file names is relatively easily dealt with, but the real problem that I could do with some advice on is how to deal with filename repetition and filename appending in the DCF index.

This is the kind of thing I'm seeing:


plus it will append a query string when it feels like it.

Neither cgi.script_name nor cgi.query_string will detect the trailing forward slash after the first filename. The ones with a query string I am able to nail, but I cannot see any way at present to deal with two filenames separated by a forward slash.

I'm on an NT/CFM server, and in all other cases I'm checking the script_name against what I know it should be and delivering a page with a robots meta tag when they don't match.
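To make that check concrete, here is a rough sketch in Python rather than CFML, just to show the logic. The paths in CANONICAL_PATHS are made up for illustration, and the "noindex,follow" value is an assumption about what the robots meta tag would contain.

```python
# Hypothetical canonical paths; a real site would list its own.
CANONICAL_PATHS = {
    "/widgets/blue/index.cfm",
    "/widgets/red/index.cfm",
}

# Assumed tag content; the exact robots directive is a guess.
NOINDEX_TAG = '<meta name="robots" content="noindex,follow">'


def robots_meta_for(script_name: str) -> str:
    """Return a robots meta tag for non-canonical paths, else an empty string."""
    # Exact, case-sensitive comparison, so /Widgets/Blue/Index.cfm
    # does not count as the canonical /widgets/blue/index.cfm.
    if script_name in CANONICAL_PATHS:
        return ""
    return NOINDEX_TAG
```

Because the comparison is case-sensitive, the all-uppercase and first-letter-uppercase variants all fall through to the tagged version of the page.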

Any ideas on how to detect a consecutive filename combo?
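For what it's worth, if the full requested path were available as a string, the test I'm after might look something like the sketch below (Python, not CFML). It assumes the server exposes the raw path somewhere, which is exactly the part I have not confirmed on this setup; the function name and the rule for what counts as a filename are my own inventions.

```python
import re

# A segment is treated as a "filename" if it ends in a dot plus an extension.
# Caveat: a directory name containing a dot (e.g. "v1.2") would also match.
FILENAME_RE = re.compile(r"^[^/]+\.[A-Za-z0-9]+$")


def has_extra_path_after_filename(raw_path: str) -> bool:
    """True if anything, even a bare slash, follows the first filename segment."""
    # Drop any query string before inspecting the path.
    path = raw_path.split("?", 1)[0]
    segments = path.split("/")
    for i, segment in enumerate(segments):
        if FILENAME_RE.match(segment):
            # A trailing slash leaves an empty final segment, so it is caught too.
            return i < len(segments) - 1
    return False
```

A request flagged by this function could then be given the same robots meta tag treatment as the other mismatches.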

