Forum Moderators: Robert Charlton & goodroi

Duplicate Content or August Update?

         

fritzbayer

12:59 pm on Sep 9, 2007 (gmt 0)

10+ Year Member



Hi,

I have been running a website for a couple of years now. It ranked well for its main, relevant two-keyword combinations.

The main domain carries most of the backlinks, and looks something like "www.word1word2.com", but there are other domains like "www.word1-word2.de" (with a minus sign) as well.

All across the website I link to internal pages with URLs that start with a slash followed by the page path, leaving out the domain name. So all my URLs look like this:

/info/contact.html
/help.html
/index.html
/offers/59385.html

Until recently, when I ran the site: command on each of my domain names, I got identical search results. So for example

site:www.word1word2.com
site:word1word2.com
site:www.word1-word2.com
site:word1-word2.com

would all yield the same search results. I always wondered whether that could be a problem, because I had read about duplicate content, but Google always understood that www.word1word2.com (without the minus sign) is the main domain name and that all the others are just supplements.

In the middle of August I added a new HTML page. I linked to this page from the homepage, but I forgot to prepend the slash (/). So I was linking from the homepage to the new page as follows:

cooperations.html

instead of

/cooperations.html

A few days later all subpages of the main domain (www.word1word2.com) disappeared. The only two pages that currently show up in the search results are the homepage and the newly added page (cooperations.html). All other pages no longer show up. So a site:www.word1word2.com yields:

www.word1word2.com/
www.word1word2.com/cooperations.html

So I wonder whether forgetting the leading slash triggered the disappearance of all those pages for the main domain name, because running site:www.word1-word2.com or site:word1-word2.com still shows all the pages that have gone missing. However, they rank much worse.

So my question is: do I have a problem with duplicate content, or is this the same thing that happened to many other people in August (I have read that pages basically disappeared for no reason)?

g1smd

5:48 pm on Sep 9, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You have Duplicate Content in spades. You got away with it until now. We'll likely never know what triggered the change in indexing.

The only way to fix this now is to set up a set of site-wide 301 redirects so that only one canonical domain can ever be indexed.

I always use root relative URLs for links and let the redirects sort out the correct domain for indexing if the wrong one is entered.

I often use links to www.domain.com/, without mentioning the index file filename in the link to make sure things start off right from the correct root.

fritzbayer

6:53 pm on Sep 9, 2007 (gmt 0)

10+ Year Member



You have Duplicate Content in spades. You got away with it until now. We'll likely never know what triggered the change in indexing.

Well, the only thing I changed was adding this new page while leaving out the slash. So that must have triggered it, because the site has been in this state for two years.

The only way to fix this now is to set up a set of site-wide 301 redirects so that only one canonical domain can ever be indexed.

I will do that and let you know if it fixes the problem. How long does it usually take until things are back to normal?

I always use root relative URLs for links and let the redirects sort out the correct domain for indexing if the wrong one is entered.

What do you mean by "root relative URLs"? Could you give me a couple of examples?

I often use links to www.domain.com/, without mentioning the index file filename in the link to make sure things start off right from the correct root.

Do you mean that on your website you always link to www.yourdomain.com/ and not to www.yourdomain.com or www.yourdomain.com/index.html?

g1smd

7:15 pm on Sep 9, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The Duplicate Content comes from having multiple domains, and having both www and non-www active, and all returning a "200 OK" status.

The redirect will fix it. The correct URLs should be fully indexed within a few months. The unwanted URLs will continue to show (as Supplemental Results) for up to a year. THAT is not a problem as they will still deliver visitors to your sites, and your redirect will take them to the correct URL.
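
As a rough illustration of what the site-wide redirect has to decide, here is a minimal Python sketch. It assumes that www.word1word2.com is the canonical host and that the site is served over plain http; the function name canonical_redirect is made up for this example. In practice the rule would normally live in the web server configuration rather than in application code.

```python
CANONICAL_HOST = "www.word1word2.com"  # assumption: the one domain that should stay indexed


def canonical_redirect(host, path, scheme="http"):
    """Return (301, location) when the request host is not canonical, else None."""
    # Host headers are case-insensitive and may carry a port, so normalise both.
    hostname = host.lower().split(":", 1)[0]
    if hostname == CANONICAL_HOST:
        return None  # already on the canonical domain: serve the page with 200 OK
    # Keep the requested path so every page maps to its counterpart on the canonical host.
    return 301, f"{scheme}://{CANONICAL_HOST}{path}"


for host in ("word1word2.com", "www.word1-word2.de", "word1-word2.de", "www.word1word2.com"):
    print(host, "->", canonical_redirect(host, "/offers/59385.html"))
```

Every non-canonical variant answers with a 301 pointing at the same path on the canonical host, and only the canonical host returns the page itself.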

.

Root relative URLs start with a "/" and do not specify the domain. They are absolute within a domain. I use those. The domain does need to be specified somewhere, otherwise all four variations can become indexed. The redirect fixes that.

Fully relative URLs, like "somefile.html" (without a "/"), or "../../otherfile.html" (with relative ".." paths) are very problematical. Do not use those.

Absolute URLs specify the full domain and the full server path and filename. Those are the best, but use too many characters. They bulk up the code too much. Root relative combined with the redirects can do the same job.
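
To see how the three link styles behave, here is a small Python sketch using the standard library's urljoin, which resolves a link against the URL of the page it appears on. The http scheme and the helper name resolve are assumptions for the example; the hostnames and paths are the placeholders used earlier in this thread.

```python
from urllib.parse import urljoin


def resolve(page_url, href):
    """Resolve a link (href) as found on the page at page_url."""
    return urljoin(page_url, href)


home = "http://www.word1word2.com/"
deep = "http://www.word1word2.com/offers/59385.html"
other = "http://word1-word2.de/info/contact.html"

# Root relative: same path from any page, but it stays on whatever host served the page.
print(resolve(deep, "/cooperations.html"))   # http://www.word1word2.com/cooperations.html
print(resolve(other, "/cooperations.html"))  # http://word1-word2.de/cooperations.html

# Fully relative: the result depends on the directory of the linking page.
print(resolve(home, "cooperations.html"))    # http://www.word1word2.com/cooperations.html
print(resolve(deep, "cooperations.html"))    # http://www.word1word2.com/offers/cooperations.html
print(resolve(deep, "../help.html"))         # http://www.word1word2.com/help.html

# Absolute: always the same target, whichever page or host carries the link.
print(resolve(other, "http://www.word1word2.com/help.html"))
```

Note that from the homepage itself, "cooperations.html" and "/cooperations.html" resolve to the same URL. The two forms only diverge when the link sits on a page inside a subdirectory.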

.

Link to "/" or to "www.domain.com/" for the root index page. I prefer the latter as it channels the correct domain. I never include the index file filename in any link. Including the filename is creating Duplicate Content. Including the filename in the link also creates problems when you change technology at some time in the future.
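
A minimal Python sketch of the same idea, for checking or rewriting link targets so that the index filename never appears. The list of index filenames and the function name strip_index are assumptions for the example, and it expects plain request paths that start with "/" and carry no query string.

```python
INDEX_NAMES = ("index.html", "index.htm", "index.php")  # assumption: typical index filenames


def strip_index(path):
    """Drop a trailing index filename so the link points at the directory URL."""
    head, _, tail = path.rpartition("/")
    if tail in INDEX_NAMES:
        return head + "/"
    return path


print(strip_index("/index.html"))         # /
print(strip_index("/offers/index.html"))  # /offers/
print(strip_index("/help.html"))          # /help.html
```

With this, "/index.html" and "/" collapse into one URL, and a later change from index.html to index.php does not alter any link.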

This thread may be useful: [webmasterworld.com...]