Forum Moderators: Robert Charlton & goodroi


Will a 503 Response Fix This Duplicate URL Problem?

         

stateless

4:50 pm on Oct 9, 2006 (gmt 0)

10+ Year Member



I have a site of around 1000 pages and, as I'm sure is often the case, I have come to rely on google traffic to an extent. Sadly, during the recent data refresh my site was badly hit. Previously, in the update of a few months ago, I drifted from place 4 to around 12 for my most important keyword combination, though only for around a day. This blip aside, my position for the combo has been improving for two years in a category where there rarely seem to be sudden movements.

This time, however, I've had no such luck, and I've moved from 4th to 15th position for the keyword combo, while all the others appear to have remained in place (lucky them). I am aware that this could be due to a number of factors, but I have also now discovered that I have a considerable duplication problem, and I think this is possibly the prime suspect in my slip down the rankings.

The problem is this: yes, I have the usual www.blah.com / blah.com duplication problem, but I also have one that's made the problem much, much worse. It seems that because of the way my nameserver has been set up, any prefix on my domain name leads to the same content. I can illustrate this with another site I've found with the same issue:

http://www.example.com/page.html

http://httpwww.example.com/page.html

http://w2ww.example.com/page.html

Basically, anything you place before the domain leads to the same page. This is a problem waiting to happen: in the case of my site, I assume that people on message boards etc. occasionally get the 'www' wrong, and as a result google has spidered hundreds of pages which are duplicate content. I have changed the nameserver settings so that anything other than www.example.com and example.com now gives an error, and in google sitemaps I've set my preferred domain as 'www', so I assume the www / non-www issue will now be able to resolve itself.

Due to the way my nameserver is set up, it seems to be difficult for me to get a 404 in place (http://gsitecrawler.com/tools/Server-Status.aspx lists my error as Result code: 503 (ServiceUnavailable / Service Unavailable)), so I am wondering whether this will suffice. I am aware that I could instead put a 301 redirect in place, but do I really want to confuse google by stating that 3 different prefixes (ww., w2w. and so on) all lead to the same page? Couldn't that result in a duplication problem in itself? Is there any way I can remove the duplicate content to minimise the damage?

[edited by: tedster at 5:56 pm (utc) on Oct. 9, 2006]
[edit reason] use example.com [/edit]

g1smd

6:45 pm on Oct 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



A 301 redirect will remove all the duplicates by saying "that content isn't here". The other stuff will show up as Supplemental for a while, but your redirect will get the visitor to the correct place and halt the propagation of incorrectly formatted URLs.

I rarely use this particular form, but for you, this time, it is the correct one:

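RewriteEngine on

# Index pages (index.html or index.htm, in the root or any folder) go to the folder URL on www
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html?\ HTTP/ [NC]
RewriteRule ^(.*)index\.html?$ http://www.example.com/$1 [R=301,L]

# Any request whose host does not begin with www.example.com goes to the same path on www
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]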

First, this forces all index pages, both index.html and index.htm to "/" for both non-www and www, and forces them all to be on www. The redirect works for index pages both in the root and in any folders, and the redirect preserves the folder name in the redirect.

Secondly, for all pages that are not expressly on www the other redirect forces the domain to be www. This second directive is never used by index pages as the first directive will have already converted all of them. This directive works for all URLs that do not begin www...

This second directive is the most important part of fixing the problem. The single ! is vital in that code.
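If it helps to see what those two directives do to a given request, here is a rough Python sketch that mimics them. It is only a model for reasoning about the rules, not something the server runs: the function name redirect_target is made up for illustration, it assumes the rules live in a root-level .htaccess (so the leading slash is stripped before the pattern is matched), and it ignores query strings.

```python
import re
from typing import Optional

CANONICAL_HOST = "www.example.com"  # assumption: the host used in the rules above

# Mirrors the first RewriteCond: the request line asked for an index.htm(l) file
INDEX_REQUEST = re.compile(r"^[A-Z]{3,9} /.*index\.html? HTTP/", re.IGNORECASE)
# Mirrors the first RewriteRule pattern (path without its leading slash)
INDEX_PATH = re.compile(r"^(.*)index\.html?$")
# Mirrors the second RewriteCond before negation: host begins with www.example.com
WWW_HOST = re.compile(r"^www\.example\.com", re.IGNORECASE)


def redirect_target(host: str, path: str, method: str = "GET") -> Optional[str]:
    """Return the 301 target for a request, or None if it is served as-is.

    Hypothetical helper for illustration only; query strings are ignored.
    """
    request_line = f"{method} {path} HTTP/1.1"
    # In a root .htaccess the rule pattern sees the path without the leading slash
    relative = path[1:] if path.startswith("/") else path

    # Rule 1: index.html / index.htm in any folder -> the folder URL on www
    if INDEX_REQUEST.search(request_line):
        match = INDEX_PATH.match(relative)
        if match:
            return f"http://{CANONICAL_HOST}/{match.group(1)}"

    # Rule 2: the "!" case, i.e. any host that does NOT begin with www.example.com
    if not WWW_HOST.search(host):
        return f"http://{CANONICAL_HOST}/{relative}"

    return None  # already on www and not an index page: no redirect


for host in ("example.com", "httpwww.example.com", "w2ww.example.com", "www.example.com"):
    print(host, "->", redirect_target(host, "/page.html"))
```

Every wrong prefix ends up on the single www URL in one hop, and only the www request is left alone, which is why the duplicates collapse rather than multiply.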

stateless

8:32 pm on Oct 9, 2006 (gmt 0)

10+ Year Member



Thanks very much for the advice. I will implement it, and hopefully this will solve the problem. Out of curiosity, though, can you tell me what you think would happen to the duplicate pages if I left them as they are (with the 503 error being displayed)? Would they eventually disappear anyway?

Also, a possible development here. I talked about how my results suddenly re-appeared during the last google algo change a few months back. Well, just now, for the first time in a good few days, I've noticed that I'm back up in a few of the datacentres:

216.239.57.147
216.239.57.99
216.239.57.104
66.249.91.107
66.249.93.107

For me at least these ones appear to be displaying slightly different results, and all in all there seem to be around three different sets of results (on mcdar). As has been said previously, this isn't over yet.

g1smd

8:41 pm on Oct 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I have no idea what Google will do with a 503 response, but I do know what they do with 200 (index), 301 (de-index), and 404 (de-index). And 302 is always risky.

By the way, the data across the various IPs within one Class-C block is almost always identical, so I would just check a few Class-C blocks and stick to just one IP on each, e.g. x.x.x.104 or x.x.x.99 or your choice [webmasterworld.com].
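As a quick sketch of that, assuming a Class-C block simply means the first three octets of the address, something like the following would trim the list of datacentre IPs quoted above down to one per block. The function name one_ip_per_class_c is just for illustration.

```python
from collections import defaultdict

# Datacentre IPs quoted earlier in this thread
datacentre_ips = [
    "216.239.57.147",
    "216.239.57.99",
    "216.239.57.104",
    "66.249.91.107",
    "66.249.93.107",
]


def one_ip_per_class_c(addresses):
    """Group IPv4 addresses by their first three octets; keep the first seen in each group."""
    blocks = defaultdict(list)
    for ip in addresses:
        prefix = ip.rsplit(".", 1)[0]  # drop the last octet, e.g. "216.239.57"
        blocks[prefix].append(ip)
    return {prefix: members[0] for prefix, members in blocks.items()}


for prefix, ip in one_ip_per_class_c(datacentre_ips).items():
    print(f"{prefix}.x -> check {ip}")
```

For the five addresses listed, that leaves three blocks to check instead of five separate IPs.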