Unintentional Website Duplication

         

languageusa

1:07 am on Aug 29, 2007 (gmt 0)

10+ Year Member



Over two months ago I moved MysiteDotCom to a new host. This host uses the cPanel control panel on its shared Linux plan. As part of the setup routine, I was provided with an alternate URL (due to the way cPanel is set up) that lets me access my website directly through the server URL: ServerDotHostnameDotcom/~username

Something went wrong with the initial account setup and the following occurred:

(1) If one enters my server-based URL ServerDotHostnameDotcom/~username, it comes up with Google’s PR6, which is the PR for MysiteDotCom.

(2) If one checks Google’s cached snapshot of ServerDotHostnameDotcom/~username, the snapshot of MysiteDotCom is displayed, with the note: “This is Google’s cache of MysiteDotCom as retrieved on [date/time].”

(3) Backlinks for ServerDotHostnameDotcom/~username show the backlinks for MysiteDotCom.

(4) The info command for ServerDotHostnameDotcom/~username shows the results for MysiteDotCom.

Two months ago, soon after MysiteDotCom was moved to that host, Google imposed a penalty on MysiteDotCom: it disappeared from the SERPs for major keyword terms. The problem I described above could be the cause, since it created an unintentional duplicate of the website.

What course should I take?

Thanks!

reprint

4:17 am on Aug 29, 2007 (gmt 0)

10+ Year Member



Assuming that it was a duplicate content issue, 301 redirect ServerDotHostnameDotcom/~username to MysiteDotCom.
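
A minimal sketch of such a redirect, assuming Apache with mod_rewrite enabled and that both URLs are served from the same directory, so a single .htaccess file answers for both. The hostnames server.hostname.example and www.mysite.example are hypothetical stand-ins for ServerDotHostnameDotcom and MysiteDotCom, and the choice of the www form is an assumption:

```apache
# Hypothetical hostnames; substitute the real server hostname and domain.
RewriteEngine On

# Apply only when the request arrived via the server hostname.
RewriteCond %{HTTP_HOST} ^server\.hostname\.example$ [NC]

# Send a permanent (301) redirect to the same path on the real domain.
RewriteRule ^(.*)$ http://www.mysite.example/$1 [R=301,L]
```

Because the condition tests the requested hostname, requests for MysiteDotCom itself are left alone, which avoids a redirect loop.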

languageusa

11:23 am on Aug 29, 2007 (gmt 0)

10+ Year Member



> 301 redirect ServerDotHostnameDotcom/~username to MysiteDotCom

I can place an .htaccess file with the 301 at ServerDotHostnameDotcom/~username, but that URL is not (and was not) visited by any robots, and it has no incoming links. How else can I redirect?

reprint

2:25 pm on Aug 29, 2007 (gmt 0)

10+ Year Member



OK, if it hasn't been visited by bots, then you can't have a duplicate content issue. A bot would have to have seen two or more URLs with the same content, then kept one URL and dropped the others.

tedster

4:55 pm on Aug 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The bots do not directly download your .htaccess file. A bot requests a URL from a domain, and if an .htaccess file applies to that URL, the server uses its rules to craft the response it sends to the bot.

reprint

5:30 pm on Aug 29, 2007 (gmt 0)

10+ Year Member



You say that for ServerDotHostnameDotcom/~username there is a cache, backlinks, etc. in Google. So it sounds to me like ServerDotHostnameDotcom/~username was indeed visited by bots and indexed, and MysiteDotCom was dropped as duplicate content. But that's just a guess based on what you have said. Someone else might have another idea.

The steps I would take: 301 redirect as I said, and make sure your site is in Webmaster Tools with the preferred address set and a sitemap submitted. You can also put a noindex in the robots file for ServerDotHostnameDotcom/~username in case you have an inadvertent link.

It may take time for ServerDotHostnameDotcom/~username to get dropped.

Tedster,
I think they meant that ServerDotHostnameDotcom/~username wasn't visited by bots, not the .htaccess file.

g1smd

7:29 pm on Aug 29, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You will also need to fix www and non-www issues and make sure that there are no other indexing holes that might be found by the bots.
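
One common way to close the www/non-www hole is a catch-all rule that redirects every hostname other than the preferred one. This is only a sketch, assuming Apache with mod_rewrite; www.mysite.example is a hypothetical stand-in for MysiteDotCom, and treating the www form as the preferred one is an assumption:

```apache
# Hypothetical hostname; substitute the preferred form of the real domain.
RewriteEngine On

# Skip requests that carry no Host header at all (old HTTP/1.0 clients).
RewriteCond %{HTTP_HOST} !^$

# Match any hostname that is not the preferred one.
RewriteCond %{HTTP_HOST} !^www\.mysite\.example$ [NC]

# Permanently (301) redirect to the same path on the preferred hostname.
RewriteRule ^(.*)$ http://www.mysite.example/$1 [R=301,L]
```

Since the rule matches every non-preferred hostname, it would catch the bare domain and, if the same .htaccess file serves it, the ServerDotHostnameDotcom/~username address as well.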

languageusa

8:37 pm on Aug 29, 2007 (gmt 0)

10+ Year Member



REPRINT:
> You say for ServerDotHostnameDotcom/~username there is a cache, backlinks etc. in Google, so it sounds to me like ServerDotHostnameDotcom/~username was indeed visited by bots and was indexed and MysiteDotCom was dropped as duplicate content, but that's just a guess based on what you have said.

There is a cache for ServerDotHostnameDotcom/~username in Google, but instead of ServerDotHostnameDotcom/~username's own cache, the cache for MysiteDotCom is shown. That's my problem. Same with backlinks for ServerDotHostnameDotcom/~username -- the backlinks for MysiteDotCom are shown. Same with PR. ServerDotHostnameDotcom/~username comes up with PR6 -- that's MysiteDotCom's PR. Any info you request about ServerDotHostnameDotcom/~username in Google returns the info belonging to MysiteDotCom.

ServerDotHostnameDotcom/~username has existed for just several weeks (the domain MysiteDotCom, on the other hand, is 11 years old), and it has no incoming links or anything. It does not exist separately from MysiteDotCom. The problem is that Google thinks that ServerDotHostnameDotcom/~username is, in fact, MysiteDotCom, and when asked about ServerDotHostnameDotcom/~username it reports the info belonging to MysiteDotCom.

How do I get out of this mess? How do I completely drop ServerDotHostnameDotcom/~username? As I said, I am definitely being penalized, and this mess could be the cause.

reprint

9:43 pm on Aug 29, 2007 (gmt 0)

10+ Year Member



All the advice you are getting in this post and others for this issue still stands.

OK, let's go through it. For a period of time ServerDotHostnameDotcom/~username was online alongside MysiteDotCom and was probably acting as your website, or at least appeared so to the bots that visited during that time. Why this happened I don't know: "something went wrong with the initial account setup" (maybe DNS?). But it wouldn't surprise me if you had the files from MysiteDotCom in the ServerDotHostnameDotcom/~username directory, right?

When the bot sees two URLs with the same content, it has to choose one and drop the other. It sounds like, at least for a period of time, it chose ServerDotHostnameDotcom/~username and dropped MysiteDotCom as a duplicate. The why doesn't really matter at this stage.

The problem may already be resolved, or about to be, since the cache takes time to clear, as does toolbar PR if that's what you are talking about. Again, the steps outlined above should help address those issues if they were the cause of the penalty.