I need to update the HTML of this .au site, and plan to use the same basic HTML as my real (main) site. Trouble is, I don't want it to do too well and set off a spam algo! Rather than even contemplate this, I wondered if there was something I could put in the robots.txt file (or something else, somewhere else) that would allow me to keep the PR without having Google index the mirrored site's pages, and without affecting my standing with the Aussie search engines that created the need in the first place.
I suspect I am OK, as many sites have regional copies (including Google). And I think that Google simply downgrades / ignores what it sees as the copy (which is not that hard to work out). But, nervous as ever, I wondered if there was a more assured way.
But a deny would presumably cost me the PR I get from the mirror. And simply changing the html a tad will probably get me Spam reported if the mirror site then does well (although, based on what I have seen, I do not think Google could object to it).
This is one of the few times I have seriously considered using a cloaking script to send Google directly to the real site. I am not going to, as I am paranoid about cloaks. I had hoped the .htaccess file could be configured to send certain bots elsewhere. Or that there was a meta tag I could use that would stop Google from indexing the page (but allow them to keep my luvly PR bonus going).
I had hoped the .htaccess file could be configured to send certain bots elsewhere
It can, but that would be considered cloaking. And I don't see how it would help you anyway.
I think you have to either block the mirror from Google entirely, or risk duplicate content.
Personally, I don't think Google would penalize you, if it's obviously a regional mirror. They might just drop pages that are exactly the same.
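For what it's worth, "block the mirror from Google entirely" while leaving the other engines alone could be sketched like this. It's only a rough check, using Python's standard urllib.robotparser to confirm the rules read the way you'd expect. The robots.txt text, the example.com.au URL and the "SomeAussieBot" name are all made up for illustration, and this does nothing for the PR worry raised above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the .au mirror: shut out Googlebot only,
# and leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow:
"""

# Placeholder URL, not the poster's real site.
MIRROR_URL = "http://www.example.com.au/index.html"


def can_fetch(agent: str, url: str) -> bool:
    """Return True if the rules above let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


googlebot_allowed = can_fetch("Googlebot", MIRROR_URL)
other_allowed = can_fetch("SomeAussieBot", MIRROR_URL)  # made-up crawler name

print("Googlebot allowed:", googlebot_allowed)
print("Other crawler allowed:", other_allowed)
```

The empty "Disallow:" line in the second group is the usual way of saying "nothing is off limits" to everyone not named in an earlier group.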
It is unlikely that Google will ever penalize you for duplicate content. Avoid crosslinking the same content to the same content and you will be fine, with no need to add robots.txt, a robots noindex tag or .htaccess rules.
However, look to the future... don't get in the habit of always adding new content to both sites... develop some uniqueness between them
e.g. - add a new page to 1 and link to it from the other, then another new page and reverse the link.
While you are under Google's radar things are great, but at some point your site's success will put you in Google's crosshairs, so avoid total duplication from here on out.