One of my clients has a 2,000 page website and has begun developing another domain. But some <expletive> took a good chunk of the copy intended for the new domain and put it on the old domain.
So that content is now indexed and showing up in the SERPs - and their content editor is being more than a bit dense about allowing it to be removed/replaced.
When the new domain launches in a few weeks, there's no way I want it to compete against duplicate copy on the old PR7 domain.
In this situation, would you suggest:
1. robots.txt
2. robots meta tag
3. Google's removal process
4. some combination of 1,2,3
5. something else altogether
5. nothing at all
It's a bit too complex for a simple re-direct. The outlaw pages were placed in a highly integrated position in the site's information architecture - so we don't want to send those visitors to a different domain, we want them to have the full nav template for the existing domain.
Have you heard or seen problems using Google's "remove" process?
> 1. robots.txt
This would save on bandwidth (not an issue, I expect) and will (eventually) remove the problem of near-duplicate pages. It would still leave Google cluttered with URL-only listings if there are links to the pages.
> 2. robots meta tag
Although you have to wait for the page to be re-crawled, this completely removes the listing from Google.
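For anyone wanting to confirm the tag is actually in place on all 100-odd pages before waiting for the re-crawl, here is a minimal Python sketch of such a check. The `SAMPLE_PAGE` markup and the helper names are made up for illustration; only the `<meta name="robots" content="noindex,...">` form itself is the standard tag being discussed.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page source, for illustration only.
SAMPLE_PAGE = """<html><head>
<title>Example page</title>
<meta name="robots" content="noindex,follow">
</head><body>...</body></html>"""

print(has_noindex(SAMPLE_PAGE))
```

Using `noindex,follow` rather than `noindex,nofollow` is an assumption on my part: it asks for the page to be dropped from the index while the links on it are still followed.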
> 3. Google's removal process
This would be time consuming, and I would consider it overkill for cases where there isn't some legal/embarrassment issue with the content.
> 4. some combination of 1,2,3
The removal process requires either /robots.txt or robots meta exclusion.
There is no point using the robots meta tag if you also exclude the pages in /robots.txt, as the URLs will never be fetched, so Google will never see the tag and won't know to exclude them (as URL-only listings) from the index.
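The conflict between the two methods can be checked mechanically. This is a rough sketch using Python's standard `urllib.robotparser`; the `/new-copy/` directory, the example.com URLs and the function names are assumptions for illustration, not anything from the actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the directory holding the copied pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /new-copy/
"""


def crawler_may_fetch(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets the given user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def noindex_can_take_effect(robots_txt: str, url: str) -> bool:
    """A robots meta tag is only seen if the page itself may be fetched."""
    return crawler_may_fetch(robots_txt, url)


blocked = "http://www.example.com/new-copy/page1.html"
open_page = "http://www.example.com/about.html"

print(noindex_can_take_effect(ROBOTS_TXT, blocked))    # blocked by robots.txt
print(noindex_can_take_effect(ROBOTS_TXT, open_page))  # fetchable
```

If the first check comes back false for a page you want dropped via the meta tag, the robots.txt rule would have to be lifted for that page first.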
> 5. something else altogether
A 'This page has moved' page, with a large text link. You'd get to keep the PageRank (losing only one thirtieth of one notch on the Toolbar) but it's an inelegant approach and some small percentage of human visitors would not follow the link.
I'd use 301 redirects. Usually, Google do the right thing and list the redirect destination URL, assigning it the links and PR of the redirect source URL.
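For reference, a 301 is just a response with that status code and a Location header pointing at the new URL. Here is a minimal sketch in Python, assuming the moved pages all sit under one directory; `NEW_DOMAIN`, `MOVED_PREFIX` and the handler are hypothetical, and on a real site this would normally be done in the web server's own configuration rather than in application code.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional, Tuple

# Hypothetical values, for illustration only.
NEW_DOMAIN = "http://www.new-domain.example"
MOVED_PREFIX = "/new-copy/"


def redirect_for(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, new URL) for a moved page, or None if the page stays put."""
    if path.startswith(MOVED_PREFIX):
        return 301, NEW_DOMAIN + "/" + path[len(MOVED_PREFIX):]
    return None


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers moved pages with a permanent redirect, everything else with 404."""

    def do_GET(self):
        target = redirect_for(self.path)
        if target is None:
            self.send_error(404)
            return
        status, location = target
        self.send_response(status)
        self.send_header("Location", location)
        self.end_headers()
```

Here `redirect_for("/new-copy/page1.html")` gives `(301, "http://www.new-domain.example/page1.html")`, while any path outside the moved directory gives `None`.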
> 5. nothing at all
Not a disaster. The 'duplicate content penalty' is a myth IMO but you may find the pages on the old domain listed instead of the pages on the new domain. It fails to give the new site the link benefit of the old.
GG has stated:
"If a page is in robots.txt, we won't crawl it, but we can still return it as a search result if we have good evidence that the page is relevant to a query. In this case, we'll return just the url (no title and no cached page because we didn't fetch the page itself).
If you don't want the page to show up at all, you can guarantee that by letting Google see the noindex meta tag by fetching that page."
Regards
Ray
EW
The main job right now is not to let the large, established site grab the rankings that we want to see for the new domain. I have no concern about a 'penalty' as such, just the competition where Google may choose only to list one domain.
Eventually (who knows how soon) we'll replace the content that should never have been published in the first place and then lift the robots meta tag (I think that's all I'm going to do here). We'll probably create new file names for the new content.
This was a strange one - the so-called content creator somehow found this content on the organization's network, buried in their CMS somewhere. He didn't know why it was being developed, and probably thought it was an abandoned project. So he published it claiming he wrote it, and the content manager didn't know any better. He was paid for it, but is no longer on the job.
All this proves my theory about CMS - no content management system can be any better than the content manager. And many times it's a better manager, and not a better CMS, that is the real need.
By the way - this is over 100 pages worth of content, essentially most of the new website which has been painstakingly developed over 18 months. It's intended to be an important site for this organization going forward for many years, and tied into print and broadcast marketing campaigns.