| How to keep dup content from sub domain out of index
|
francie brady

msg:4117040 | 6:03 pm on Apr 16, 2010 (gmt 0) | I am faced with this situation: the ASP site has a main domain and a subdomain serving the same content. The subdomain is needed so that links coming in from particular affiliates land on a page that contains the appropriate 'buy through this network' information... other than that, the pages are the same as on the main domain. If we put a disallow in the subdomain's robots.txt file, could Google still 'find' and index the dup content via external links? We could put a disallow in the subdomain's robots.txt file AND rel canonical tags on each of the subdomain's pages pointing to the main domain's pages as the original content - would that be the way to do it? Thanks for helping :-)
|
g1smd

msg:4117171 | 12:40 am on Apr 17, 2010 (gmt 0) | The robots.txt disallow stops the pages being crawled, so their content isn't fetched. The URLs can still turn up in SERPs as URL-only entries. Adding the meta robots noindex tag to the pages allows them to be crawled, but nothing about those pages will turn up in SERPs, not even the URL. The rel="canonical" tag would act as a hint to Google to index the other URL version. They don't have to follow that hint, but usually do. Use just one method; they don't combine. In particular, a robots.txt disallow stops Google from fetching the page, so it would never see a noindex or canonical tag placed on it.
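
To illustrate the point about the disallow blocking the fetch itself, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The hostname `affiliates.example.com`, the page `widgets.asp`, and the helper `crawler_may_fetch` are assumptions made up for the example; nothing in the thread names them, and real crawlers may interpret robots.txt with more nuance than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served by the affiliate subdomain:
# a blanket disallow for every crawler.
SUBDOMAIN_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_may_fetch(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if a well-behaved crawler would be allowed to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical subdomain page.
    url = "http://affiliates.example.com/widgets.asp"
    if not crawler_may_fetch(SUBDOMAIN_ROBOTS_TXT, url):
        # The page body is never downloaded, so any meta robots noindex
        # or rel="canonical" tag inside it goes unread.
        print("blocked: tags on this page would never be seen")
```

If the page is blocked here, any tag inside it is invisible to the crawler, which is why the disallow should not be paired with a tag-based method.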
|
francie brady

msg:4117331 | 3:15 pm on Apr 17, 2010 (gmt 0) | Thanks for replying g1smd :-) So it appears that if I dynamically (it's an ASP site) add a meta robots noindex tag to all the subdomain pages, that should do it? I sure don't want to lose the indexing of the main domain's pages, which are identical. However, the main domain's pages wouldn't get the dynamically added noindex meta tag. Does this sound ok? I'm nervous about suggesting an implementation like this unless I'm sure it's correct :-) Thanks :-)
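
The host-based switch described above could look roughly like the following. It is written in Python only as a language-neutral sketch; the site in question is ASP, where the same check would be made against the request's host header. The hostnames and the function name `robots_meta_for_host` are hypothetical and not taken from the thread.

```python
# Hypothetical hostname; substitute the real affiliate subdomain.
AFFILIATE_HOST = "affiliates.example.com"

NOINDEX_TAG = '<meta name="robots" content="noindex">'


def robots_meta_for_host(host: str) -> str:
    """Return the noindex tag for the affiliate subdomain, else an empty string.

    `host` is the value of the request's Host header, which may carry
    a port suffix and mixed case.
    """
    hostname = host.split(":", 1)[0].strip().lower()
    if hostname == AFFILIATE_HOST:
        return NOINDEX_TAG
    # Main domain (or any other host): emit nothing, so indexing is unaffected.
    return ""


if __name__ == "__main__":
    for host in ("www.example.com", "affiliates.example.com"):
        print(host, "->", repr(robots_meta_for_host(host)))
```

Matching on the exact subdomain name, rather than on "anything that is not the main domain", keeps the main domain's pages from picking up the tag by accident if a new hostname is added later.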
|