
Sitemaps, Meta Data, and robots.txt Forum

    
Allow Google to crawl only the primary domains
How to achieve exclusion on multiple domains using obj.conf & robots.txt
justlonging1

5+ Year Member



 
Msg#: 3353374 posted 11:45 am on May 30, 2007 (gmt 0)

I have a host-based domain name configuration on an iPlanet Web Server. A few primary domains and many secondary domains point to the same web server.

How can I allow just Google to index only the primary domains (e.g. www.primarydomains.com/robots.txt) and not the roughly 300 secondary domains, using obj.conf and robots.txt? I have just one document root directory, under which robots.txt resides.

How could I achieve the following using obj.conf (iPlanet) and robots.txt:

1. All primary domains would serve robots.txt_a, which has a rule allowing only Google to crawl.
2. All secondary domains would serve robots.txt_b, which has a rule blocking all crawlers.
(Example contents for the two files are sketched below.)
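The rules themselves are standard robots.txt and nothing iPlanet-specific. A minimal sketch of the two files (the _a/_b names are just the labels used above; an empty Disallow line means "allow everything") could be:

# robots.txt_a -- primary domains: allow only Googlebot
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /

# robots.txt_b -- secondary domains: block all compliant crawlers
User-agent: *
Disallow: /

Bear in mind that robots.txt only stops compliant crawlers from fetching pages; it does not by itself remove URLs that are already indexed.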

 

Quadrille

WebmasterWorld Senior Member quadrille us a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 3353374 posted 12:06 pm on May 30, 2007 (gmt 0)

Remember that if you are adding the robots.txt to already-existing sites, it is not a perfect way to make a site invisible; this is especially true if Google is already aware of links between such sites.

You may also need to consider the 'removal tool' - but that, too, can have disadvantages.

What exactly are you trying to achieve?

justlonging1

5+ Year Member



 
Msg#: 3353374 posted 12:35 pm on May 30, 2007 (gmt 0)

I am looking to achieve the following:
1. Primary domains abc.com and cde.com, which share a common doc root, should read /robots.txt_a, which has a rule allowing Googlebot to crawl.
2. All secondary domains, e.g. xyz.com (actually an alias of abc.com), should read /robots.txt_b, which has a rule blocking all crawlers.

This is to consolidate search results going forward and to improve search visibility for the primary domains.
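A minimal obj.conf sketch for that mapping, assuming Sun ONE/iPlanet 6.x syntax (the <Client> tag matched on urlhost plus the pfx2dir NameTrans SAF), might look like the following. The host names are the examples above, the /web/robots/... paths are placeholders, and the exact parameter names should be checked against the SAF reference for the server release in use:

<Object name="default">

# Primary domains: map /robots.txt onto the "Googlebot only" file
<Client urlhost="(www.abc.com|www.cde.com)">
NameTrans fn="pfx2dir" from="/robots.txt" dir="/web/robots/primary/robots.txt"
</Client>

# Any host that did not match above (the ~300 secondary aliases):
# map /robots.txt onto the "block everything" file
NameTrans fn="pfx2dir" from="/robots.txt" dir="/web/robots/secondary/robots.txt"

# ... existing NameTrans / PathCheck / Service directives continue here ...
NameTrans fn="document-root" root="$docroot"

</Object>

The idea is that NameTrans directives are tried in order and the first one that succeeds ends the stage: a request for /robots.txt on a primary host is rewritten by the <Client>-wrapped rule and never reaches the catch-all line, while every other host falls through to the robots.txt_b mapping. All other URLs fall through both pfx2dir rules to the normal document-root translation.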
