I have two different subdomains that are mirrors of each other.
www.site.com = edit.site.com
I would like to stop search engines from crawling the edit subdomain, but if I add a robots.txt at edit.site.com/robots.txt, it will also show up at www.site.com/robots.txt.
Are there any directives I can add that will apply to just the edit subdomain?
Thanks
edit.site.com/robots.txt and www.site.com/robots.txt are entirely separate as far as crawlers are concerned, and would not confuse any serious search engine.
Having said that, you'd do much better all round to close edit.site.com and 301 redirect it to www.site.com.
It's easier, safer, and much better for you, your site, your visitors and your SERPs.
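For what it's worth, if the server config could be changed, the redirect might look roughly like this in .htaccess. This is only a sketch: it assumes mod_rewrite is enabled, that both hostnames are served from the same document root, and that plain http on www.site.com is the canonical address (swap in https if that is what the site uses).

```apache
# Assumption: mod_rewrite is available and .htaccess overrides are allowed
RewriteEngine On

# Only act on requests that arrived on the edit hostname
RewriteCond %{HTTP_HOST} ^edit\.site\.com$ [NC]

# Send a permanent (301) redirect to the same path on www
RewriteRule ^(.*)$ http://www.site.com/$1 [R=301,L]
```

The RewriteCond is what keeps www.site.com itself from redirecting in a loop, since both hosts would be reading the same .htaccess. If the edit hostname is actually needed for editing, a blanket redirect like this would obviously get in the way.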
There are better ways to set up the server.
Unfortunately, I don't have access to change the setup of this website, but the robots.txt could be changed.
Since the two domains mirror each other, the robots.txt on one domain will mirror the other.
Is there any way robots.txt could be set up to exclude certain subdomains or full paths?
Thanks
Please note, I rewrote the PHP code from what I use to make it more easily readable. So, it should be considered untested code (although I did do a basic test).
.htaccess:
RewriteEngine On
RewriteRule ^robots\.txt$ robots.php [L]
robots.php:
<?php
// Map each scheme + host to the robots.txt body it should be served
$robots = array(
    'https://dev.example.com' => 'User-agent: *
Disallow: /',
    'http://dev.example.com' => 'User-agent: *
Disallow: /',
    'https://www.example.com' => 'User-agent: *
Disallow: /',
    'http://www.example.com' => 'User-agent: *
Disallow:',
    'default' => '## default
User-agent: *
Disallow: /'
);

// Check for SSL (isset avoids a notice when HTTPS is not set)
$s = '';
if (isset($_SERVER['HTTPS']) && strtolower($_SERVER['HTTPS']) === 'on') { $s = 's'; }

// Concatenate index key (host names are case-insensitive, so lowercase it)
$host = isset($_SERVER['HTTP_HOST']) ? strtolower($_SERVER['HTTP_HOST']) : '';
$match = 'http'.$s.'://'.$host;

// Set default value
$robotstxt = $robots['default'];

// Check for better value
if (!empty($robots[$match])) {
    $robotstxt = $robots[$match];
}

// Get an accurate Last-Modified time
$file_lastmod = getlastmod();
$header_lastmod = gmdate("D, d M Y H:i:s", $file_lastmod);

// Send headers
header('Last-Modified: '.$header_lastmod.' GMT');
header('Content-Type: text/plain; charset=UTF-8');

// Output the robots.txt
echo "## robots.txt for ".$match."\n\n";
echo $robotstxt;
?>
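If PHP isn't an option, a mod_rewrite-only variant of the same idea might do the job: leave the normal robots.txt in place for www and hand a second file to the edit host. A rough, untested sketch, assuming mod_rewrite is available and using robots-edit.txt as a made-up filename for a file that contains just "User-agent: *" and "Disallow: /":

```apache
# Assumption: mod_rewrite is available and .htaccess overrides are allowed
RewriteEngine On

# Only requests that arrived on the edit hostname are affected
RewriteCond %{HTTP_HOST} ^edit\.site\.com$ [NC]

# Internally serve the blocking file in place of the shared robots.txt
# (robots-edit.txt is a hypothetical name, not something from this thread)
RewriteRule ^robots\.txt$ robots-edit.txt [L]
```

Requests for www.site.com/robots.txt fall through and get the ordinary file, so nothing changes for the main site.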