Forum Moderators: phranque
The problem is a site that has had both the www and non-www versions spidered and added to the indexes. Given that search engines' reactions to 301s, 404s and 410s are many and varied, would this do the trick better?
RewriteCond %{HTTP_HOST} ^example\.com
RewriteCond %{REQUEST_URI} ^/robots\.txt
RewriteRule ^robots\.txt /goaway.txt [L]
Where goaway.txt would contain:
User-agent: *
Disallow: /
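Putting the pieces together, a minimal .htaccess for that approach might look like this (a sketch only: RewriteEngine On added, example.com standing in for the real domain, and the 301 fallback is one way to handle the remaining non-www requests):

RewriteEngine On
# On the non-www host, answer robots.txt with the blocking file
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^robots\.txt$ /goaway.txt [L]
# Everything else on the non-www host gets a 301 to the www version
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

The idea being that a spider asking for robots.txt on the wrong hostname sees a full Disallow, while any other URL on that hostname redirects to the canonical host.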
I always set up the non-www version as a separate virtual server in Apache and put a .htaccess file in its document root containing:
Redirect 301 / [yourdomain.co.uk...]
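For reference, that two-virtual-host setup might be sketched like this in httpd.conf (hostnames are placeholders; the target URL in the post above was truncated by the forum software):

<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot /var/www/example
</VirtualHost>

<VirtualHost *:80>
    # Catch the bare domain and bounce everything to the www host
    ServerName example.com
    Redirect 301 / http://www.example.com/
</VirtualHost>

With mod_alias's Redirect, the rest of the request path is appended automatically, so /page.html on the bare domain goes to http://www.example.com/page.html.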
RewriteCond %{HTTP_HOST} ^example\.com
RewriteCond %{REQUEST_URI} ^/robots\.txt
RewriteCond %{HTTP_USER_AGENT} Slurp [NC]
RewriteRule ^robots\.txt /slurp_robots.txt [L]
RewriteCond %{HTTP_HOST} ^example\.com
RewriteCond %{REQUEST_URI} ^/robots\.txt
RewriteCond %{HTTP_USER_AGENT} (Googlebot|msnbot|Slurp|Teoma)
RewriteRule ^robots\.txt /%1_robots.txt [L]
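For anyone following along: %1 is the backreference to the parenthesised group in the last matched RewriteCond, so each bot is handed its own file. Assuming the pattern above, you would need files in the document root named after the capture (case as it appears in the User-Agent header):

Googlebot_robots.txt
msnbot_robots.txt
Slurp_robots.txt
Teoma_robots.txt

Each of which could contain just:

User-agent: *
Disallow: /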
Jim