

Controlling a Subdomain with robots.txt

robots.txt for multiple subdomains

     

ivanvias

8:13 pm on Jan 14, 2011 (gmt 0)

5+ Year Member



Hi,

I have some rewrite rules that create virtual subdomains, and they are working fine.

I have a robots.txt file on the main domain that I would also like to work for every subdomain.


Can anyone assist?

Would this work:

.htaccess

# Uncomment the following line if rewrites are not already enabled
# RewriteEngine on

# Use a special robots.txt file for the subdomain
RewriteCond %{HTTP_HOST} ^*.example.com$
RewriteRule robots\.txt robots.txt [L]

?

g1smd

9:03 pm on Jan 14, 2011 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



In order to work for any particular hostname, the robots.txt file must appear at subdomain.example.com/robots.txt when accessed from the web.

The code above rewrites a request to itself in an infinite loop. Actually, it would never work at all, as the ^*. syntax is invalid.

In a rewrite, the pattern should match the path part of the incoming URL request, and the target should be the physical server path and filename where that content resides.
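
For illustration, a rule following that principle might look like this (the /shared/ directory name here is purely hypothetical):

```apache
# Pattern matches the URL path of the incoming request;
# the target is the physical location where the content resides.
# "/shared/" is a hypothetical directory used only for illustration.
RewriteRule ^robots\.txt$ /shared/robots.txt [L]
```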

wilderness

9:22 pm on Jan 14, 2011 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



g1smd,
I've explained previously that these regexes are not my forte, and the longer I go without using them, the less I understand them.

Would something like this work?
With a robots.txt file previously installed in each subdomain directory:

RewriteCond %{HTTP_HOST} (.+[^/])/(.+[^/]).example.com
RewriteCond %{HTTP_HOST} !(.+[^/])/(.+[^/]).example.com
RewriteRule robots\.txt [(.+[^...] [L]

Thanks in advance.

Don

g1smd

11:51 pm on Jan 14, 2011 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



The question is still a little unclear, but I am guessing you want an incoming request for foo.example.com/robots.txt to be served by the file located at /foo/robots.txt inside the server - where "foo" matches both the sub-domain name and its respective folder - one folder for each sub-domain.

This is different to the original question where I believe you said there would be just one single robots.txt file for all of the sub-domains. Please clarify that, as it makes a vast difference. In phrasing the question, note that URLs used "out there on the web", and filepaths used "inside the server" are not at all the same thing. They are "related" by the actions of the server and its configuration.
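
If the per-folder arrangement is what was intended, one way it might be sketched (assuming one folder per sub-domain at the server root, and that %1 carries the sub-domain name captured from the Host header) is:

```apache
# Serve foo.example.com/robots.txt from the physical file /foo/robots.txt
# - a sketch only, assuming one folder per sub-domain at the server root
RewriteCond %{HTTP_HOST} !^(www\.)?example\.com$ [NC]
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteRule ^robots\.txt$ /%1/robots.txt [L]
```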

ivanvias

2:28 am on Jan 15, 2011 (gmt 0)

5+ Year Member



Yes, there is one single robots.txt that's already there that I want to use for all the subdomains.

tangor

2:36 am on Jan 15, 2011 (gmt 0)

WebmasterWorld Senior Member tangor is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



robots.txt is not a panacea... a robots.txt that addresses subdomains will also have an impact on the top domain, and robots.txt files inside subdomains tend to give less than optimal results. Perhaps .htaccess is the better place to address any issues regarding SE crawls of the website in general, and subdomains in particular?
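
As a sketch of that idea, access could be restricted at the server level rather than by robots.txt ("SomeBot" and "sub" are placeholder names, not real identifiers, and well-behaved crawlers should still be offered a robots.txt):

```apache
# Return 403 Forbidden to one crawler on one sub-domain,
# enforced by the server instead of relying on robots.txt compliance.
# "SomeBot" and "sub" are placeholders used only for illustration.
RewriteCond %{HTTP_USER_AGENT} SomeBot [NC]
RewriteCond %{HTTP_HOST} ^sub\.example\.com$ [NC]
RewriteRule .* - [F]
```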

jdMorgan

4:29 pm on Jan 16, 2011 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



If you have code that rewrites subdomain requests to subdirectories to implement "multiple subdomains on one server," and you wish to use one single/common robots.txt file for all domains and subdomains, then the answer would be to exclude requests for robots.txt from being rewritten to the subdomain subdirectories.

In other words, change the subdomain rewrite code (which was not posted) from something like

RewriteCond $1 !^subdomain-directories/
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteRule ^(.*)$ /subdomain-directories/%1/$1

to something like

RewriteCond $1 !^(robots\.txt$|subdomain-directories/)
RewriteCond %{HTTP_HOST} !^www\.example\.com
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com
RewriteRule ^(.*)$ /subdomain-directories/%1/$1

to exclude any robots.txt requests from being rewritten to the subdomain-specific subdirectories.

Note that you could indeed use an "exclusion rule" above this subdomain-to-subdirectory rewrite if that is what your code was intended to implement. In that case, the proper syntax would have been:

RewriteRule ^robots\.txt$ - [L]

to specify "Do nothing, just quit here if robots.txt is requested."
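
Putting the pieces together, a complete .htaccess sketch along these lines (the subdirectory name is a placeholder standing in for whatever the site actually uses) might read:

```apache
RewriteEngine on

# Do nothing, just quit here, if robots.txt is requested -
# the shared file in the document root is then served for every hostname
RewriteRule ^robots\.txt$ - [L]

# Rewrite foo.example.com/anything to /subdomain-directories/foo/anything
RewriteCond $1 !^subdomain-directories/
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com [NC]
RewriteRule ^(.*)$ /subdomain-directories/%1/$1 [L]
```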

Jim
 
