Forum Moderators: goodroi
Why are you redirecting from www.example.com ?
Many people redirect from www.example.com/index.html to www.example.com/, and it is good practice to have all internal links point to '/' or to www.example.com/ rather than to index.html.
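For reference, that canonical redirect is commonly done with mod_rewrite; a minimal sketch (assuming mod_rewrite is enabled, and matching against THE_REQUEST so the rule fires only on direct client requests for /index.html):

```apache
RewriteEngine On
# Externally redirect direct client requests for /index.html to the root URL.
# THE_REQUEST holds the original request line, so this does not loop when
# the server later maps "/" back to the index file internally.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.html[\ ?]
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]
```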
But I don't think that was your question, was it?
If you want to prevent pages being indexed, you can use the noindex meta tag, direct on each page
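For reference, the tag goes in each page's head section; and where editing every page is impractical, mod_headers (if available on the server) can send the equivalent X-Robots-Tag response header for a whole directory from that directory's .htaccess file (a sketch, assuming mod_headers is loaded):

```apache
# Per-page equivalent:  <meta name="robots" content="noindex">
#
# Directory-wide alternative: place this line in an .htaccess file inside
# the directory you want kept out of the index (requires mod_headers)
Header set X-Robots-Tag "noindex"
```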
I think I need to use .htaccess, robots.txt, or some other method to stop the redirect and prevent certain directories from being indexed. There are 100+ pages in that directory, and it would be a real hassle to edit every page to add noindex meta tags.
Here's my .htaccess:
RewriteEngine On
RewriteBase /
#
# Point the domain site1.com to the folder /index/
RewriteCond %{HTTP_HOST} ^(www\.)?site1\.com [NC]
RewriteCond %{REQUEST_URI} !index/
RewriteRule ^(.*)$ http://www.site1.com/index/$1 [R=301,L]
#
# Point the domain site2.com to the folder /home/
RewriteCond %{HTTP_HOST} ^(www\.)?site2\.com [NC]
RewriteCond %{REQUEST_URI} !home/
RewriteRule ^(.*)$ http://www.site2.com/home/$1 [R=301,L]
#
# Point the domain site3.com to the folder /page/
RewriteCond %{HTTP_HOST} ^(www\.)?site3\.com [NC]
RewriteCond %{REQUEST_URI} !page/
RewriteRule ^(.*)$ http://www.site3.com/page/$1 [R=301,L]
The proper approach is to internally rewrite these requests, so that the /index, /home, and /page subdirectories remain invisible to the client (and the user), and to link to the pages within your three sites as if they were all in the root directory:
RewriteEngine On
#
# Redirect to canonical www subdomains
#
# If requested hostname contains "site1.com"
RewriteCond %{HTTP_HOST} site1\.com [NC]
# but is not exactly "www.site1.com"
RewriteCond %{HTTP_HOST} !^www\.site1\.com$
# then externally redirect to canonical hostname
RewriteRule (.*) http://www.site1.com/$1 [R=301,L]
#
RewriteCond %{HTTP_HOST} site2\.com [NC]
RewriteCond %{HTTP_HOST} !^www\.site2\.com$
RewriteRule (.*) http://www.site2.com/$1 [R=301,L]
#
RewriteCond %{HTTP_HOST} site3\.com [NC]
RewriteCond %{HTTP_HOST} !^www\.site3\.com$
RewriteRule (.*) http://www.site3.com/$1 [R=301,L]
#
# Internally rewrite requests for site1 to subdirectory /index/
RewriteCond %{HTTP_HOST} ^www\.site1\.com$
RewriteCond $1 !^index/
RewriteRule (.*) /index/$1 [L]
#
# Internally rewrite requests for site2 to subdirectory /home
RewriteCond %{HTTP_HOST} ^www\.site2\.com$
RewriteCond $1 !^home/
RewriteRule (.*) /home/$1 [L]
#
# Internally rewrite requests for site3 to subdirectory /page
RewriteCond %{HTTP_HOST} ^www\.site3\.com$
RewriteCond $1 !^page/
RewriteRule (.*) /page/$1 [L]
You should then be able to put a robots.txt file into each site's subdirectory, and Google and the other search engines will be able to fetch robots.txt for each domain normally.
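With the internal rewrites above, a request for each domain's /robots.txt is mapped into that domain's subdirectory, so the layout would look like this (assuming the /index, /home, and /page folders used in the rules above):

```
/index/robots.txt   served for http://www.site1.com/robots.txt
/home/robots.txt    served for http://www.site2.com/robots.txt
/page/robots.txt    served for http://www.site3.com/robots.txt
```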
The corrections above also fix a bad error: The URL-path "seen" by RewriteRule in .htaccess never starts with a slash, but the URL-path seen by RewriteCond %{REQUEST_URI} always starts with a slash. Therefore, the code you posted may have caused redirection looping, until the browser or server reached its maximum redirection limit.
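As a hypothetical illustration of that mismatch, suppose the exclusion had been anchored as !^index/ and tested against %{REQUEST_URI}, and a client requested http://www.site1.com/index/page.html:

```apache
#   %{REQUEST_URI}        = "/index/page.html"   (leading slash always present)
#   RewriteRule URL-path  = "index/page.html"    (leading slash never present
#                                                 in per-directory .htaccess)
#
# The condition  !^index/  would then never fail: "/index/page.html" does
# not start with "index/", so the negated condition stays true, the rule
# redirects the already-redirected URL again, and the cycle repeats until
# the browser or server reaches its redirection limit.
```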
Jim
I use what is known as 'reseller' hosting which allows me to host multiple domains on one account. Each domain is stored in folders like 'index' and 'home' and 'page', but the live site doesn't reference those names; the actual domain name is pointed to the folder, not to the area above the folder.
It is very odd to need to use a folder within a domain name.
It's so odd that I suspect you have misunderstood how your hosting works. Do you have these sites live, or are you still working on them?
If they are live and this is indeed how it works, I would suggest you consult your host or their help files; I am sure this is a standard question they have to deal with, as it is a flaw in their hosting.
Here is the syntax I have on a site I am redirecting to another domain, while still letting the bots fetch robots.txt:
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteRule ^(.*)$ http://www.site.com/folder/ [R=301,L]
Hope it helps
Lea
In some ways, it's superior to the standard Control-Panel method of putting each "add-on" domain into a fixed subdirectory, because it allows the hosting client to easily share files (e.g. scripts) between the domains, unlike the control panel method, which makes this difficult or impossible.
The main problem here is confusion between URLs and files (and their relationship, which is "associative" and not fixed), and between external redirects and internal rewrites.
We've covered these subjects thoroughly in the Apache forum and the Apache section of the WebmasterWorld Library, so I won't repeat all that here.
Jim