| How do I block a sub-domain? How do I block a sub-domain if the sub-domain is a virtual directory? |
shaunm

msg:4480534 | 12:26 pm on Jul 31, 2012 (gmt 0) | Hi, can I block a page on my sub-domain from the robots.txt file that sits in the root of my main domain? Example: abc.example.com/preview.aspx How do I block the above page/URL from the root level? This sub-domain is a virtual directory. Anything I add will only go in the root directory of my main domain. Please advise with examples. Thanks a lot!
|
g1smd

msg:4480579 | 1:55 pm on Jul 31, 2012 (gmt 0) | Place the new robots.txt in your file system such that it will be accessible when subdomain.example.com/robots.txt is requested. This is the only way to do it.
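As a rough illustration of what that sub-domain file could contain, here is a minimal sketch that checks a candidate rule set with Python's standard `urllib.robotparser`. The rule text in `ROBOTS_TXT` is an assumption based on the example URL in the question, not something posted in this thread, and it only has an effect if it is actually served at abc.example.com/robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical contents for the file served at abc.example.com/robots.txt.
# The path comes from the example URL in the original question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /preview.aspx
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The rule blocks the preview page and leaves other paths crawlable.
blocked = parser.can_fetch("*", "http://abc.example.com/preview.aspx")
allowed = parser.can_fetch("*", "http://abc.example.com/index.aspx")
print(blocked, allowed)
```

Note that `can_fetch` only matches the path against the parsed rules; it does not know which host the rules were fetched from, so the host scoping still depends on where the file is served.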
|
phranque

msg:4484456 | 1:14 am on Aug 14, 2012 (gmt 0) | http://developers.google.com/webmasters/control-crawl-index/docs/robots_txt [developers.google.com]: | The robots.txt file must be in the top-level directory of the host, accessible through the appropriate protocol and port number. |
| | It is not valid for other subdomains, protocols or port numbers. It is valid for all files in all subdirectories on the same host, protocol and port number. |
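The scoping rule quoted above can be sketched as a comparison of protocol, host and port. This is only an illustration: the helper names `robots_scope` and `covered_by` are made up for this example, and the default-port table is an assumption covering just http and https.

```python
from urllib.parse import urlsplit

# Assumed default ports, used when a URL does not spell the port out.
DEFAULT_PORTS = {"http": 80, "https": 443}

def robots_scope(url):
    """Return the (protocol, host, port) triple a robots.txt file applies to."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def covered_by(robots_txt_url, page_url):
    """True if the page shares protocol, host and port with the robots.txt file."""
    return robots_scope(robots_txt_url) == robots_scope(page_url)

page = "http://abc.example.com/preview.aspx"
print(covered_by("http://example.com/robots.txt", page))      # main domain file
print(covered_by("http://abc.example.com/robots.txt", page))  # sub-domain file
```

Under these assumptions the main-domain file does not cover the sub-domain page, while the file on abc.example.com does.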
|
|
shaunm

msg:4485359 | 7:51 am on Aug 16, 2012 (gmt 0) | @phranque Thank you so much for that, mate! Best,
|
phranque

msg:4485793 | 8:06 am on Aug 17, 2012 (gmt 0) | I forgot to include the reference from the actual Robots Exclusion Protocol - http://www.robotstxt.org/robotstxt.html [robotstxt.org]: Where to put it The short answer: in the top-level directory of your web server. The longer answer: When a robot looks for the "/robots.txt" file for a URL, it strips the path component from the URL (everything from the first single slash), and puts "/robots.txt" in its place. For example, for "http://www.example.com/shop/index.html", it will remove the "/shop/index.html", and replace it with "/robots.txt", and will end up with "http://www.example.com/robots.txt". So, as a web site owner you need to put it in the right place on your web server for that resulting URL to work. Usually that is the same place where you put your web site's main "index.html" welcome page. Where exactly that is, and how to put the file there, depends on your web server software. |
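The path-stripping step described in that passage can be sketched in a few lines of Python. The function name `robots_url` is made up for this illustration; real crawlers may handle edge cases such as redirects differently.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Replace the path (and any query or fragment) of a URL with /robots.txt."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://www.example.com/shop/index.html"))
# http://www.example.com/robots.txt
print(robots_url("http://abc.example.com/preview.aspx"))
# http://abc.example.com/robots.txt
```

The second call shows why the file in the main domain's root is never consulted for the sub-domain page: the derived URL keeps the sub-domain's host name.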
|
|
shaunm

msg:4485831 | 10:38 am on Aug 17, 2012 (gmt 0) | @phranque Once again, thanks :)
|
|
|