


How do I block a sub-domain?

How do I block a sub-domain if the sub-domain is a virtual directory?

   
12:26 pm on Jul 31, 2012 (gmt 0)



Hi,

Can I block a page in my sub-domain from the robots.txt file which is in the root of my main domain?

Example:
abc.example.com/preview.aspx

How do I block the above page/url from the root level?


This sub-domain is a virtual directory. Anything I add can only go in the root directory of my main domain.

Please advise, with examples.


Thanks a lot!
1:55 pm on Jul 31, 2012 (gmt 0)

g1smd (WebmasterWorld Senior Member)



Place the new robots.txt in your file system such that it will be accessible when
subdomain.example.com/robots.txt
is requested. This is the only way to do it.
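As a rough illustration of what that file could contain for the page in the question, here is a minimal Python sketch that checks a candidate rule set with the standard library's `urllib.robotparser`. The file contents, the `http://` scheme, the `default.aspx` URL, and the helper name `is_allowed` are assumptions made for the example; they are not from the original post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical contents of the file that would be served at
# http://abc.example.com/robots.txt (assumed for illustration).
SUBDOMAIN_ROBOTS_TXT = """\
User-agent: *
Disallow: /preview.aspx
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://abc.example.com/preview.aspx",
        "http://abc.example.com/default.aspx",
    ):
        print(url, "allowed" if is_allowed(SUBDOMAIN_ROBOTS_TXT, url) else "blocked")
```

Note that the parser only matches the path part of the URL against the rules. Which host the rules apply to is decided by where the file is served from, which is why it has to be reachable on the sub-domain itself.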
1:14 am on Aug 14, 2012 (gmt 0)

phranque (WebmasterWorld Administrator)



http://developers.google.com/webmasters/control-crawl-index/docs/robots_txt [developers.google.com]:
The robots.txt file must be in the top-level directory of the host, accessible through the appropriate protocol and port number.


It is not valid for other subdomains, protocols or port numbers. It is valid for all files in all subdirectories on the same host, protocol and port number.
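One way to picture the rule quoted above is to compare the protocol, host, and port of a robots.txt URL with those of a page URL. The sketch below does that in Python; the function names `robots_scope` and `covered_by`, the `http://` URLs, and the default-port table are assumptions for illustration, not part of Google's documentation.

```python
from urllib.parse import urlsplit

# Default ports assumed when a URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def robots_scope(url: str) -> tuple:
    """Return the (protocol, host, port) triple a robots.txt file is tied to."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def covered_by(robots_url: str, page_url: str) -> bool:
    """True if the robots.txt at robots_url governs page_url."""
    return robots_scope(robots_url) == robots_scope(page_url)


if __name__ == "__main__":
    page = "http://abc.example.com/preview.aspx"
    print(covered_by("http://example.com/robots.txt", page))      # main domain
    print(covered_by("http://abc.example.com/robots.txt", page))  # sub-domain
```

Under these assumptions, the main domain's file does not cover the sub-domain page, while the sub-domain's own file does.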
7:51 am on Aug 16, 2012 (gmt 0)



@phranque

Thank you so much for that, mate!

Best,
8:06 am on Aug 17, 2012 (gmt 0)

phranque (WebmasterWorld Administrator)



I forgot to include the reference from the actual Robots Exclusion Protocol - http://www.robotstxt.org/robotstxt.html [robotstxt.org]:
Where to put it
The short answer: in the top-level directory of your web server.
The longer answer: When a robot looks for the "/robots.txt" file for a URL, it strips the path component from the URL (everything from the first single slash), and puts "/robots.txt" in its place. For example, for "http://www.example.com/shop/index.html", it will remove the "/shop/index.html", and replace it with "/robots.txt", and will end up with "http://www.example.com/robots.txt". So, as a web site owner you need to put it in the right place on your web server for that resulting URL to work. Usually that is the same place where you put your web site's main "index.html" welcome page. Where exactly that is, and how to put the file there, depends on your web server software.
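The path-stripping step described in that quote can be sketched in a few lines of Python. This is only an approximation of what a crawler does, and the function name `robots_txt_url` is made up for the example; the second URL assumes the page from the original question is served over `http://`.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Replace the path (and any query or fragment) of page_url with /robots.txt."""
    parts = urlsplit(page_url)
    # Keep the protocol and host[:port]; drop everything from the first single slash.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("http://www.example.com/shop/index.html"))
    print(robots_txt_url("http://abc.example.com/preview.aspx"))
```

For the sub-domain page, the derived location is on abc.example.com, so a file in the root of the main domain is never consulted for it.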
10:38 am on Aug 17, 2012 (gmt 0)



@phranque

Once again, thanks :)