Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

How do I block a sub-domain?
How do I block a sub-domain if the sub-domain is a virtual directory?

 12:26 pm on Jul 31, 2012 (gmt 0)


Can I block a page in my sub-domain from the robots.txt file which is in the root of my main domain?


How do I block the above page/url from the root level?

This sub-domain is a virtual directory. If I am going to add anything it will only be on the root directory of my main domain.

Please advise with examples.

Thanks a lot!



 1:55 pm on Jul 31, 2012 (gmt 0)

The robots.txt in the root of your main domain does not apply to the sub-domain. Place a separate robots.txt in your file system so that it is served when
subdomain.example.com/robots.txt is requested. This is the only way to do it.
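
Since examples were requested, here is a minimal sketch of what that file could contain and how its rules would be read. The path /private-page.html is a hypothetical stand-in (the actual URL is not shown in this thread), and Python's standard urllib.robotparser is used only as a convenient way to check the rules, not as a model of any particular search engine's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical body of the file served at
# http://subdomain.example.com/robots.txt (the path below is an assumption).
SUBDOMAIN_ROBOTS_TXT = """\
User-agent: *
Disallow: /private-page.html
"""


def is_allowed(robots_body: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given rules permit user_agent to fetch url.

    RobotFileParser matches on the path of the URL only; which host the
    rules belong to is decided by where the robots.txt was fetched from.
    """
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for page in ("/private-page.html", "/index.html"):
        url = "http://subdomain.example.com" + page
        print(url, "allowed:", is_allowed(SUBDOMAIN_ROBOTS_TXT, url))
```

How the file gets mapped to that URL depends on your web server and on how the virtual directory is configured; the sketch only covers the contents of the file.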

 1:14 am on Aug 14, 2012 (gmt 0)

http://developers.google.com/webmasters/control-crawl-index/docs/robots_txt:
The robots.txt file must be in the top-level directory of the host, accessible through the appropriate protocol and port number.

It is not valid for other subdomains, protocols or port numbers. It is valid for all files in all subdirectories on the same host, protocol and port number.
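
The scoping rule quoted above can be sketched as a simple comparison of protocol, host, and port. This is only an illustration under stated assumptions: the function names are made up for this post, and treating a missing port as 80 for http and 443 for https is an assumption about default ports rather than something the quoted text spells out.

```python
from urllib.parse import urlsplit

# Assumed default ports, used when a URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def scope(url: str) -> tuple:
    """Return the (protocol, host, port) triple that a URL belongs to."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def robots_txt_covers(robots_url: str, page_url: str) -> bool:
    """True if a robots.txt at robots_url is valid for page_url."""
    return scope(robots_url) == scope(page_url)


if __name__ == "__main__":
    root_robots = "http://example.com/robots.txt"
    print(robots_txt_covers(root_robots, "http://example.com/shop/index.html"))
    print(robots_txt_covers(root_robots, "http://subdomain.example.com/page.html"))
```

Under this rule the root domain's file covers pages on example.com but not pages on subdomain.example.com, which is why the sub-domain needs its own file.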


 7:51 am on Aug 16, 2012 (gmt 0)


Thank you so much for that, mate!



 8:06 am on Aug 17, 2012 (gmt 0)

I forgot to include the reference from the actual Robots Exclusion Protocol, http://www.robotstxt.org/robotstxt.html:
Where to put it
The short answer: in the top-level directory of your web server.
The longer answer: When a robot looks for the "/robots.txt" file for a URL, it strips the path component from the URL (everything from the first single slash), and puts "/robots.txt" in its place. For example, for "http://www.example.com/shop/index.html", it will remove the "/shop/index.html", and replace it with "/robots.txt", and will end up with "http://www.example.com/robots.txt". So, as a web site owner you need to put it in the right place on your web server for that resulting URL to work. Usually that is the same place where you put your web site's main "index.html" welcome page. Where exactly that is, and how to put the file there, depends on your web server software.
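
The lookup described in that passage can be sketched in a few lines. The function name robots_txt_url is hypothetical, and this is just one way to express "strip the path and put /robots.txt in its place" with Python's standard urllib.parse, not how any specific robot is implemented.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Keep the protocol and host (with any port) of page_url and
    replace the path, query, and fragment with /robots.txt."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_txt_url("http://www.example.com/shop/index.html"))
    print(robots_txt_url("http://subdomain.example.com/page.html"))
```

For a page on the sub-domain the result points at the sub-domain's own /robots.txt, never at the file on the main domain.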


 10:38 am on Aug 17, 2012 (gmt 0)


Once again, thanks :)

WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved