
Forum Moderators: goodroi


Should this file be "Public"?

Is it good practice to have the robots.txt file.......

     

Propools

4:04 pm on Sep 25, 2007 (gmt 0)

10+ Year Member



Is it good practice to have the robots.txt file available for everyone on the web to see?
If it is public, less scrupulous people can see which directories, if any, you don't want the robots to crawl.

If it's possible to only allow robots to see this file, then what is the best method for doing this? :)

Matt Probert

5:22 pm on Sep 25, 2007 (gmt 0)

WebmasterWorld Senior Member, 10+ Year Member



You seem to misunderstand the "robots.txt" file. It is a purely voluntary request to robots. Many robots ignore it, and hackers certainly aren't going to worry about it. If a directory has links into it, it will be found by anyone who wants to find it, irrespective of any robots.txt file.
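
To illustrate the "voluntary" part: a well-behaved robot has to check robots.txt itself before fetching a URL, and nothing on the server enforces the answer. Here is a minimal sketch in Python using the standard library's urllib.robotparser. The rules, the "ExampleBot" user agent and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler would skip the first URL; a rude one simply never asks.
    print(allowed_by_robots(ROBOTS_TXT, "ExampleBot", "https://example.com/private/page.html"))
    print(allowed_by_robots(ROBOTS_TXT, "ExampleBot", "https://example.com/index.html"))
```

The check happens entirely on the robot's side, so a client that never runs it is not slowed down at all.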

Matt

Propools

5:24 pm on Sep 25, 2007 (gmt 0)

10+ Year Member



No, I understand that it's a voluntary file. I would just like to voluntarily put it out there but also be able to limit who sees it. ;)

goodroi

11:58 am on Sep 26, 2007 (gmt 0)

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



You can use IP delivery, aka cloaking, to serve the robots.txt to bots coming from Google/Yahoo/MSN IP addresses and show all other IP addresses a 404 error. This is a little tricky, since the IP addresses search engines use change over time and you need to keep your list up to date.
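
As a rough sketch of the IP delivery idea, the decision could look something like this in Python. The network ranges below are placeholder documentation ranges, not real search engine addresses, and the function and constant names are made up for illustration. A real setup would need an up-to-date list and would be wired into whatever serves your site.

```python
import ipaddress
from typing import Tuple

# ASSUMPTION: placeholder ranges reserved for documentation (RFC 5737).
# These are NOT real crawler ranges; a real list must be maintained by hand.
CRAWLER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Hypothetical robots.txt body.
ROBOTS_BODY = "User-agent: *\nDisallow: /private/\n"


def robots_response(client_ip: str) -> Tuple[int, str]:
    """Return (status, body) for a robots.txt request from client_ip."""
    addr = ipaddress.ip_address(client_ip)
    if any(addr in network for network in CRAWLER_NETWORKS):
        # Known crawler address: serve the real file.
        return 200, ROBOTS_BODY
    # Everyone else is told the file does not exist.
    return 404, ""


if __name__ == "__main__":
    print(robots_response("192.0.2.10"))   # address inside a listed range
    print(robots_response("203.0.113.5"))  # address outside every listed range
```

The weak point is the list itself: if a search engine starts crawling from a range you have not added, it gets the 404 too.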

A simpler solution, which I prefer myself, is to use .htaccess to block sensitive areas of my website. I use robots.txt more to deal with duplicate content issues.

You can also make a bot trap. Add a folder to the robots.txt file and do not list or link to it anywhere else. Then wait and see what hits that folder, and ban that IP address. Since the only way to find that folder is from robots.txt, you know it is a misbehaving bot or a hacker - either way, you don't want it on your site.
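
One way to do the "wait and see what hits that folder" step is to scan the access log. Below is a minimal sketch in Python, assuming log lines in the common Apache format and a made-up trap folder called /bot-trap/. Both are assumptions, so adjust the pattern and the path to your own setup. Banning the addresses it returns is left to your server config.

```python
import re

# ASSUMPTION: lines look like the Apache common/combined log format:
# ip ident user [timestamp] "METHOD /path PROTOCOL" status size ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)')


def trap_hits(log_lines, trap_prefix="/bot-trap/"):
    """Return the set of client IPs that requested anything under trap_prefix."""
    offenders = set()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2).startswith(trap_prefix):
            offenders.add(match.group(1))
    return offenders


# Made-up sample log lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.7 - - [26/Sep/2007:11:58:00 +0000] "GET /robots.txt HTTP/1.1" 200 64',
    '203.0.113.7 - - [26/Sep/2007:11:58:05 +0000] "GET /bot-trap/index.html HTTP/1.1" 200 512',
    '198.51.100.4 - - [26/Sep/2007:11:59:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
]

if __name__ == "__main__":
    print(trap_hits(SAMPLE_LOG))
```

Only the first address ends up in the result, because it is the only one that requested something under the trap folder.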

 
