Creating a robots.txt File

How Do I Create A Correct robots.txt File?

6:33 am on Feb 14, 2005 (gmt 0)

joined:Feb 12, 2005
Hey Everyone,

I am new to this site. Just wanted to say hello and that this is an excellent site! I also have several questions and was hoping to get some feedback on them from some of the experienced webmasters in here.

1. How do I create a correct robots.txt file? Which websites or discussions would help me learn as much as possible about this broad topic and others associated with it?

2. What information resources (i.e. websites and books) can I use for reference for becoming a better webmaster?

3. Which websites are the best for researching information on SEO? I have purchased a program for this from my hosting provider, called Traffic Blazer, and wanted to know what else I can do beyond it, since using more tools might help raise the rankings I am interested in.

4. Which websites are the best for cross-referencing "visitors" and their IP Addresses to identify those who are doing all of the nefarious things many here say they are doing when visiting a website one has designed?

3:48 pm on Feb 14, 2005 (gmt 0)

4:27 pm on Feb 14, 2005 (gmt 0)

joined:Feb 14, 2005
As this is a robots.txt forum, you might get more joy in some of the other forums for your other questions. Briefly:

1. A robots.txt file is just a simple text file you can create in Notepad (called robots.txt, of course) and upload to the root of your site. The majority of search engine spiders request this file to see where they are allowed on the site. For the syntax to use, either look through the archives of this forum, or check out The Web Robots Pages (at robotstxt.org).
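To make the "root of your site" point concrete, here is a minimal Python sketch of how a spider could work out where to look for the file from any page address. The function name robots_url and the example.com address are my own placeholders for illustration, not anything a particular search engine publishes.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that hosts page_url.

    Hypothetical helper: it keeps the scheme and host, and replaces the
    path, query string and fragment with /robots.txt at the site root.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page deep inside the site still maps to the file at the root.
print(robots_url("http://www.example.com/shop/item.html?id=3"))
```

So a robots.txt sitting in a subdirectory such as /shop/ would simply never be requested under this scheme.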

What you put in it depends on what you are trying to achieve. A couple of basic ones are:

To exclude all robots from the entire server:

User-agent: *
Disallow: /

(i.e. '*' is a wildcard match for all robots, and '/' means the root directory, and consequently every subdirectory of the site.)

To allow all robots complete access:

User-agent: *
Disallow:

(i.e. again, '*' is a wildcard match for all robots, and no directories are disallowed. If you just want a robots.txt file for the sake of having one, this is what to put in it.)
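If you want to check how a set of rules will be read before uploading it, one option is Python's standard urllib.robotparser module. The sketch below is only a rough illustration: the allowed helper, the "ExampleBot" user agent and the example.com address are made up for the demonstration, and real spiders may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Report whether `agent` may fetch `url` under the given rules.

    Hypothetical helper: the rules are passed in as a string, so nothing
    is downloaded from the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Rules that exclude all robots from the entire server.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Rules that allow all robots complete access (empty Disallow value).
ALLOW_ALL = "User-agent: *\nDisallow:\n"

page = "http://www.example.com/page.html"
print(allowed(BLOCK_ALL, page))  # False
print(allowed(ALLOW_ALL, page))  # True
```

An empty Disallow value is what tells the parser that nothing is off limits.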

2. There have been a number of threads on this topic in the past. Your best bet is to search for something like 'best books' using the WebmasterWorld site search (link at the top of the page). For the record, I think every webmaster should read Steve Krug's Don't Make Me Think and 37signals' Defensive Design for the Web.

3. For the newbie, you're probably at the best place you're going to find already. While I'm not going to knock Traffic Blazer, as I've never used it, the advice you'll get here is to be very careful if you use any sort of automated optimization or submission tools. There are some things you should be doing manually (optimization) and there are some things that aren't going to do you any good at all (repeated SE submission).

4. Nefarious things? Such as? Don't worry about it. Really. You're only likely to become a target if you're a player in a competitive industry. If you do become a target, checking IP addresses is unlikely to do you any good. ;)