
Is there a "best" robot txt to get all pages ranked?

What is the very best robots.txt coding to ensure the Google spider indexes all pages?

5:46 am on July 13, 2003 (gmt 0)

New User

10+ Year Member

joined:July 10, 2003
posts:4
votes: 0


Hello,

I'm an SEM newbie and wanted to know: what is the absolute best robots.txt coding one should use to ensure that Google's and the other major search engines' spiders index all of my site's pages?

I've seen the following used:

<meta content="index,follow" name="robots">

<meta name="robots" content="all">

<meta content="all" name="robots">

Which one is best? Or are they all equally good, and does it even matter which one I use?

Next, is there an even better one than these three? If so, what is it?

Lastly, are there any other key coding or files I need to ensure spiders index all of my site's pages? If so, please list them with enough detail on how to implement them for us newbies!

Many thanks,

Randall

5:54 am on July 13, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 8, 2003
posts:659
votes: 0


If you have a small site, you could just include an empty robots.txt file in the root directory.
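
A zero-byte file is enough, but if you prefer something explicit, the usual allow-all equivalent (just a sketch, nothing site-specific) is:

User-agent: *
Disallow:

A Disallow line with an empty value disallows nothing, so every well-behaved spider may crawl the whole site.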
6:01 am on July 13, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 5, 2002
posts:413
votes: 0


'empty' has always worked for me!
12:58 pm on July 13, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 14, 2002
posts:1192
votes: 0


Randall,

Welcome to WebmasterWorld!

Robots are built to index all your files; that is their task in life. They will index everything unless you forbid them.

The various tags you mention have only one useful task: to keep robots out. Since you want the robots in, these tags are useless to you.

The main value of having a totally empty robots.txt file, rather than simply not having one, is to prevent error messages in your logs. Every well-behaved robot begins each session by requesting the robots.txt file, and if it is not there you get an error message.

> Lastly, is there any other key coding/files I need to insure spiders index all of my sites page?

Clear navigation helps. If the user has to click more than a few times (around three) to get from the index page to a given internal page, that page has a lesser chance of being indexed. In their Webmaster Guidelines [google.com], Google suggests:

Offer a site map to your users with links that point to the important parts of your site. If the site map is larger than 100 or so links, you may want to break the site map into separate pages.
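
A minimal HTML site map along those lines is just a page of plain links; the file names and page titles below are only placeholders, not anything your site must use:

<html>
<head><title>Site Map</title></head>
<body>
<h1>Site Map</h1>
<ul>
<li><a href="/index.html">Home</a></li>
<li><a href="/products.html">Products</a></li>
<li><a href="/articles.html">Articles</a></li>
<li><a href="/contact.html">Contact</a></li>
</ul>
</body>
</html>

Link to it from your index page, and every page it lists is then reachable within two clicks.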
1:13 pm on July 13, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Jan 27, 2003
posts:166
votes: 0


The lines you quote are not a robots.txt file but meta tags in the HTML head.

If you want robots to index everything, you do not need any <meta name="robots" ...> lines at all.

Regarding the robots.txt file (which is not part of your web pages but a separate file at your domain root): if you do not want to disallow spiders from anything, you could, e.g.,

a) put an empty robots.txt file (http://www.example.com/robots.txt) at your domain root

b) create a robots.txt file referencing a directory that does not exist, e.g.


User-agent: *
Disallow: /fictitious-directory/

Advantage of this: you can build on this syntax if you later do want to keep spiders from some parts of your site.
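
For instance, if you later decide to keep spiders out of some areas, you just add Disallow lines; the directory names here are only illustrative:

User-agent: *
Disallow: /cgi-bin/
Disallow: /temp/

Each Disallow line blocks one path prefix for all robots (the * user-agent line); everything not listed stays crawlable.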

The worst alternative is not having a robots.txt file at all, because:
1) the resulting errors swamp the real errors in your error log, and
2) some web hosts are set up to serve the default page (rather than an error page) in answer to a request for a file that does not exist. I have seen quite a few sites where a request for robots.txt returns the home page of the site; a search engine spider programmed in a non-robust way might choke on that.

 
