To use... or not to use? That is the question
| 1:23 am on Oct 30, 2003 (gmt 0)|
I allow all content on my site to be crawled. With that said, should I simply not use robots.txt, or have one with the following lines:
Would having the latter of the two improve the number of pages crawled by the spiders?
| 1:28 am on Oct 30, 2003 (gmt 0)|
Make it even easier and just throw in an empty robots.txt. No muss, no fuss, no confusion, no 404s.
| 1:40 am on Oct 30, 2003 (gmt 0)|
The robots.txt protocol only allows for "disallow:" statements (not allow: statements) and wildcards don't belong in the disallow.
What you need if you want to allow all spiders to roam your site without restriction is:
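For anyone who wants to check this locally, here is a minimal sketch. It assumes the conventional allow-all record (`User-agent: *` followed by an empty `Disallow:`) and uses Python's standard `urllib.robotparser`. The helper name `can_crawl`, the bot name `ExampleBot`, and the example.com URL are made up for illustration, and real spiders may parse things a little differently than this library does.

```python
from urllib.robotparser import RobotFileParser

# Conventional "allow everything" record: an empty Disallow value blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# For contrast: a single slash blocks the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def can_crawl(robots_txt, url, agent="ExampleBot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse in memory, no network access
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/any/page.html"
    print("allow-all file:", can_crawl(ALLOW_ALL, page))
    print("block-all file:", can_crawl(BLOCK_ALL, page))
```

The only difference between the two records is the slash after `Disallow:`, so it is worth double-checking before uploading.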
| 2:13 am on Oct 30, 2003 (gmt 0)|
The most logical approach is not to have one; the only function of that file is to disallow access (hence the syntax as explained by tedster).
The only downside is that you will get lots of 404s in your log files. Should you want to eliminate them then use an empty file as suggested by jimbeetle.
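To illustrate why the empty file is a safe way to get rid of the 404s, here is a rough sketch, again using Python's `urllib.robotparser` as a stand-in for a spider. The helper `allowed_everywhere` and the sample URLs are hypothetical. As far as this one parser is concerned, an empty file contains no rules, so nothing is restricted; other crawlers are assumed, not guaranteed, to behave the same way.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical sample URLs, only for illustration.
SAMPLE_URLS = [
    "http://www.example.com/",
    "http://www.example.com/products/widget.html",
    "http://www.example.com/cgi-bin/search?q=test",
]


def allowed_everywhere(robots_txt, urls=SAMPLE_URLS, agent="ExampleBot"):
    """Return True if every URL in `urls` is fetchable under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # an empty string yields no rules
    return all(parser.can_fetch(agent, url) for url in urls)


if __name__ == "__main__":
    print("empty file:      ", allowed_everywhere(""))
    print("explicit allow:  ", allowed_everywhere("User-agent: *\nDisallow:\n"))
```

Under this parser both versions give the same answer for every URL, so the choice between them comes down to whether you want the requests for robots.txt to show up as 404s in your logs.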
| 7:10 am on Oct 30, 2003 (gmt 0)|
Extended discussion here...
Google and having *no* robots.txt file
could this be hurting your site?
| 1:29 am on Nov 5, 2003 (gmt 0)|
I noticed more of my pages are getting indexed now that I'm using a robots.txt that allows everything to be crawled.
© Pubcon Inc. 1996-2012 all rights reserved