| Robots.txt & How your site is spidered Positives of Robots.txt file aside from disallowing bots |
floridadesigns

msg:1529575 | 3:38 pm on Jun 14, 2003 (gmt 0) | I have heard that most people use robots.txt just to disallow specific spiders/bots, and all the information about the robots.txt file always talks about disallow, i.e.:

    User-agent: grub
    Disallow: /

But I wasn't sure how to use it to help your site. Maybe to direct the spiders to specific pages, or even use it somehow as a site map/link list of all your pages? What are the benefits of the robots.txt file aside from banning certain spiders, and how would you set it up? Thanks a lot for any help, just trying to clear things up a bit :)
|
jimbeetle

msg:1529576 | 9:56 pm on Jun 14, 2003 (gmt 0) | Hi floridadesigns, There's not much else you can do with robots.txt except disallow a bot, or implicitly allow a bot by not disallowing it. The following allows googlebot and disallows all others:

    User-agent: googlebot
    Disallow:

    User-agent: *
    Disallow: /

You can find all you need to know about robots.txt at robotstxt.org. Jim
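One way to sanity-check a file like the googlebot-only example above is to feed it to a parser and ask what each bot may fetch. This is only a rough sketch using Python's standard `urllib.robotparser`; the helper name `bot_may_fetch` and the test path `/index.html` are made up for illustration, and real crawlers may interpret edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# The example from the post: googlebot allowed everywhere, everyone else banned.
GOOGLEBOT_ONLY = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /
"""


def bot_may_fetch(agent, path, robots_txt=GOOGLEBOT_ONLY):
    """Return True if `agent` may fetch `path` under the given robots.txt text.

    Hypothetical helper for illustration; it parses the text in memory
    instead of downloading a live robots.txt file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for bot in ("Googlebot", "grub"):
        print(bot, bot_may_fetch(bot, "/index.html"))
```

An empty `Disallow:` line means "nothing is disallowed", which is why the googlebot record acts as an allow rule even though the original standard has no explicit allow directive.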
|
floridadesigns

msg:1529577 | 11:15 pm on Jun 14, 2003 (gmt 0) | OK, good. Thanks for clearing that up for me. That is what I always thought, but a lot of people and websites make it seem like its benefits extend much further. If there is anything else useful I should know, please let me know :)
|
jdMorgan

msg:1529578 | 11:31 pm on Jun 14, 2003 (gmt 0) | Robots.txt is useful to disallow specific robots from specific pages/scripts/images, etc. Like your meta-description (in some engines), it gives you additional control over the "presentation" of your site, allows you to prevent visitors from entering your site on some random page, etc. Disallowing pest 'bots is a secondary function - although it is a much-discussed subject. A better example robots.txt might be:

    User-agent: Googlebot
    User-agent: Slurp
    User-agent: Scooter
    Disallow: /shopping_cart/
    Disallow: /cgi-bin/
    Disallow: /mail/e-mailform.html

Ref: A Standard for Robots Exclusion [robotstxt.org] HTH, Jim
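To see what the per-directory example above actually blocks, it can be run through Python's standard `urllib.robotparser`. Treat this as a sketch under assumptions: the helper name `may_crawl` and the sample paths (`/index.html`, `/cgi-bin/search.pl`) are hypothetical, and the result reflects this parser's reading of the file rather than the behavior of any particular search engine.

```python
from urllib.robotparser import RobotFileParser

# One record shared by three named bots; bots not listed are unrestricted.
SELECTIVE_RULES = """\
User-agent: Googlebot
User-agent: Slurp
User-agent: Scooter
Disallow: /shopping_cart/
Disallow: /cgi-bin/
Disallow: /mail/e-mailform.html
"""


def may_crawl(agent, path, robots_txt=SELECTIVE_RULES):
    """Return True if `agent` may fetch `path` under the given robots.txt text.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    samples = ["/index.html", "/cgi-bin/search.pl", "/mail/e-mailform.html"]
    for path in samples:
        print(path, may_crawl("Slurp", path))
```

Note that a bot with no matching record and no `User-agent: *` fallback is treated as allowed everywhere, so this file keeps the three listed engines out of the cart, scripts, and mail form but does nothing to stop other crawlers.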
|