Home / Forums Index / Search Engines / Sitemaps, Meta Data, and robots.txt
Forum Library, Charter, Moderators: goodroi

Sitemaps, Meta Data, and robots.txt Forum

    
Need advice on my robots.txt file
new to robots.txt, can someone check this file
jwa55121

5+ Year Member



 
Msg#: 3055131 posted 12:34 am on Aug 22, 2006 (gmt 0)


I'm new to writing robots.txt files. I've read a few tutorials but I'm still not sure about mine. My question is: do I still need this code [User-agent: * Disallow:] in the robots.txt file to give the major search engines permission to index my site, and does having it in there give them permission to ignore the rest of the file below it?

This is what's in my file:

User-agent: *
Disallow:

User-agent: Titan
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-Agent: Googlebot-Image
Disallow: /images/

User-Agent: *
Disallow: /cgi-bin/
Disallow: /encrypt/
Disallow: /gotrythis/
Disallow: /rank/

User-Agent: Scooter
Disallow: /

Thanks for any help,
Jim

 

abates

10+ Year Member



 
Msg#: 3055131 posted 3:54 am on Aug 22, 2006 (gmt 0)

You have two "User-Agent: *" blocks. You should get rid of the first one, otherwise it will allow spiders to crawl the URLs you've disallowed in the second block.
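One way to see the effect of the duplicate block is to feed both versions of the file to a parser. The sketch below uses Python's standard urllib.robotparser on a trimmed-down copy of the file from the first post. "SomeBot" and the test paths are made-up names for illustration. Recent versions of this particular parser keep only the first "User-agent: *" record; other crawlers may combine duplicate records or treat them differently, so take this as one parser's behaviour and not a rule for every spider.

```python
from urllib.robotparser import RobotFileParser


def parser_for(text):
    """Build a parser from robots.txt text without fetching anything."""
    rp = RobotFileParser()
    rp.parse(text.splitlines())
    return rp


# Trimmed copy of the original file: two "User-agent: *" records.
ORIGINAL = """\
User-agent: *
Disallow:

User-agent: EmailCollector
Disallow: /

User-Agent: *
Disallow: /cgi-bin/
Disallow: /rank/
"""

# Same file with the first "User-agent: *" record removed.
MERGED = """\
User-agent: EmailCollector
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /rank/
"""

original = parser_for(ORIGINAL)
merged = parser_for(MERGED)

# "SomeBot" is a hypothetical crawler with no record of its own,
# so it falls back to the "*" record.
print(original.can_fetch("SomeBot", "/cgi-bin/test.cgi"))  # first "*" record wins
print(merged.can_fetch("SomeBot", "/cgi-bin/test.cgi"))    # /cgi-bin/ is now blocked
print(merged.can_fetch("EmailCollector", "/index.html"))   # named bot stays blocked
```

With a recent Python 3, the first call reports the /cgi-bin/ URL as crawlable because the empty Disallow record is the one that gets used, which matches the problem described above.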

bicycling

5+ Year Member



 
Msg#: 3055131 posted 4:53 am on Aug 22, 2006 (gmt 0)

Google has a robots.txt checker at [google.com...]; you might want to try that.
And I think you'd be best advised to remove your first block:
User-agent: *
Disallow:

The above means to disallow all bots from indexing your site.
If you don't care about getting ranked in the SERPs, you can leave that there :)

jwa55121

5+ Year Member



 
Msg#: 3055131 posted 5:07 pm on Aug 22, 2006 (gmt 0)

Thanks, abates, for the info; it's very helpful.

And to bicycling who wrote:
====================================
Google has a robots.txt checker at [google.com...]; you might want to try that.
And I think you'd be best advised to remove your first block:
User-agent: *
Disallow:

The above means to disallow all bots from indexing your site.
If you don't care about getting ranked in the SERPs, you can leave that there :)
=====================================

Thank you for your reply, bicycling. I believe the block above doesn't ask robots not to index the site; it allows them to index all of it. This is the block that disallows:

User-agent: *
Disallow: /

There has to be a / in the block to disallow spiders from indexing; that is what everybody else on the net is saying. Also, thanks for the location of that tool.

Jim
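The difference between the two blocks can be checked with a parser. This is a minimal sketch using Python's standard urllib.robotparser; the helper name can_crawl, the agent "SomeBot", and the path are assumptions made up for the example, not anything from the thread.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, agent, path):
    """Parse robots.txt text and ask whether `agent` may fetch `path`."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, path)


# Empty Disallow value: nothing is disallowed, so everything may be crawled.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# Disallow with a slash: every path starts with "/", so everything is blocked.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

print(can_crawl(ALLOW_ALL, "SomeBot", "/index.html"))  # True
print(can_crawl(BLOCK_ALL, "SomeBot", "/index.html"))  # False
```

An empty Disallow value matches nothing, while "Disallow: /" matches every URL on the site because every path begins with a slash.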

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved