
Sitemaps, Meta Data, and robots.txt Forum

    
Confirm I've done this right please
Essel

10+ Year Member



 
Msg#: 294 posted 4:29 pm on Feb 19, 2004 (gmt 0)

The contents of my robots.txt file are:

User-agent: googlebot
Disallow: *
User-agent: scooter
Disallow: *
User-agent: lycos
Disallow: *

I'm trying to ban Google, AltaVista and Lycos.

Thanks

 

Alternative Future

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 294 posted 4:33 pm on Feb 19, 2004 (gmt 0)

Hiya,

Shouldn't it be

Disallow: /

with / rather than *?
The * might work, but I'm open to being corrected on this one!

-gs
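
A minimal Python sketch of why / and * behave differently, assuming the original standard's rule that a Disallow value is a literal URL path prefix with no wildcard expansion. The function name is_disallowed is made up for illustration, and real crawlers may layer their own pattern matching on top of this:

```python
def is_disallowed(path, disallow_values):
    """Return True if path starts with any non-empty Disallow value.

    Hypothetical helper: plain prefix matching only, no wildcards.
    An empty Disallow value blocks nothing.
    """
    return any(bool(value) and path.startswith(value) for value in disallow_values)


if __name__ == "__main__":
    print(is_disallowed("/index.html", ["/"]))  # every path starts with "/"
    print(is_disallowed("/index.html", ["*"]))  # no path starts with "*"
```

Under this reading, Disallow: / covers every path on the site, while Disallow: * matches nothing, because no URL path begins with a literal *.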

Essel

10+ Year Member



 
Msg#: 294 posted 4:35 pm on Feb 19, 2004 (gmt 0)

You're correct according to [robotstxt.org...]

Thanks. The bit I'm unsure about is whether I can do

User-Agent: this, that, google, lycos, another
Disallow: /

Alternative Future

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 294 posted 4:37 pm on Feb 19, 2004 (gmt 0)

Don't think so. I just checked the WW one and some other larger websites, and they list each robot individually...

Edit: according to the link you gave, you could use * to cover all known robots, i.e.
User-agent: *
Disallow: /

This would ban all known robots that obey the robots.txt

-gs
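
One way to check a wildcard record is Python's standard urllib.robotparser module. This is only a sketch: the helper name blocked_agents and the test URL /index.html are assumptions for illustration, and the result shows how this one parser reads the file, not how any particular crawler behaves:

```python
from urllib.robotparser import RobotFileParser

# Wildcard record: applies to every robot that has no record of its own.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="/index.html"):
    """Return the agents that this parser would refuse for the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    print(blocked_agents(ROBOTS_TXT, ["googlebot", "scooter", "lycos", "Examplebot"]))
```

With Disallow: / under User-agent: *, every agent in the list comes back as blocked.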

bakedjake

WebmasterWorld Administrator bakedjake is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 294 posted 4:43 pm on Feb 19, 2004 (gmt 0)

AF, according to A Standard for Robot Exclusion [robotstxt.org], you are correct.

It should be:

User-agent: googlebot
Disallow: /
User-agent: scooter
Disallow: /
User-agent: lycos
Disallow: /

pageoneresults

WebmasterWorld Senior Member pageoneresults is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 294 posted 5:04 pm on Feb 19, 2004 (gmt 0)

It really should be...

User-agent: googlebot
Disallow: /

User-agent: scooter
Disallow: /

User-agent: lycos
Disallow: /
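
A quick way to sanity-check a file laid out like this, with one record per robot and a blank line between records, is Python's standard urllib.robotparser. This is a sketch only: the helper name can_crawl, the test URL, and the extra agent Examplebot are assumptions for illustration:

```python
from urllib.robotparser import RobotFileParser

# One record per robot, separated by blank lines.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /

User-agent: scooter
Disallow: /

User-agent: lycos
Disallow: /
"""


def can_crawl(agent, url="/index.html"):
    """Return True if this parser would let the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("googlebot", "scooter", "lycos", "Examplebot"):
        print(agent, can_crawl(agent))
```

The three named robots are refused, while a robot with no matching record and no wildcard record to fall back on is allowed.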

Essel

10+ Year Member



 
Msg#: 294 posted 5:04 pm on Feb 19, 2004 (gmt 0)

"This would ban all known robots that obey the robots.txt"

Is it possible to ban everything except Examplebot?

Does this work?

Allow: Examplebot

bakedjake

WebmasterWorld Administrator bakedjake is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 294 posted 5:07 pm on Feb 19, 2004 (gmt 0)

por: that's what I meant. ;-)

"Does this work?"

It depends on whether Examplebot honors the Allow directive. Don't forget that robots.txt is not access control; compliance is voluntary on the robot's part.

Not all spiders read robots.txt, and some spiders accept proprietary parameters in robots.txt.
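
For the "ban everything except Examplebot" case, one approach that avoids Allow altogether is to give Examplebot its own record with an empty Disallow and ban everyone else with the wildcard record. Allow is not part of the original standard, and where it is supported it takes a URL path, not a robot name. The sketch below checks that layout with Python's standard urllib.robotparser; the helper name can_crawl and the test URL are assumptions, and a robot that ignores robots.txt will ignore this too:

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow value means "nothing is disallowed" for that robot.
ROBOTS_TXT = """\
User-agent: Examplebot
Disallow:

User-agent: *
Disallow: /
"""


def can_crawl(agent, url="/index.html"):
    """Return True if this parser would let the agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ("Examplebot", "googlebot", "scooter"):
        print(agent, can_crawl(agent))
```

Examplebot matches its own record and is let through; every other robot falls back to the wildcard record and is refused.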

pageoneresults

WebmasterWorld Senior Member pageoneresults is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 294 posted 5:11 pm on Feb 19, 2004 (gmt 0)

Here's a great topic from jdMorgan about the robots.txt file...

Put your robots.txt on a diet [webmasterworld.com]
