
Forum Moderators: goodroi


Robots.txt Validator bug?

Christian Storm, one of the Turnitin Robot maintainers, says my robots.txt is wrong

   
9:43 am on Dec 16, 2002 (gmt 0)




Hi,

Basically, this is the problem. My robots.txt says this:

User-agent: *
Disallow: /*/pass/
Disallow: /noodle/
Disallow: bad.html

Which according to the protocol, as far as I can tell, is wrong, yet the validator approves it. Google obeys it, but the Turnitin Robot (and possibly others) does not.

I have made changes like this:

User-agent: *
Disallow: /noodle/
Disallow: /bad.html

i.e. the main issues seemed to be the wildcard and the missing leading slash on the file path.
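To illustrate the two issues, here is a minimal Python sketch of a checker. It is not the WebmasterWorld validator or any real robot's logic; the function names (`disallow_values`, `lint`, `blocked_by_prefix`) are made up for this example. It assumes a strict robot that treats every Disallow value as a literal path prefix, and it ignores User-agent grouping for simplicity.

```python
def disallow_values(robots_txt):
    """Return the Disallow values found in a robots.txt string."""
    values = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow":
            values.append(value.strip())
    return values


def lint(robots_txt):
    """Flag Disallow values that a strict prefix-matching robot may mishandle."""
    warnings = []
    for value in disallow_values(robots_txt):
        if value and not value.startswith("/"):
            warnings.append(f"{value!r}: path does not start with '/'")
        if "*" in value:
            warnings.append(f"{value!r}: wildcard in path is an extension")
    return warnings


def blocked_by_prefix(path, robots_txt):
    """True if a robot doing literal prefix matching would skip this path."""
    return any(v and path.startswith(v) for v in disallow_values(robots_txt))


ORIGINAL = """User-agent: *
Disallow: /*/pass/
Disallow: /noodle/
Disallow: bad.html
"""

REVISED = """User-agent: *
Disallow: /noodle/
Disallow: /bad.html
"""

for warning in lint(ORIGINAL):
    print(warning)
```

Under that assumption, `blocked_by_prefix("/bad.html", ORIGINAL)` is false because the URL path starts with a slash and the rule does not, while the same call against `REVISED` is true. A robot that supports wildcard extensions, as Google does, would behave differently on the first rule.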

Any suggestions? Is there a more recent protocol that I am missing, or is this a bug?

Cheers

Scott

PS I do hope this is the correct place to post this message!

10:02 am on Dec 16, 2002 (gmt 0)

martinibuster (WebmasterWorld Administrator)



The validator has been known to be wrong [webmasterworld.com] in the past.

It's ok to be a little skeptical.

I don't use a robots.txt. I'm sure some people have valid reasons for using one, such as banning bad bots.

But if you're not banning bad bots, and simply telling bots to crawl you, I'd rather keep confusion at bay and not put one up.

That's just my way of doing things.

11:15 am on Dec 17, 2002 (gmt 0)

brett_tabke (WebmasterWorld Administrator)



>Which according to the protocol

I debated that one for quite a while. It is not necessarily wrong. As you stated, it is accepted by Google.

I went ahead and put it in as a warning instead of a full error.

 
