Forum Moderators: goodroi
Basically, this is the problem: my robots.txt says this:
User-agent: *
Disallow: /*/pass/
Disallow: /noodle/
Disallow: bad.html
As far as I can tell, this is wrong according to the original protocol, yet it passes the validator. Google obeys it, but the Turnitin robot (and possibly others) does not.
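For what it's worth, you can see how a strict, original-protocol parser treats those rules using Python's standard urllib.robotparser, which does only plain prefix matching and doesn't understand wildcards. (The example.com host and paths below are just placeholders.)

```python
from urllib.robotparser import RobotFileParser

# Feed the original rules to Python's strict, wildcard-unaware parser.
rp = RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /*/pass/
Disallow: /noodle/
Disallow: bad.html
""".splitlines())

# /noodle/ is a plain prefix rule, so every parser should honour it.
print(rp.can_fetch("*", "http://example.com/noodle/page"))    # False (blocked)

# The "*" is treated as a literal character, so this is NOT blocked here.
print(rp.can_fetch("*", "http://example.com/abc/pass/page"))  # True (allowed)

# Without a leading slash, "bad.html" never prefix-matches a URL path.
print(rp.can_fetch("*", "http://example.com/bad.html"))       # True (allowed)
```

This matches what you're seeing: a crawler that implements only the original protocol will silently ignore the wildcard rule and the slash-less rule, while Google's extended parser honours both.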
I have made changes like this:
User-agent: *
Disallow: /noodle/
Disallow: /bad.html
i.e. the main issues seemed to be the wildcard and the missing leading slash on the path.
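A quick sanity check of the revised rules with the same standard-library parser suggests they behave as intended even under strict, prefix-only matching (host and paths are placeholders again):

```python
from urllib.robotparser import RobotFileParser

# Re-check the simplified rules: plain prefixes only, per the original protocol.
rp = RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /noodle/
Disallow: /bad.html
""".splitlines())

print(rp.can_fetch("*", "http://example.com/noodle/page"))  # False (blocked)
print(rp.can_fetch("*", "http://example.com/bad.html"))     # False (blocked)
print(rp.can_fetch("*", "http://example.com/other.html"))   # True (allowed)
```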
Any suggestions? - is there another more recent protocol that I am missing or is this a bug?
Cheers
Scott
PS I do hope this is the correct place to post this message!
It's ok to be a little skeptical.
I don't use a robots.txt. I'm sure some people have valid reasons for using one to ban bad bots.
But if you're not banning bad bots, and simply letting bots crawl your site, I'd rather keep confusion at bay and not put one up.
That's just my way of doing things.