Forum Moderators: goodroi
Basically, this is the problem: my robots.txt says this:
User-agent: *
Disallow: /*/pass/
Disallow: /noodle/
Disallow: bad.html
As far as I can tell, this is wrong according to the original protocol, yet it passes the validator. Google obeys it, but the Turnitin robot (and possibly others) does not.
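For what it's worth, you can see how a strict, original-protocol parser treats those rules using Python's standard urllib.robotparser, which does only plain prefix matching and doesn't understand wildcards. (The example.com host and paths below are just placeholders.)

```python
from urllib.robotparser import RobotFileParser

# Feed the original rules to Python's strict, wildcard-unaware parser.
rp = RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /*/pass/
Disallow: /noodle/
Disallow: bad.html
""".splitlines())

# /noodle/ is a plain prefix rule, so every parser should honour it.
print(rp.can_fetch("*", "http://example.com/noodle/page"))    # False (blocked)

# The "*" is treated as a literal character, so this is NOT blocked here.
print(rp.can_fetch("*", "http://example.com/abc/pass/page"))  # True (allowed)

# Without a leading slash, "bad.html" never prefix-matches a URL path.
print(rp.can_fetch("*", "http://example.com/bad.html"))       # True (allowed)
```

This matches what you're seeing: a crawler that implements only the original protocol will silently ignore the wildcard rule and the slash-less rule, while Google's extended parser honours both.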
I have made changes like this:
User-agent: *
Disallow: /noodle/
Disallow: /bad.html
i.e. the main issues seemed to be the wildcard and the missing leading slash on the path.
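A quick sanity check of the revised rules with the same standard-library parser suggests they behave as intended even under strict, prefix-only matching (host and paths are placeholders again):

```python
from urllib.robotparser import RobotFileParser

# Re-check the simplified rules: plain prefixes only, per the original protocol.
rp = RobotFileParser()
rp.parse("""\
User-agent: *
Disallow: /noodle/
Disallow: /bad.html
""".splitlines())

print(rp.can_fetch("*", "http://example.com/noodle/page"))  # False (blocked)
print(rp.can_fetch("*", "http://example.com/bad.html"))     # False (blocked)
print(rp.can_fetch("*", "http://example.com/other.html"))   # True (allowed)
```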
Any suggestions? - is there another more recent protocol that I am missing or is this a bug?
Cheers
Scott
PS I do hope this is the correct place to post this message!
It's ok to be a little skeptical.
I don't use a robots.txt. I'm sure some people have valid reasons for using one to ban bad bots.
But if you're not banning bad bots, and simply letting bots crawl your site, I'd rather keep confusion at bay and not put one up.
That's just my way of doing things.