Forum Moderators: goodroi
The only way to do it is to have a dynamic robots.txt file that serves a Disallow to every request except those from the robots you want whitelisted. This thread [webmasterworld.com] explains the basic idea.
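For anyone wondering what such a dynamic file might look like, here is a minimal sketch in Python. It is not the method from the linked thread, only an illustration of the idea: the whitelist entries, the function name `robots_txt_for`, and the WSGI wrapper `app` are all assumptions made up for this example. Keep in mind that the User-Agent header is easy to fake, so this only affects robots that identify themselves honestly.

```python
# Hypothetical whitelist: lowercase substrings of the User-Agent
# headers of the robots you want to let in (assumed for illustration).
ALLOWED_BOTS = ("googlebot", "slurp")

ALLOW_ALL = "User-agent: *\nDisallow:\n"    # empty Disallow = nothing is blocked
DENY_ALL = "User-agent: *\nDisallow: /\n"   # "/" = the whole site is blocked


def robots_txt_for(user_agent):
    """Return the robots.txt body to serve for a given User-Agent header."""
    ua = (user_agent or "").lower()
    if any(bot in ua for bot in ALLOWED_BOTS):
        return ALLOW_ALL
    return DENY_ALL


def app(environ, start_response):
    """Minimal WSGI app that answers every request with the dynamic file."""
    body = robots_txt_for(environ.get("HTTP_USER_AGENT", "")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

In practice you would map the /robots.txt URL to a script like this; how that is done depends on your server.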
My current robots.txt is this.
User-agent: *
Disallow: /
It has removed all my problems with robots but given me another one: my website traffic has dropped sharply.
I noticed that Google and Yahoo accounted for 90% of my search engine traffic, and I truly only need to stop a total of 4 or 5 robots. (It does not seem worth the trouble to write a dynamic script.)
Is it possible to do this in a robots.txt?
User-agent: alexa, askjeeves, etc, etc
Disallow: /
or
User-agent: alexa,
User-agent: askjeeves,
User-agent: msnbot
User-agent: another bot.
Disallow: /
I.e., directly disallow the robots that I do not want and let the others come in.
My current robots.txt is this.
User-agent: *
Disallow: /
Is it possible to do this in a robots.txt
User-agent: alexa
User-agent: askjeeves
User-agent: msnbot
User-agent: another-bot
Disallow: /
User-agent: alexa
Disallow: /

User-agent: Teoma
Disallow: /

User-agent: msnbot
Disallow: /

User-agent: another-bot
Disallow: /

User-agent: *
Disallow:
Jim
How would you know if the file is working or not?
You can use the robots.txt analysis feature in Google Webmaster Tools [google.com] to see how it works for Googlebot.
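If you want a rough local check before uploading anything, Python's standard library includes a robots.txt parser (`urllib.robotparser`). The sketch below feeds it two layouts discussed in this thread, one record per bot and several User-agent lines sharing one Disallow, and asks whether a few robot names may fetch "/". The helper `verdicts` and the list of names are made up for this example, and a real crawler may interpret the file differently from Python's parser, so treat the result as a sanity check rather than proof.

```python
from urllib.robotparser import RobotFileParser

# One record per unwanted bot, then a catch-all that allows everyone else.
SEPARATE = """\
User-agent: alexa
Disallow: /

User-agent: Teoma
Disallow: /

User-agent: msnbot
Disallow: /

User-agent: another-bot
Disallow: /

User-agent: *
Disallow:
"""

# The same bots grouped under a single Disallow line.
GROUPED = """\
User-agent: alexa
User-agent: Teoma
User-agent: msnbot
User-agent: another-bot
Disallow: /

User-agent: *
Disallow:
"""


def verdicts(robots_txt, agents, path="/"):
    """Map each agent name to True (may fetch path) or False (blocked)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


# Short robot names, not full User-Agent headers (assumed for illustration).
AGENTS = ["msnbot", "Teoma", "Googlebot", "Slurp"]

for name, text in (("separate", SEPARATE), ("grouped", GROUPED)):
    print(name, verdicts(text, AGENTS))
```

To check a file that is already live, `RobotFileParser` also has `set_url()` and `read()` methods that download it instead of parsing a string.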
I did not want to just install the new file and wait for my stats to update the hits on the robots.txt file.
--------
Update
I tested the file by excluding Googlebot.
Google knows about my site, but it also reports that the robots.txt file is blocking it.
For my own test, I placed the Googlebot entry last.
User-agent: ia_archiver
Disallow: /

User-agent: Slurp
Disallow: /

User-agent: googlebot
Disallow: /
So, to answer the original question: it is possible to exclude the robots you do not want and leave the Disallow value blank to permit the robots you do want.
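Two details of that test file can be double-checked locally with Python's `urllib.robotparser`: robot names are matched without regard to upper or lower case, and a robot with no matching record (and no `User-agent: *` record) is allowed by default. This is only a sketch of how Python's parser reads the file; the helper name `blocked` is made up here, and individual crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser

# The three-record test file, with no catch-all "User-agent: *" record.
TEST_FILE = """\
User-agent: ia_archiver
Disallow: /

User-agent: Slurp
Disallow: /

User-agent: googlebot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(TEST_FILE.splitlines())


def blocked(agent, path="/"):
    """True if this parser says the agent may not fetch the path."""
    return not parser.can_fetch(agent, path)


for agent in ("googlebot", "Googlebot", "IA_archiver", "msnbot"):
    print(agent, "blocked" if blocked(agent) else "allowed")
```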
-------
Additional question.
Is ia_archiver the only bot Alexa uses?
I did not see any others listed on robotstxt.org, and I want to be certain.
[edited by: Michel_Samuel at 6:09 am (utc) on May 2, 2007]