Forum Moderators: goodroi

Non-white list blocking

With a sting

   
12:58 am on Sep 26, 2008 (gmt 0)

5+ Year Member



For non-whitelisted bots I'm currently generating (with PHP) a robots.txt file that looks like this:

User-agent: *
Disallow: /

For reasons I'd prefer not to explain at the moment, is this allowed:

User-agent: *
Disallow: /
Disallow: /rpqtewz/

(Does the order make any difference?) Or this:

User-agent: *
Disallow: /rpqtewz/
Disallow: /

Or would something like this be better (might give me more options):

User-agent: *
Disallow: /rpqtewz.gif
Disallow: /

Thanks,
Phred

2:09 pm on Sep 29, 2008 (gmt 0)

goodroi (WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month)



Hi Phred,

I am not sure I understand what you are trying to do.

When you use "Disallow: /" in your robots.txt, it tells robots not to visit anything. So it does not matter whether you also list specific folders to disallow, since you have already told the robots to stay out of every folder.
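To check the point about ordering, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `allowed` and the agent string `ExampleBot` are made up for illustration, and this only shows how Python's parser reads the rules; other crawlers may match rules differently, and bad bots ignore the file entirely.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a rule-following crawler may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The two orderings from the question, plus a trap-only file for comparison.
BLOCK_FIRST = "User-agent: *\nDisallow: /\nDisallow: /rpqtewz/\n"
TRAP_FIRST = "User-agent: *\nDisallow: /rpqtewz/\nDisallow: /\n"
TRAP_ONLY = "User-agent: *\nDisallow: /rpqtewz/\n"

if __name__ == "__main__":
    for name, body in [("block first", BLOCK_FIRST),
                       ("trap first", TRAP_FIRST),
                       ("trap only", TRAP_ONLY)]:
        for path in ("/index.html", "/rpqtewz/page.html"):
            print(f"{name:12} {path:20} allowed={allowed(body, path)}")
```

With "Disallow: /" present, both orderings come out the same for a compliant crawler: every path is disallowed. The extra line only matters as bait for bots that read robots.txt and then ignore it.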

8:55 pm on Sep 29, 2008 (gmt 0)

5+ Year Member



It's a bot trap: anyone hitting the file or directory could only have learned about it from robots.txt, so I can take appropriate action. The name is uniquely generated, which allows tracking a hit back to, among other things, the date, time, IP and UA.

Phred
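The unique-name idea could be sketched roughly as follows. This is not Phred's actual PHP; it is a Python illustration under assumptions: the function names `make_robots_txt` and `trace_hit`, the in-memory `issued` dictionary, and the random hex token format are all hypothetical.

```python
import secrets
from datetime import datetime, timezone

# Hypothetical in-memory record of issued trap names.
# A real site would more likely keep this in a database or log file.
issued = {}


def make_robots_txt(ip: str, user_agent: str) -> str:
    """Build a robots.txt body whose trap directory is unique to this request."""
    token = secrets.token_hex(8)  # 16 random hex characters
    issued[token] = {
        "time": datetime.now(timezone.utc),
        "ip": ip,
        "ua": user_agent,
    }
    return f"User-agent: *\nDisallow: /{token}/\nDisallow: /\n"


def trace_hit(path: str):
    """Return the record of the robots.txt request that exposed `path`, or None."""
    token = path.strip("/").split("/")[0]
    return issued.get(token)
```

When a later request lands on a trap directory, `trace_hit` returns the date, IP and UA of whoever was originally served that name, which may differ from the client now making the request.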

12:51 pm on Sep 30, 2008 (gmt 0)

goodroi (WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month)



I would argue that you have made your entire site into a bot trap. I can understand why you would want to list one specific folder to make it easier to identify bad bots. Since you are looking for bad bots, the order does not matter. Bad bots do not honor robots.txt and are often looking to exploit it.

In the past I have had fun creating bot traps. I make up folders that human spies and bad bots would love to get into but that do not really exist on my sites. Here is a quick list of folder names I have used for bot traps:
/creditcardnumbers/
/customerdatabase/
/salesreport/
/passwords/
/private/
/ssn-data/
/secret/
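
As a rough illustration of how requests for decoy folders like these might be flagged, here is a small Python sketch. The function name `is_decoy_request` is hypothetical, and what to do with a flagged request (log it, block the IP, and so on) is left open.

```python
# Decoy folder names taken from the list above.
DECOY_PREFIXES = (
    "/creditcardnumbers/",
    "/customerdatabase/",
    "/salesreport/",
    "/passwords/",
    "/private/",
    "/ssn-data/",
    "/secret/",
)


def is_decoy_request(path: str) -> bool:
    """Return True if the requested path falls inside one of the decoy folders."""
    # Drop any query string, ignore case, and normalise the trailing slash
    # so that "/private" and "/private/" are treated the same way.
    candidate = path.split("?", 1)[0].lower().rstrip("/") + "/"
    return candidate.startswith(DECOY_PREFIXES)
```

Because the folders do not exist, no legitimate link should ever point into them, so any match is worth a closer look at the client's IP and user agent.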