is robots.txt safe?

how to allow SE access and disallow crackers

nucleus

6:58 pm on Apr 10, 2004 (gmt 0)

10+ Year Member



Hi, I'm kind of new to robots.txt. It seems unsafe to me to put a robots.txt in my site root, because everybody knows that is where it is supposed to be. I DO want search engines (SEs) to read it, of course, so that they don't index pages with irrelevant or sensitive info. What is the solution (short of using session tracking or login requirements on sensitive pages)? Can I give access to SEs but deny everybody else? Can somebody impersonate an SE?

Thanks.

pmkpmk

7:04 pm on Apr 10, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi nucleus, welcome to WebmasterWorld!

I'm afraid there is no solution for this. You need to have the robots.txt to shut off certain areas of your site where spiders are not supposed to be.

But at the same time, this is, of course, a dead giveaway for anybody who means harm. I myself read the robots.txt files of competitors' sites regularly, and any cracker/hacker who's looking for interesting stuff is bound to look at them too.

There are so-called spambot traps, which list a bait page as disallowed in robots.txt. Well-behaved spiders stay away from it, so any bot that requests it anyway can have its IP address locked out. But it's only a partial solution.
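A minimal sketch of the spambot-trap idea, assuming the bait lives under a hypothetical `/trap/` path that robots.txt disallows and that requests are available as `(ip, path)` pairs. The names `TRAP_PATH`, `find_offenders` and `is_blocked` are made up for illustration, not taken from any real trap script.

```python
# Hypothetical bait path; it would be listed under "Disallow:" in robots.txt
# and never linked from any page a human visitor would follow.
TRAP_PATH = "/trap/"


def find_offenders(requests, trap_path=TRAP_PATH):
    """Return the set of IPs that requested anything under the trap path.

    `requests` is an iterable of (ip, path) pairs, e.g. parsed from an
    access log. A polite spider never shows up here, because it obeys
    the Disallow rule.
    """
    return {ip for ip, path in requests if path.startswith(trap_path)}


def is_blocked(ip, offenders):
    """True if this IP has already walked into the trap."""
    return ip in offenders


# Example log: one ordinary visitor, one bot that ignored robots.txt.
log = [
    ("198.51.100.7", "/index.html"),
    ("203.0.113.9", "/trap/secret.html"),
]
offenders = find_offenders(log)
```

As noted above, this only catches bots that actually request the bait; a person reading robots.txt by hand and picking targets selectively is not stopped by it.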

nucleus

7:26 pm on Apr 10, 2004 (gmt 0)

10+ Year Member



Thanks for the input. I've been thinking: only a moron would put specific URIs in a public robots.txt. I guess a smart thing to do would be to disallow an entire folder, place all the "disallowable" files there, and not provide a standard default index page for that folder. Of course, you couldn't place a link to that index page either, unless the link was on a disallowed "meta robots none" page.

So if I had, e.g.:
User-agent: *
Disallow: /private/

then what a cracker would have to do is search from the root and follow links manually until s/he found their way inside (unless programs like Xenu's Link Sleuth disregard robots.txt and meta noindex/nofollow info; I'll have to test this out if nobody already has).
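To illustrate how a well-behaved crawler would read the two-line robots.txt above, here is a small sketch using Python's standard `urllib.robotparser`. The agent name `ExampleBot` and the helper `polite_can_fetch` are assumptions for the example. Note that the check runs on the client side: the rule is a request, and nothing on the server enforces it.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def polite_can_fetch(path, agent="ExampleBot"):
    """Ask the parser whether a rule-abiding bot may fetch `path`.

    `ExampleBot` is a made-up agent name; it falls under the
    "User-agent: *" rule.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


blocked = polite_can_fetch("/private/report.html")
allowed = polite_can_fetch("/index.html")
```

A client that simply skips this check can still request anything under /private/ directly, which is why the folder trick hides file names but does not protect the files themselves.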

pmkpmk

7:34 pm on Apr 10, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Can you give us a hint on WHAT you actually want to hide?

I did a little "spying/hacking" on the site in your profile. It seems the subdirectories mentioned in your robots.txt are already reasonably safe from the occasional script kiddie.

conor

11:22 am on Apr 18, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Why not cloak your robots file, if you are that worried?
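One way cloaking could look, as a rough sketch rather than a recommended setup: serve the real rules only to clients that pass a forward-confirmed reverse DNS check, and a reveal-nothing file to everyone else. Since a User-agent string can be faked by anybody, the check uses the IP address instead. The trusted suffixes, both robots.txt bodies and the function names are assumptions for illustration; the DNS lookups can be swapped out, which also makes the logic testable offline.

```python
import socket

# Assumption: hostname suffixes of crawlers the site owner trusts.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")

# Assumption: what ordinary visitors see versus what verified crawlers see.
PUBLIC_ROBOTS = "User-agent: *\nDisallow: /\n"
CRAWLER_ROBOTS = "User-agent: *\nDisallow: /private/\n"


def is_verified_crawler(ip, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check.

    1. Reverse-resolve the IP to a hostname.
    2. Require the hostname to end in a trusted suffix.
    3. Forward-resolve that hostname and require the original IP in the result.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)
        if not host.endswith(TRUSTED_SUFFIXES):
            return False
        return ip in forward(host)
    except OSError:
        # Lookup failures (no PTR record, unknown host) count as unverified.
        return False


def robots_for(ip, reverse=None, forward=None):
    """Pick the robots.txt body to serve to this client IP."""
    if is_verified_crawler(ip, reverse, forward):
        return CRAWLER_ROBOTS
    return PUBLIC_ROBOTS
```

In a real server the result of `robots_for` would be returned for requests to /robots.txt, and the DNS answers would likely need caching, since two lookups per request are slow.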