

Hiding robots.txt from browsers

getting 501 error

   
6:47 pm on Nov 7, 2007 (gmt 0)

5+ Year Member



Hello,

I am trying to hide robots.txt from browsers, but not from robots.

I am using this code in my .htaccess file:

RewriteCond {HTTP_USER_AGENT} ^Mozilla
RewriteCond %{HTTP_USER_AGENT}!(Slurp|surfsafely)
RewriteRule ^robots\.txt$ /someotherfile [L]

But then I get a 501 error.

Is something wrong with the code?

Thanks

3:24 pm on Nov 9, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm sorry, I have to ask what you are trying to accomplish here.

Anyone who knows what a robots.txt file is and wants to read yours will be able to just spoof the user agent. I don't think that's any kind of secret hacker knowledge. And regular users will never see the robots.txt file.

If you really want to lock your robots.txt file down, you may have better luck allowing access only to the IP addresses of known spiders.
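
Something along these lines might do it. This is only a rough sketch, and the IP prefixes shown are placeholders; you would have to substitute the current, verified ranges of the spiders you actually want to allow:

RewriteEngine On
# Deny robots.txt to everyone except whitelisted spider ranges.
# The prefixes below are examples only; verify the real ranges yourself.
RewriteCond %{REMOTE_ADDR} !^66\.249\.
RewriteCond %{REMOTE_ADDR} !^72\.30\.
RewriteRule ^robots\.txt$ - [F]

Both conditions are ANDed, so only a request from outside every listed range gets the 403.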

3:55 pm on Nov 9, 2007 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I agree. robots.txt and your custom 403 error page are two files that should be universally accessible, even to banned IP addresses and user-agents. The reasoning here is that if robots.txt is inaccessible, robots are likely to interpret that as meaning that the entire site may be spidered. And of course, if you return a 403 response when a banned user-agent tries to access your custom 403 page, then your server ends up in a loop.
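
For what it's worth, one way to guarantee that is to pass those two files through before any banning rules run. This is just a sketch, and /403.shtml is an assumed name; use whatever file your ErrorDocument actually points at:

# Let robots.txt and the custom 403 page through untouched, before the
# banning rules below get a chance to block them.
RewriteRule ^(robots\.txt|403\.shtml)$ - [L]
# ... banning RewriteCond/RewriteRule lines go after this point ...

ErrorDocument 403 /403.shtml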

That said, I'm not sure about a 501-Not Implemented error, but the missing "%" in your first RewriteCond may cause a 500-Server Error.
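
For reference, here is a corrected sketch of that snippet with the "%" restored; note that Apache also needs a space before the "!" so the condition pattern is read as a separate argument. /someotherfile is just the placeholder name from the first post:

# Serve a different file for Mozilla user-agents that are not Slurp or surfsafely.
RewriteCond %{HTTP_USER_AGENT} ^Mozilla
RewriteCond %{HTTP_USER_AGENT} !(Slurp|surfsafely)
RewriteRule ^robots\.txt$ /someotherfile [L]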

Jim

5:42 pm on Nov 9, 2007 (gmt 0)

5+ Year Member



In fact, I want to hide it from my competitors.

I got the idea of hiding robots.txt from this thread:
[webmasterworld.com...]

Thanks

9:35 pm on Nov 9, 2007 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Be prepared to monitor it closely then, and do not make any mistakes with user-agents or IP address ranges. Check the major search engine spider IP address ranges at least once every week so you don't block them and lose your rankings. This is cloaking, and cloaking successfully is a full-time job.

Just my opinion, but I think there are better things to spend your time on. Why not let your competitors see your robots.txt, but add a few extra Disallow entries for URL-prefixes that don't really exist and are not linked from anywhere on the Web? Then if you ever see an access to one of those fake Disallowed URL-prefixes, you can rewrite it to a script that bans the IP address. ;)
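
A rough sketch of that idea; the URL-prefix and script name here are made up, and the banning script itself is something you would have to write:

In robots.txt:

User-agent: *
Disallow: /private-reports/

In .htaccess:

# Nothing real lives under /private-reports/ and nothing links to it,
# so any request for it came from someone who read robots.txt and
# deliberately ignored it. Hand the request to a banning script.
RewriteRule ^private-reports/ /ban-ip.php [L]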

Jim

9:39 pm on Nov 9, 2007 (gmt 0)

5+ Year Member



Cool trick!

I will try it.

Thanks.

 
