
Hiding robots.txt from browsers

getting 501 error

     
6:47 pm on Nov 7, 2007 (gmt 0)

Junior Member

5+ Year Member

joined:July 27, 2007
posts: 76
votes: 0


Hello,

I am trying to hide robots.txt from browsers, but not from robots.

I am using this code in my .htaccess file:

RewriteCond {HTTP_USER_AGENT} ^Mozilla
RewriteCond %{HTTP_USER_AGENT}!(Slurp|surfsafely)
RewriteRule ^robots\.txt$ /someotherfile [L]

But then I get a 501 error.

Is something wrong with the code?

Thanks

3:24 pm on Nov 9, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Aug 1, 2003
posts:815
votes: 0


I'm sorry, I have to ask what you are trying to accomplish here.

Anyone who knows what a robots.txt file is and wants to read yours will be able to just spoof the user agent. I don't think that's any kind of secret hacker knowledge. And regular users will never see the robots.txt file.

If you really want to lock your robots.txt file down, you may have better luck allowing access only to the IP addresses of known spiders.
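
For instance, something along these lines (the 66.249. range is Googlebot's published crawl range; verify the current ranges for every engine you allow before relying on them):

# Serve robots.txt only to known spider IP ranges; everyone else gets a 403
# Add one RewriteCond per allowed range
RewriteEngine On
RewriteCond %{REMOTE_ADDR} !^66\.249\.
RewriteRule ^robots\.txt$ - [F]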

3:55 pm on Nov 9, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


I agree. robots.txt and your custom 403 error page are two files that should be universally accessible, even to banned IP addresses and user-agents. The reasoning here is that if robots.txt is inaccessible, robots are likely to interpret that as meaning the entire site may be spidered. And of course, if you return a 403 response when a banned user-agent tries to access your custom 403 page, your server ends up in a loop.
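
If you do ban by user-agent or IP address, one way to keep those two files reachable is to exclude them in the banning rule itself. A rough sketch ("badbot" stands in for whatever pattern you actually ban, and I'm assuming your custom error page lives at /403.html):

# Ban the bad agent everywhere except robots.txt and the 403 page
RewriteCond %{HTTP_USER_AGENT} badbot [NC]
RewriteRule !^(robots\.txt|403\.html)$ - [F]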

That said, I'm not sure about a 501-Not Implemented error, but the missing "%" in your first RewriteCond may cause a 500-Server Error.
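
For reference, with the "%" added (and a space before the negated pattern, which mod_rewrite also requires), the rules from the first post would read:

RewriteCond %{HTTP_USER_AGENT} ^Mozilla
RewriteCond %{HTTP_USER_AGENT} !(Slurp|surfsafely)
RewriteRule ^robots\.txt$ /someotherfile [L]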

Jim

5:42 pm on Nov 9, 2007 (gmt 0)

Junior Member

5+ Year Member

joined:July 27, 2007
posts: 76
votes: 0


In fact, I want to hide it from my competitors.

I got the idea of hiding robots.txt from here:
[webmasterworld.com...]

Thanks

9:35 pm on Nov 9, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


Be prepared to monitor it closely then, and do not make any mistakes with user-agents or IP address ranges. Check the major search engine spider IP address ranges at least once every week so you don't block them and lose your rankings. This is cloaking, and cloaking successfully is a full-time job.

Just my opinion, but I think there are better things to spend your time on. Why not let your competitors see your robots.txt, but put a few extra Disallow entries in there for paths that don't really exist and are not linked from anywhere on the Web? Then if you ever see an access to one of those Disallowed fake URL-prefixes, you can rewrite it to a script that bans the IP address. ;)
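
A sketch of the idea (the /secret-stuff/ prefix and the ban-ip.php script are made-up names, not anything standard):

In robots.txt:

User-agent: *
Disallow: /secret-stuff/

In .htaccess:

# /secret-stuff/ exists only in robots.txt, so any request for it
# came from something that read the file and ignored the Disallow
RewriteRule ^secret-stuff/ /ban-ip.php [L]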

Jim

9:39 pm on Nov 9, 2007 (gmt 0)

Junior Member

5+ Year Member

joined:July 27, 2007
posts: 76
votes: 0


Cool trick..

I will try it.

Thanks.