
Forum Moderators: Ocean10000 & incrediBILL & phranque


blocking anonymous user agents

but not from robots.txt



11:56 am on Jan 15, 2004 (gmt 0)

Inactive Member
Account Expired


I've been breaking my brain on this for a week now. I've got the following in a .htaccess file to block requests that send neither a referrer nor a user agent:

RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule !^robots.txt$ - [F,L]

I'm trying to exclude robots.txt from this rule because at least one search engine requests robots.txt anonymously. However, that request is still being blocked:

193.***.115.6 - - [15/Jan/2004:04:11:44] "GET /robots.txt HTTP/1.1" 403 295 "-" "-"

What am I doing wrong?!?

[edited by: jdMorgan at 9:10 pm (utc) on Jan. 16, 2004]
[edit reason] Generalized specific IP address [/edit]

8:16 pm on Jan 15, 2004 (gmt 0)

jdmorgan (Senior Member)

joined:Mar 31, 2002
votes: 0


Your code doesn't look broken. The request may be getting blocked for some other reason.
You should escape the dot in robots.txt (unescaped, it matches any character), and [L] used with [F] is redundant because [F] already ends rule processing, but your code should have worked fine in this case. (It looks almost like mine, which works.)

RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule !^robots\.txt$ - [F]
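
If another rule further down in the same .htaccess file turns out to be the culprit, one way to test that is to let robots.txt skip the remaining rewrite rules altogether. This is only a sketch, not something either poster is known to have run. It assumes RewriteEngine is already on and that the pass-through rule sits above every other RewriteRule in the file.

```apache
# Assumption: these lines come before all other RewriteRules in .htaccess.
# "-" means no substitution; [L] stops processing the rules that follow,
# so later blocking rules never get to test a robots.txt request.
RewriteRule ^robots\.txt$ - [L]

# Block requests that send neither a referrer nor a user agent.
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule .* - [F]
```

This only bypasses mod_rewrite rules. A 403 produced by access-control directives such as "Deny from" would still apply, so if robots.txt is still refused with this in place, the block is probably coming from somewhere other than the rewrite rules.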

I assume that you have other working mod_rewrite code in your .htaccess file, and that this problem is not systemic. If this code is in httpd.conf, you'll need to add a "/" ahead of "robots.txt".
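
To illustrate the httpd.conf point: in server or virtual-host context the pattern is matched against the full URL-path, which starts with a slash, whereas in .htaccess the directory prefix is stripped first. The second variant below is a sketch of one way to avoid depending on that difference, by testing %{REQUEST_URI} (which always carries the leading slash) in a condition instead of in the rule pattern. Both assume RewriteEngine is already on.

```apache
# Variant for httpd.conf (server or <VirtualHost> context):
# the pattern sees "/robots.txt", so the leading slash is required.
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule !^/robots\.txt$ - [F]
```

```apache
# Context-independent variant (use one variant or the other, not both):
# the robots.txt exclusion moves into a condition on REQUEST_URI.
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{HTTP_REFERER} ^$
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule .* - [F]
```

In both variants the RewriteCond lines are ANDed by default, so a request is refused only when every condition holds.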



11:06 am on Jan 16, 2004 (gmt 0)

Inactive Member
Account Expired


Thanks - I've tried with and without escaping the dot with no apparent effect. My only thought is that the spider might be passing "-" instead of "" as the UA (they are Polish) so I'm going to try something like:

RewriteCond %{HTTP_USER_AGENT} ^-?$

[Edit: that really didn't make sense, did it? They wouldn't be blocked in that case.]