Forum Moderators: phranque


Can't block site. Does not read robots.txt. Gets around .htaccess file


Joey123

8:19 pm on Aug 15, 2005 (gmt 0)

10+ Year Member



Using my .htaccess file, I have blocked the IP address I see when I check my website directly through dead-links.com, and I have also blocked the IP address listed for dead-links.com in WHOIS.

This site accounts for my last 300 hits, and I want it stopped. How is it getting around the .htaccess file? What else can I do?

Here is my .htaccess file:

<Files .htaccess>
order allow,deny
deny from all
</Files>

#problem-site.com

<Limit GET>
order deny,allow
deny from 216.xx.xx.0
deny from 216.xx.xx.255
</Limit>

#problem-site.com

<Limit GET>
order deny,allow
deny from 64.xx.xx.0
deny from 64.xx.xx.255
</Limit>

RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^online_link_validator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.* - [F]

[edited by: jatar_k at 8:25 pm (utc) on Aug. 15, 2005]
[edit reason] removed specifics [/edit]

jdMorgan

10:00 pm on Aug 15, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Only one Order directive is allowed per .htaccess file; any others will be ignored, so that's probably the main problem. Also, if you Limit your denies to GET requests only, then POST, DELETE, etc. will not be denied.

If you wish to deny a range of addresses, say from 192.123.123.0 through 192.123.123.255, then you must use a form of the Deny from directive that specifies a range, such as CIDR notation (192.123.123.0/24). Note that the Order directive does not specify the order of your Allow/Deny directives in the .htaccess file; it specifies their priority. See mod_access [httpd.apache.org] for more details.
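As a quick sanity check that a CIDR suffix covers the intended range (the /24 here matches the 192.123.123.0 through 192.123.123.255 example above; the addresses are illustrative), Python's standard ipaddress module can test membership:

```python
# Sketch: confirm that a /24 CIDR block spans the example range
# 192.123.123.0 through 192.123.123.255 from the post above.
import ipaddress

network = ipaddress.ip_network("192.123.123.0/24")

# A /24 spans 256 addresses, .0 through .255.
print(network.num_addresses)                               # 256
print(ipaddress.ip_address("192.123.123.255") in network)  # True - inside the range
print(ipaddress.ip_address("192.123.124.1") in network)    # False - outside the range
```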

Because of the single-Order directive requirement, let's move your .htaccess file protection into mod_rewrite just to keep things simple.


# Set up for mod_access code below
SetEnvIf Request_URI "(custom_403_page\.html|robots\.txt)$" allowit
#
<Files *>
Order Deny,Allow
#
# Deny from problem-site.com
Deny from 216.xx.xx.0/24
Deny from 64.xx.xx.0/24
#
# Allow everybody access to robots.txt and custom 403 error page
Allow from env=allowit
#
</Files>
#
# Block access to .htaccess and block specific user-agents
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_URI} \.htaccess$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^online_link_validator [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule .* - [F]

This code will block access to the requested files. It will not stop the requests from being logged in your access log or in your 'stats'.
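One detail worth noting: the [NC] flag makes only the first RewriteCond's user-agent match case-insensitive; the EmailCollector condition, with no [NC], is case-sensitive. A small Python sketch of that matching logic (the user-agent strings are illustrative):

```python
# Sketch of the two RewriteCond lines above: ^online_link_validator
# carries [NC] (case-insensitive); ^EmailCollector does not.
import re

def is_blocked(user_agent: str) -> bool:
    # re.match anchors at the start of the string, mirroring the ^ anchor.
    if re.match(r"online_link_validator", user_agent, re.IGNORECASE):
        return True
    if re.match(r"EmailCollector", user_agent):
        return True
    return False

print(is_blocked("Online_Link_Validator/1.0"))  # True - matched despite different case
print(is_blocked("emailcollector"))             # False - this condition is case-sensitive
```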

Jim