# libww Blocks W3C-checklink/4.81 libwww-perl/5.836
SetEnvIf Remote_Addr ^128\.30\.52 !keep_out
RewriteCond %{HTTP_USER_AGENT} "Firefox/10\." [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Firefox/11\." [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Firefox/12\." [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Firefox/13\." [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Firefox/14\." [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Firefox/15\." [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Firefox/16\." [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Firefox/17\." [NC,OR]
RewriteCond %{HTTP_USER_AGENT} "Firefox/19\." [NC,OR]
deny from colocall.net
<IfModule mod_setenvif.c>
htaccess Performance Killers
In most cases I find that blocking by domain works better for me than keeping an endless list of IPs.
Why is blocking by domain slower than blocking a list of IPs?
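One likely answer, going by the Apache documentation for Allow/Deny: a hostname in a deny line makes the server do a double reverse DNS lookup on the visitor's address, whatever HostnameLookups is set to, while an IP or CIDR range is compared directly. A minimal sketch of the two forms; the address range is a placeholder from the documentation block, not colocall.net's real range:

```apache
# Hostname form: the server has to resolve each visitor's address
# back to a hostname before it can compare
deny from colocall.net

# Address form: compared directly against the remote address.
# 192.0.2.0/24 is a placeholder range, NOT the real range for any host.
deny from 192.0.2.0/24
```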
Cleaning up an htaccess file
Step 1: Organize. Collect all the directives for each module in one place. The server doesn't care, but you-- and anyone who comes along after you-- will appreciate it.
Tip: Use a text editor with a "Find All" window to pull up all lines beginning with the element "Rewrite..." That takes care of mod_rewrite; dump them all at the end for now.
Step 2: Get rid of all <IfModule> envelopes. Not their contents, just the envelopes themselves. These envelopes are hallmarks of mass-produced htaccess files that have to work anywhere, on any server. You are now on your own site. Any given mod is either available to you or it isn't.
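A small before-and-after sketch of Step 2, assuming mod_expires is what was wrapped; the directive inside is only an illustration. The old envelope is shown commented out so the fragment stays valid:

```apache
# Before: envelope left over from a one-size-fits-all file
# <IfModule mod_expires.c>
# ExpiresActive On
# </IfModule>

# After: same contents, no envelope
ExpiresActive On
```

One side effect worth knowing: without the envelope, a missing module causes a server error instead of being silently skipped, so you find out right away whether the mod is available.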
Step 3: Sort by module. The server doesn't care what order the directives are listed in, or even whether rules from different modules are all jumbled together. Each module works separately, seeing only its own directives. But humans need to be able to find things.
For most people it will be most practical to group one-liners at the beginning:
Options -Indexes
is a good start. If your htaccess file contains only one line, that's probably it. Other quick directives are ones starting with words like AddCharset or Expires. Then list your error documents.
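As a rough sketch, the opening block of one-liners might look like this. The charset, expiry period and error-document filenames are assumptions for illustration, not recommendations:

```apache
# Quick one-line directives first
Options -Indexes
AddCharset UTF-8 .html
ExpiresActive On
ExpiresDefault "access plus 1 month"

# Then error documents (hypothetical filenames, each starting with /)
ErrorDocument 403 /forbidden.html
ErrorDocument 404 /missing.html
```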
If you have any very short Files or FilesMatch envelopes, put them near the top too. For example:
<Files "robots.txt">
Order Allow,Deny
Allow from all
</Files>
<FilesMatch "\.(css|js)$">
Header set X-Robots-Tag "noindex"
</FilesMatch>
Be sure to have an "Allow from all" envelope for your custom 403 page. If you are on shared hosting and they provide default error-document names such as "forbidden.html", this has probably already been done in the config file. But it does no harm to repeat it.
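A minimal sketch of that envelope, assuming the custom 403 page is the "forbidden.html" mentioned above and the same Allow/Deny syntax as the robots.txt example:

```apache
ErrorDocument 403 /forbidden.html

# Everyone, including blocked visitors, may fetch the 403 page itself
<Files "forbidden.html">
Order Allow,Deny
Allow from all
</Files>
```

On Apache 2.4 without mod_access_compat, the equivalent inside the envelope would be "Require all granted".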
Step 4: Consolidate redirects.
Step 4a: Get rid of mod_alias. If your htaccess file contains any mod_rewrite directives, it shouldn't also use mod_alias (Redirect... by that name), or things may happen in the wrong order. For large-scale updating, use these regular expressions, changing \1 to $1 if that's what your text editor uses. Each of these can safely be run as an unsupervised global replace.
# change . to \. in pattern
^(Redirect \d\d\d \S+?[^\\])\.
TO
\1\\.
# now change Redirect to Rewrite
^Redirect(?:Match)? 301 /(.+)
TO
RewriteRule \1 [R=301,L]
# and if needed
^Redirect(?:Match)? 410 /(.+)
TO
RewriteRule \1 - [G]
^Redirect(?:Match)? 403 /(.+)
TO
RewriteRule \1 - [F]
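To make the replacements concrete, here is a worked example on two made-up lines; the paths and www.example.com are hypothetical. The original mod_alias lines are shown as comments, followed by what the replaces above should produce:

```apache
# Original:  Redirect 301 /old/page.html http://www.example.com/new/page.html
# After the dot-escaping replace:
#            Redirect 301 /old/page\.html http://www.example.com/new/page.html
# After the Redirect-to-Rewrite replace:
RewriteRule old/page\.html http://www.example.com/new/page.html [R=301,L]

# Original:  Redirect 410 /gone.html
# After both replaces:
RewriteRule gone\.html - [G]
```

Note that the dot-escaping replace fixes one dot per pass, so a path with several dots needs several runs. The replaces also don't add anchors; it may be worth adding ^ and $ by hand afterward where a pattern should match only one exact path.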
Step 4b: Sort your RewriteRules. At the beginning is the single line
RewriteEngine on
A RewriteBase is almost never needed; get rid of any lines that mention it. Instead, make sure every target begins with either protocol-plus-domain or a slash / for the root.
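A short sketch of the two kinds of target, with made-up paths and www.example.com standing in for your own domain:

```apache
RewriteEngine on

# External redirect: target begins with protocol plus domain
RewriteRule ^old/page\.html$ http://www.example.com/new/page.html [R=301,L]

# Internal rewrite: target begins with a slash for the root
RewriteRule ^catalog/(\d+)$ /catalog.php?id=$1 [L]
```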
Sort RewriteRules twice.
First group them by severity. Access-control rules (flag [F]) go first. Then any 410s (flag [G]). Not all sites will have these. Then external redirects (flag [R=301,L] unless there is a specific reason to say something different). Then simple rewrites (flag [L] alone). Finally, there may be a few rules without an [L] flag, such as ones that set cookies or environment variables.
Function overrides flag. If your redirects are so complicated that they've been exiled to a separate .php file, the RewriteRule will have only an [L] flag. But group it with the external redirects. If certain users are forcibly redirected to an "I don't like your face" page, the RewriteRule will have an R flag. But group it with the access-control [F] rules.
Then, within each functional group, list rules from most specific to most general. In most htaccess files, the second-to-last external redirect will take care of "index.html" requests. The very last one will fix the domain name, such as with/without www.
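Put together, the sort order described above might look something like this skeleton. Every path, user-agent name and hostname here is a placeholder, and the index.html and domain rules are one common way of writing them, not the only one:

```apache
RewriteEngine on

# 1. Access control
RewriteCond %{HTTP_USER_AGENT} (80legs|zyborg) [NC]
RewriteRule . - [F]

# 2. Gone
RewriteRule ^retired/ - [G]

# 3. External redirects, most specific first
RewriteRule ^old/page\.html$ http://www.example.com/new/page.html [R=301,L]

# second-to-last redirect: index.html requests
# (the condition looks at the original request, so internal
# DirectoryIndex lookups are left alone)
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.html
RewriteRule ^(([^/]+/)*)index\.html$ http://www.example.com/$1 [R=301,L]

# last redirect: fix the domain name
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

# 4. Internal rewrites
RewriteRule ^catalog/(\d+)$ /catalog.php?id=$1 [L]

# 5. Rules without [L], such as environment variables
RewriteRule \.pdf$ - [E=is_pdf:1]
```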
Leave a blank line after each RewriteRule, and put a # comment before each ruleset (Rule plus any preceding Conditions). A group of closely related rulesets can share an explanation.
Step 5: Notes on error documents.
Reminder: ErrorDocument directives must not include a domain name, or else everything will turn into a 302 redirect. Start each one with a / representing the root.
Caution: Since each module is an island, any module that can issue a 403 must have its own error-document override. "Allow from all" covers mod_authzzzz. If you have RewriteRules that end in [F], make sure your 403 documents can bypass these rules.
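A sketch of both points, assuming the 403 page is /forbidden.html and using two placeholder robot names. The negated pattern is one way to let the error document through; a RewriteCond on the request would do the same job:

```apache
# Root-relative path, no domain name
ErrorDocument 403 /forbidden.html
# Not this, which would turn every 403 into a 302 redirect:
# ErrorDocument 403 http://www.example.com/forbidden.html

# Block unwanted robots everywhere EXCEPT the 403 page itself
RewriteCond %{HTTP_USER_AGENT} (80legs|zyborg) [NC]
RewriteRule !^forbidden\.html$ - [F]
```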
RewriteCond %{HTTP_USER_AGENT} ^.*(360spider|80legs|
<snip>
<snip>
<snip>
|zoom|zyborg).*$ [NC]
RewriteRule . - [F,L]
Simply did NOT work
RewriteCond %{HTTP_USER_AGENT} ^.*(blahblah
email|emailsiphon|emailwolf