RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^msnbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} ^ask\ jeeves [OR]
RewriteCond %{HTTP_USER_AGENT} ^askjeeves [OR]
RewriteCond %{HTTP_USER_AGENT} ^slurp@inktomi [OR]
RewriteCond %{HTTP_USER_AGENT} ^wisenutbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^alexa [OR]
RewriteRule ^.* - [F,L
My site then went haywire an hour later and crashed with hundreds of these errors:
public_html/.htaccess: RewriteRule: bad flag delimiters
Can anyone help me? This looks like a great resource so I am subscribing. Thanks!
Welcome to WebmasterWorld!
Your RewriteRule is missing the closing "]".
The last RewriteCond *must not* have an [OR] flag on it. The [OR] flag means "logical or", and you cannot 'or' a RewriteRule with a RewriteCond.
Slurp now belongs to Yahoo, which bought Inktomi.
You'd be far better off using robots.txt to ask these robots to leave your site (or parts of it) alone. All the 'bots on your list will respect robots.txt. Robots that do not respect robots.txt can then be blocked using mod_rewrite.
Jim
I did have the ] in the actual .htaccess; I must have missed it when I copied and pasted. I set my robots.txt to block everything about 10 days ago but it has not kicked in yet. Are you saying it should look like this? Is the setup below now a correct ban?
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^msnbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} ^ask\ jeeves [OR]
RewriteCond %{HTTP_USER_AGENT} ^askjeeves [OR]
RewriteCond %{HTTP_USER_AGENT} ^slurp@inktomi [OR]
RewriteCond %{HTTP_USER_AGENT} ^wisenutbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^alexa
RewriteRule ^.* - [F,L]
Legitimate web spiders will request robots.txt and if you want to block them you do so by listing the Name of their robot along with the rules you want it to follow.
In .htaccess you can restrict access according to the User Agent, which is usually different from the Name you would use in robots.txt.
Sample user agents:
Mozilla/2.0 (compatible; Ask Jeeves/Teoma)
Googlebot/2.1 (+http://www.google.com/bot.html)
Googlebot-Image/1.0
msnbot/1.0 (+http://search.msn.com/msnbot.htm)
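Your rules above will ONLY block msnbot, as it's the only sample user agent that matches one of your regular expressions ("^msnbot"). The patterns are anchored to the start of the string with "^" and are case-sensitive unless you add the [NC] flag, so "^googlebot" does not match "Googlebot/2.1", and the Ask Jeeves string starts with "Mozilla", not "ask jeeves".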
;)
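For what it's worth, here is a rough sketch of how the rules could be loosened so they match the sample user agents listed above. It drops the "^" anchor and adds [NC] so the match is case-insensitive and can occur anywhere in the string. The "slurp" and "wisenutbot" substrings are assumptions carried over from the original list, not checked against real log entries, and the robots.txt exception is my own addition so that the robots can still read their instructions. Test it on a non-production copy first.

```apache
RewriteEngine On
# Match any of these names anywhere in the User-Agent, ignoring case.
# "slurp" and "wisenutbot" are assumed substrings; verify them against your logs.
RewriteCond %{HTTP_USER_AGENT} (msnbot|googlebot|ask\ jeeves|slurp|wisenutbot) [NC]
# Assumed extra condition: keep robots.txt reachable for everyone.
RewriteCond %{REQUEST_URI} !^/robots\.txt$
# Both conditions must hold (they are ANDed); answer with 403 Forbidden.
RewriteRule .* - [F]
```

Folding the names into one alternation avoids the [OR] chain altogether, so there is no last-line [OR] to forget. The [F] flag ends rule processing for the request, so the extra L is not needed.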