Pfui

msg:4419256 | 4:40 am on Feb 19, 2012 (gmt 0) |
Your code may have other problems but for starters, try changing all of your -- [NC] [OR] -- notations to: [NC,OR]
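For illustration, a minimal sketch of the combined-flag form. The original rules are not shown in this part of the thread, so the two user-agent strings and the catch-all rule below are assumptions, not the poster's actual code:

```apache
RewriteEngine On
# Flags for one condition go in a single bracket, comma-separated.
RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC,OR]
# The last condition in an OR chain carries no OR flag.
RewriteCond %{HTTP_USER_AGENT} AhrefsBot [NC]
# Answer any matching request with 403 Forbidden.
RewriteRule ^ - [F]
```

Leaving an OR flag on the final condition would tie it to whatever follows, so the last line of the chain normally carries [NC] alone.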
|
lucy24

msg:4419272 | 7:01 am on Feb 19, 2012 (gmt 0) |
... and when you've done that, get rid of all your opening anchors:
    SetEnvIfNoCase User-Agent "^Baiduspider"
    RewriteCond %{HTTP_USER_AGENT} ^Baiduspider*
You don't want to block UAs that begin with "Baiduspider". You want to block UAs that contain "Baiduspider". Right?
Oh, and what are all those asterisks for? That is: what are they intended to be for? What they really do is allow inputs in the form "Baiduspide" "Baiduspider" "Baiduspiderrrrr" et cetera... so long as it's the first item in the string. I kinda think that isn't what you had in mind.
The bad bots don't need to be inside a <Limit> condition. You want to lock them out all the time, don't you?
Next item: why are you saying everything twice?
    RewriteCond %{HTTP_USER_AGENT} ^Baiduspider*
    (et cetera)
... leading to [F] using mod_rewrite
    SetEnvIfNoCase User-Agent "^Baiduspider" bad_bot
    (et cetera)
... leading to Deny from using mod_setenvif in combination with mod_authz_host (or mod_access depending on how old your installation is).
You only need one or the other. My personal preference: use the Environment version if there's a nice short distinctive piece of the UA that works all the time without any further conditions or exceptions, like "Clipish" or "HTTrack". No ifs, ands or buts: they're out. If it's complicated-- "Block this UA if the string doesn't also contain this other word, or if it isn't from this IP"-- go to mod_rewrite.
Blocking by IP (Deny from 1.202.0.0/15) is cleanest and simplest of all. Robots can change their clothes (UA) or lie about who sent them (Referer), but the IP can't be faked.
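A rough sketch of the two approaches side by side, one per kind of bot rather than both for the same bot. "HTTrack" comes from the post; "ExampleBot" and the 192.0.2.x address range are hypothetical placeholders chosen only to show a conditional rule:

```apache
# Approach 1: mod_setenvif plus Deny, for a short unconditional match.
# No ^ anchor, so the name may appear anywhere in the User-Agent.
SetEnvIfNoCase User-Agent HTTrack bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot

# Approach 2: mod_rewrite, for a match with an exception.
# Hypothetical rule: refuse anything calling itself "ExampleBot"
# unless it comes from the 192.0.2.x documentation range.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ExampleBot [NC]
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\.
RewriteRule ^ - [F]
```

Consecutive RewriteCond lines without an OR flag are ANDed, which is what makes the "this UA, but not from this IP" case possible.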
|
wilderness

msg:4419287 | 10:08 am on Feb 19, 2012 (gmt 0) |
You should change the "order allow,deny" to "order deny,allow" at least if you ever intend to use custom error pages.
|
arms

msg:4419399 | 9:46 pm on Feb 19, 2012 (gmt 0) |
Thanks for the replies. I have changed to this:
    SetEnvIfNoCase User-Agent "Baiduspider" bad_bot
    SetEnvIfNoCase User-Agent "AhrefsBot" bad_bot
    SetEnvIfNoCase User-Agent "Jyxobot" bad_bot
    SetEnvIfNoCase User-Agent "discobot" bad_bot
    SetEnvIfNoCase User-Agent "Plukkie" bad_bot
    SetEnvIfNoCase User-Agent "Ezooms" bad_bot
    SetEnvIfNoCase User-Agent "Exabot" bad_bot
    <Files *>
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
    </Files>
    order allow,deny
    deny from 94.253.
    deny from 109.60.
    deny from 72.14.164.
    deny from 66.219.58.
    deny from 180.76.5.
    deny from 123.125.71.
    deny from 200.98.132.
    deny from 62.212.69.
    deny from 69.58.178.
    deny from 195.7.10.56
    deny from 90.197.49.47
    deny from 176.9.51.
    deny from 130.206.32.253
    deny from 75.125.135.226
    deny from 109.149.199.2
    deny from 212.113.35.162
    deny from 80.40.134.103
    deny from 80.40.134.104
    deny from 80.40.134.120
    deny from 62.24.181.134
    deny from 62.24.181.135
    deny from 62.24.222.131
    deny from 62.24.222.132
    deny from 62.24.252.133
    allow from all
but the little sods are still coming through. I took my lead from here: [webmasterworld.com...] but just can't seem to get it to work.
|
g1smd

msg:4419403 | 10:07 pm on Feb 19, 2012 (gmt 0) |
Only one deny takes effect, the last one in the list. Check the syntax, especially the "setting of an environmental variable" method.
|
arms

msg:4419406 | 11:13 pm on Feb 19, 2012 (gmt 0) |
OK, so I changed to:
    SetEnvIfNoCase User-Agent "Baiduspider" bad_bot
    SetEnvIfNoCase User-Agent "AhrefsBot" bad_bot
    SetEnvIfNoCase User-Agent "Jyxobot" bad_bot
    SetEnvIfNoCase User-Agent "discobot" bad_bot
    SetEnvIfNoCase User-Agent "Plukkie" bad_bot
    SetEnvIfNoCase User-Agent "Ezooms" bad_bot
    SetEnvIfNoCase User-Agent "Exabot" bad_bot
    <Files *>
    Order Allow,Deny
    Deny from env=bad_bot
    deny from 94.253.
    deny from 109.60.
    deny from 72.14.164.
    deny from 66.219.58.
    deny from 180.76.5.
    deny from 123.125.71.
    deny from 200.98.132.
    deny from 62.212.69.
    deny from 69.58.178.
    deny from 195.7.10.56
    deny from 90.197.49.47
    deny from 176.9.51.
    deny from 130.206.32.253
    deny from 75.125.135.226
    deny from 109.149.199.2
    deny from 212.113.35.162
    deny from 80.40.134.103
    deny from 80.40.134.104
    deny from 80.40.134.120
    deny from 62.24.181.134
    deny from 62.24.181.135
    deny from 62.24.222.131
    deny from 62.24.222.132
    deny from 62.24.252.133
    allow from all
    </Files>
No change. "Check the syntax, especially the "setting of an environmental variable" method." means nothing to me; I have exactly the same syntax as every example I have seen elsewhere, including on this board.
|
Pfui

msg:4419421 | 2:07 am on Feb 20, 2012 (gmt 0) |
Did you correct your [NC,OR] notations? Or did you opt to stop using mod_rewrite?
|
arms

msg:4419424 | 2:20 am on Feb 20, 2012 (gmt 0) |
Got rid of them. My previous post is my total .htaccess.
|
lucy24

msg:4419427 | 2:40 am on Feb 20, 2012 (gmt 0) |
| Only one deny takes effect, the last one in the list. |
OK, what glaringly obvious thing am I overlooking? Incidentally, I put my environmental deny in the same place as the IP denys:
    Deny from env=keep_out
    Deny from 31.214.128.0/17
    Deny from 38.100
    ... et cetera, et cetera.
Does <Files *> mean anything? I would have thought it's the same as not using an envelope at all. Oh, and I just say BrowserMatch. Saves several bytes ;)
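As a small sketch of the BrowserMatch shorthand mentioned above: BrowserMatch is mod_setenvif's special case of SetEnvIf that always tests the User-Agent header. The bot name here is borrowed from earlier in the thread purely as an example:

```apache
# These two lines set the same variable under the same condition.
SetEnvIfNoCase User-Agent Baiduspider keep_out
BrowserMatchNoCase Baiduspider keep_out
```

Plain BrowserMatch, like plain SetEnvIf, is case-sensitive, so BrowserMatchNoCase is the closer stand-in for SetEnvIfNoCase.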
|
g1smd

msg:4419667 | 7:17 pm on Feb 20, 2012 (gmt 0) |
There should only be one deny statement. If you have multiple deny statements, all previous deny statements are ignored and only what is in the last deny statement will ever apply.
|
lucy24

msg:4419688 | 8:30 pm on Feb 20, 2012 (gmt 0) |
| If you have multiple deny statements, all previous deny statements are ignored and only what is in the last deny statement will ever apply. |
Still missing something, because that is precisely how my own htaccess is set up, and it definitely Denies from everyone on the list:
    Order Allow,Deny
    Allow from all
    Deny from env=keep_out
    Deny from 31.214.128.0/17
    ....
and so on down to
    Deny from 223.198.0.0/15
which is definitely not the only IP to get locked out. I'd have noticed.
| Allow,Deny: First, all Allow directives are evaluated; at least one must match, or the request is rejected. Next, all Deny directives are evaluated. If any matches, the request is rejected. Last, any requests which do not match an Allow or a Deny directive are denied by default. |
Emphasis mine. Are we talking about different things?
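To make the quoted evaluation order concrete, here is a cut-down sketch using one user agent and two addresses taken from the original poster's list. It is an illustration of the documented behaviour, not a tested fix for that poster's site:

```apache
SetEnvIfNoCase User-Agent Baiduspider bad_bot
Order Allow,Deny
Allow from all
# Every Deny line below is checked; a match on any one of them
# rejects the request even though "Allow from all" also matched.
Deny from env=bad_bot
Deny from 94.253.
Deny from 195.7.10.56
```

With Order Allow,Deny the result depends on the Order keyword rather than on where the Allow and Deny lines sit in the file, so "Allow from all" can come first or last.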
|
arms

msg:4419717 | 10:18 pm on Feb 20, 2012 (gmt 0) |
This is where I am now, still with no success:
    SetEnvIfNoCase User-Agent "Baiduspider" bad_bot
    SetEnvIfNoCase User-Agent "AhrefsBot" bad_bot
    SetEnvIfNoCase User-Agent "Jyxobot" bad_bot
    SetEnvIfNoCase User-Agent "discobot" bad_bot
    SetEnvIfNoCase User-Agent "Plukkie" bad_bot
    SetEnvIfNoCase User-Agent "Ezooms" bad_bot
    SetEnvIfNoCase User-Agent "Exabot" bad_bot
    Order allow,deny
    deny from env=bad_bot
    deny from 94.253.
    deny from 109.60.
    deny from 72.14.164.
    deny from 66.219.58.
    deny from 180.76.5.
    deny from 123.125.71.
    deny from 200.98.132.
    deny from 62.212.69.
    deny from 69.58.178.
    deny from 195.7.10.56
    deny from 90.197.49.47
    deny from 176.9.51.
    deny from 130.206.32.253
    deny from 75.125.135.226
    deny from 109.149.199.2
    deny from 212.113.35.162
    deny from 80.40.134.103
    deny from 80.40.134.104
    deny from 80.40.134.120
    deny from 62.24.181.134
    deny from 62.24.181.135
    deny from 62.24.222.131
    deny from 62.24.222.132
    deny from 62.24.252.133
    allow from all
|
wilderness

msg:4419856 | 4:20 am on Feb 21, 2012 (gmt 0) |
First and foremost, this thread belongs in the SSID Forum </rant>
| This is where I am now, still with no success |
What exactly is NOT working? What isn't working as you intended? Have you checked your error logs?
1) Before you begin with htaccess perhaps you should improve both your copying and pasting skills, and interpretation skills?
2) Your environment variable does not even match the thread you quoted with a link.
a) your code:
    Order allow,deny
    deny from env=bad_bot
    deny from 94.253.
    allow from all
b) your link code:
    SetEnvIfNoCase User-Agent "^Zyborg" bad_bot
    <Limit GET POST HEAD>
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot
    </Limit>
The Apache mod_access page [httpd.apache.org] provides the following example partway down the page, in the Deny section:
    Order Deny,Allow
    Deny from all
    Allow from apache.org
The code you used does not match either your cited link or the Apache example.
3) You are not consistent with your use of upper and lower case. Use one method or the other, but don't mix both. EX:
    Deny from
    deny from
Some new servers may prove quite picky.
4) Start with a small file and get it functioning, and then go back and add more UAs and IPs. EX (although, as I said previously, you should be using Order Deny,Allow):
    <Limit GET POST HEAD>
    SetEnvIfNoCase User-Agent Baiduspider bad_bot
    Order Allow,Deny
    Deny from 94.253.
    Allow from all
    Deny from env=bad_bot
    </Limit>
(For the Deny from line, use 94.253. or an IP that you can verify easily.)
There have been some recent threads in this forum and the SSID which express the issues created in raw logs when using the "Files" container. I also expressed (recently) similar issues in the SSID forum when using quotes with SetEnvIf, thus I suggest you stop using the quotes on every line and use them quite sparingly, and only once your skills have progressed enough to benefit from "exactly as".
|
lucy24

msg:4419873 | 4:54 am on Feb 21, 2012 (gmt 0) |
Edit: Oops. Not sure why my tiny little post took over half an hour to put together, but yup, I'm overlapping. Necessary backtracking: Is it your own server or shared? If shared, are you allowed to have fully functional htaccess files? ("Allowed" here means "it will work".) What happens if you add your own IP to the "deny from" list? Do you just cruise on in as if nothing had happened? Any change if you put it into Title Case ("Allow,Deny")? Most Apache installations don't care, but occasionally one does. I was apprehensive about the trailing . in some of the partial IPs, but checked it and it doesn't seem to make any difference.
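The "deny your own IP" check suggested above could look something like this. 203.0.113.45 is a placeholder from a documentation-only address range and stands in for whatever public address you actually browse from:

```apache
Order Allow,Deny
Allow from all
# Placeholder address: substitute the public IP you browse from.
Deny from 203.0.113.45
```

If the site still loads normally after this, the file is probably not being honoured at all, which on a shared host may mean overrides are switched off for the account. A 403 means the directives work and the problem lies elsewhere in the rules. Note that the comment sits on its own line; Apache does not accept a comment on the same line as a directive.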
|
wilderness

msg:4419968 | 2:05 pm on Feb 21, 2012 (gmt 0) |
| I was apprehensive about the trailing . in some of the partial IPs, but checked it and it doesn't seem to make any difference. |
There's some very old discussion on this, however I've no clue what to search that would result in the threads. I use the trailing DOT in everything for "deny from" IPs. The early logic was similar to a trailing slash in robots.txt (include all). I seem to recall some early access errors (2000 or 2001) when omitting the DOT. Apache and the change from POSIX to PCRE has an effect, I'm sure. In any event the logic is that it works either way in present day, or so Jim stressed many times.
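For comparison, a sketch of the ways a single range can be written in a Deny line, using the 94.253 range from the original poster's list. The trailing-dot form is included on the strength of the observations in this thread; the other three are the partial-address, netmask and CIDR forms described in the Apache access-control documentation:

```apache
# Partial address, with and without the trailing dot.
Deny from 94.253.
Deny from 94.253
# The same range as network/netmask and as CIDR.
Deny from 94.253.0.0/255.255.0.0
Deny from 94.253.0.0/16
```

In practice only one of these lines would be used. The CIDR form also covers ranges that do not end on a byte boundary, such as the /15 and /17 blocks quoted earlier.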
|
g1smd

msg:4420122 | 7:48 pm on Feb 21, 2012 (gmt 0) |
| Are we talking about different things? |
Uh, yeah. I misremembered some detail, and didn't take time out to check the facts.
|