Forum Moderators: phranque

Using Apache <If> Directive to Deny Access to any WP Feed URI

Laboring on moving everything from htaccess to pre-virtual include

         

Webwork

10:31 pm on Mar 22, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



First, an honest hat tip to those who speak the language of Linux, Apache, Perl, Regex, etc. I thought "the law" was a Gordian knot. #%$! This stuff, with its exacting syntax requirements, multiple dependencies, endless variations on the "means to an end" (Do it this way. No, do it THIS way!) reminds me of the U.S. Bankruptcy Code, where every section of the Code is connected to and conditioned by 7 other sections. Argh! IF I wasn't using this effort to learn a bit of coding, to atone for my $hitty attitude whilst studying high school algebra, quadratic equations, etc, I would just . . walk away. :-/

Okay, to prove I'm not being a lazy a$$, just a dumba$$, I will admit that I've just put about 5+ hours into this simple (ha-ha) ~experiment, after putting countless more hours into reading, reading, reading Apache.org, tutorials, etc. (I'm about 100+ hours in by now. I know, more to go.)

Motivation: I have all manner of bots probing for "/this/feed" or "that/andthat/andthat/feed/". Sometimes they hit. Other times they miss. I'd like to kick all /feed requests to the curb, without using a plugin. Just kill them before they get to WP.

I played with inserting this (code) below in a <Directory /home/*/public_html/> wrapper but, no matter what I tried, I kept getting an error message suggesting that there was a clash between the <Directory . . .> <If . . . > Require all denied AND a next-in-sequence <FilesMatch> directive that nicely has been blocking access (Require all denied) to certain files on its own. It was as if I didn't "close" the prior rule/arguments but I did, as far as I could see. (Checked for ", }, ), /> you name it.)

When I removed the <Directory> context Apache didn't choke on the <If> directive (below) as a stand alone directive . . BUT. . I had a nagging feeling that said "acceptance" (no error message) was because my <If> directive was a fekless (<- no c's allowed) directive, as in "a directive that won't do what I thought . . and is probably doing something I'm not expecting".

SO, I just browsed to example.com/feed and example.com/category/feed/ and . . . dangit . . the ~feeds loaded.

If you will, what am I missing . . in order to effectively deny, by rule, all access to any ~/feed | /feed/ via a http_request_uri ?

Am I a complete idiot in thinking that there IS such a "directory"? That it's actually a "file"? ("Maybe that's it", he sheepishly thought to himself, after publicly confirming his utter lack of understanding.) Webwork quickly adds "feed" to <FilesMatch> directive to test. No effect. ~feed loads. Argh.

<If "%{REQUEST_URI} == '/feed/?'">
Require all denied
</If>


Note: I'm running Apache 2.4
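
A note on the <If> block above, offered as a sketch rather than a tested fix. In Apache 2.4 expressions, == is a plain string comparison, so '/feed/?' is compared character for character (question mark included), and a request for /feed or /category/feed/ never equals it. The regex operator is =~. The pattern below is an assumption about which URLs should be blocked (any path with a whole segment named feed); it is not given anywhere in the thread.

```apache
# Sketch only: =~ performs a regex match, where == compares literal strings.
# The pattern (an assumption) matches /feed, /feed/, /category/feed/ and /feed/atom/,
# but not /news-feed/ or /feeder/.
<If "%{REQUEST_URI} =~ m#(^|/)feed(/|$)#">
    Require all denied
</If>
```

The m#...# form is used so that the slashes inside the pattern don't need escaping.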

P.S. When using an "access provider" such as "Require all denied" - is not the access provider limited in scope/effect "to the condition" to which it is related. As in <If> Argument -> Consequence </If> -> NO effect on any other request(s)? Limited to that little chunk of requests, i.e., requesting /feed/?

P.S. To Lucy24: I've given up my <Location> ways. I get the (potential) problems.

Thanks.

lucy24

12:42 am on Mar 23, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Honestly, in this case I'd proceed directly to mod_rewrite. And that's not something I say every day ;)

RewriteRule \bfeed/ - [F]
and that's all. Put it wherever you've got your existing RewriteRules that will be "seen" by the whole site. Since you actually do have WP,* the obvious choice is before the WP envelope. Access control, [G], [R], [L] in that order, with some file-specific exceptions. (The [L] basically means WP if you've got it.)
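
As a rough illustration of that placement, assuming a stock WordPress rewrite block in a per-directory context (the thread doesn't show Webwork's actual rules), the feed rule would sit above the WP envelope something like this:

```apache
RewriteEngine On

# Access control first: lucy24's rule, answering 403 for any path that
# contains "feed/" starting at a word boundary.
RewriteRule \bfeed/ - [F]

# BEGIN WordPress (stock block, assumed here for illustration)
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
# END WordPress
```

Since [F] implies [L], a matching request stops there and never reaches the final rule that hands everything to index.php. As written, the pattern requires the trailing slash, so a bare /feed would still fall through to WP.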

The \b means "word break", since it looks as if the /feed/ element can come at various points in the path. Using an anchor is more straightforward than the whole package of ([^/]+/)* which you'd only need if you're capturing. That's assuming you don't have perfectly legitimate URLs called, say, news-feed/more-stuff-here or farm.feed/rest-of-path. (A hyphen counts as punctuation for RegEx purposes, though a lowline doesn't.)
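
If legitimate URLs like news-feed/ do exist, one possible alternative (an assumption, not part of lucy24's rule) is to anchor on path separators instead of \b, so that only a whole path segment named feed matches:

```apache
# Sketch: match "feed" only as a complete path segment.
#   blocked:     feed, feed/, category/feed/, feed/atom/
#   not blocked: news-feed/more-stuff-here, farm.feed/rest-of-path
RewriteRule (^|/)feed(/|$) - [F]
```

This form also catches the request without a trailing slash, and it behaves the same whether or not the matched path starts with a leading slash.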


* I don't, but have found it useful to add rules saying
RewriteRule \bwp\b - [R=404]
RewriteRule ^admin - [R=404]
where the request gets the identical 404 that it would get if the rule didn't exist-- but this way the server doesn't have to go look, and the robot isn't getting an explicit "Don't bother, we're onto you" message. Or at least it won't for another couple of weeks, until I roll out my new improved access-control system. Har, har.

Webwork

3:31 am on Mar 23, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



I'm now officially (temporarily) mind boggled, for it just dawned on me that there isn't a feed "directory", nor a feed "file", anywhere in the WP filesystem.

So how on earth does one limit or deny access to non-existent directories and/or files?

I hope my question and related bafflement . . err . . makes sense?

lucy24

4:22 am on Mar 23, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So how on earth does one limit or deny access to non-existent directories and/or files?

Well, you can't do it with <Files> or <Directory>. (If you want to split hairs, <Files> does work on nonexistent files, but only if their URLpath ends up in a real directory*.) But you can do absolutely anything in mod_rewrite. And you can do a fair amount in mod_setenvif by looking at Request_URI -- although not in the specific case of WP or similar CMS, since you need something that has already issued the 403 before it reaches the CMS stuff. mod_setenvif typically executes before mod_rewrite -- but it can't issue 403s on its own behalf. For that you have to wait for mod_authwhatever, which comes later. (Of course, you could also use mod_rewrite to issue a 403 based on environment variables, but if the lockout is based on the requested URI, you shouldn't need to.)

* I don't know if this is discussed anywhere in The Docs. I discovered it empirically, probably while trying to do something else. And I don't guarantee that it will work on anyone's installation except mine.
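
The environment-variable route lucy24 mentions could look roughly like the sketch below. The variable name block_feed and the pattern are made up for illustration, and both directives are assumed to live in the same context. As she notes, a lockout based purely on the requested URI doesn't really need the extra step.

```apache
# mod_setenvif flags the request; with no explicit value the variable is set to "1".
SetEnvIf Request_URI "(^|/)feed(/|$)" block_feed

# mod_rewrite then issues the 403 for any flagged request.
RewriteEngine On
RewriteCond %{ENV:block_feed} =1
RewriteRule ^ - [F]
```

The split only starts to pay off when several unrelated conditions (user agent, referer, URI) should all feed one flag and one blocking rule.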