Directory Recursion: How to rewrite dot-dot & missing slash patterns?


Pfui

11:23 pm on May 24, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Every month I discover more and more folks using aggressive, quasi directory- and/or code-crawling programs, robots, extensions, whatevers. Very few have specific IDs so I'd like to stop them all from the get-go rather than play catch-up after the fact. Thing is, rewriting these hit patterns has me stumped:

1.) The dot-dot pattern in sub-directories --

Should NOT end with or include /..

(HTTP_REFERERs)

"http://www.example.com/dir1/dirA/.."
"http://www.example.com/dir2/dirA/.."
"http://www.example.com/dir5/../dir2"
"http://www.example.com/dir6/../dir3/dirA/../../dir5"

(GET)

/dir1/../dir2/file.gif

-- AND --

2.) The no-trailing-slash pattern in sub-directories --

Should ALWAYS end with /

(HTTP_REFERERs)

"http://www.example.com/dir3/dirA"
"http://www.example.com/dir4"

The latter isn't directory browsing per se because that's disabled in httpd.conf. Also, the programs are mining directories that all have default welcome.html or index.html pages but they're somehow getting around those and crawling the intra-linked contents.

Normally, if someone hits, say --

http://www.example.com/dir1/dir2

-- they're automatically 301'd to --

http://www.example.com/dir1/dir2/

-- at which point they 'see' each dir's specific default .html. (Confused enough yet? Sorry!)

So anyway, this regex-challenged person figured out the grep patterns --

1.) "\.\."

2.) dirnamehere/\"

-- but I'm unsure of how to mod_rewrite those patterns (...on a live site which I already inadvertently suspended x2 hours earlier this week with a misplaced [OR]. Oy!).

Is this A-OK for (1.)? Will this forbid ".." with OR without a trailing slash?

RewriteCond %{REQUEST_FILENAME} \.\.
RewriteRule ^.*$ - [F,L]

Alas, I have no idea how to tackle (2.) and REQUIRE a trailing slash for ALL dirs because it happens automatically -- or should.

Help, please? TIA!

jdMorgan

3:37 am on May 25, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This should do:

RewriteRule \.\. - [F]

Or, alternatively:

RewriteRule [.]{2,} - [F]
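
For anyone who wants to sanity-check these two patterns away from the live server, here is a minimal sketch that runs them against the sample paths from the first post. It uses Python's `re` module as a stand-in for Apache's regex engine (an assumption; the two should agree for patterns this simple), and the helper name `is_forbidden` is made up for illustration.

```python
import re

# The two patterns suggested above. Both are unanchored, so they match
# two consecutive dots anywhere in the string.
PATTERNS = [r"\.\.", r"[.]{2,}"]

# Sample URL-paths taken from the first post, plus one ordinary file request.
SAMPLES = [
    "/dir1/dirA/..",
    "/dir5/../dir2",
    "/dir6/../dir3/dirA/../../dir5",
    "/dir1/../dir2/file.gif",
    "/dir1/file.gif",
]

def is_forbidden(pattern, path):
    """Return True if an unanchored search for pattern finds a match in path."""
    return re.search(pattern, path) is not None

for path in SAMPLES:
    # One True/False per pattern, in the same order as PATTERNS.
    print(path, [is_forbidden(p, path) for p in PATTERNS])
```

Because neither pattern is anchored, a ".." is caught with or without a trailing slash. The same goes for any file name that happens to contain two dots in a row, so that is worth keeping in mind.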

For trailing slashes, I'll make the assumption that you have no directories with "." in the URL-path -- only files will have a dot in them, in other words:

# If no period in URL-path (must be a directory)
RewriteCond %{REQUEST_URI} ^[^.]+$
# then forbid anything with no slash at the end
RewriteRule [^/]$ - [F]

The above approach, although not as fail-safe as checking for 'directory exists,' is a lot faster.

And practically speaking, you should be able to condense that to:


# Forbid anything containing no periods AND having no slash at the end
RewriteRule ^[^.]+[^/]$ - [F]
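
The condensed pattern can be tried out the same way before it goes anywhere near a live site. This is only a sketch: Python's `re` stands in for Apache's regex engine, the paths are written without a leading slash (as a RewriteRule in .htaccess would see them), and `needs_slash` and the `README` entry are hypothetical names used for illustration.

```python
import re

# Condensed pattern from the post above: no periods, and no slash at the end.
NO_SLASH = re.compile(r"^[^.]+[^/]$")

def needs_slash(path):
    """Return True if the pattern would flag path as a slash-less directory."""
    return NO_SLASH.search(path) is not None

# Path -> whether the rule is expected to forbid it.
checks = {
    "dir3/dirA": True,       # directory without trailing slash: flagged
    "dir4": True,            # same, at the top level
    "dir3/dirA/": False,     # trailing slash present: allowed
    "dir1/file.gif": False,  # contains a period: treated as a file
    "README": True,          # extensionless file: flagged too
}

for path, expected in checks.items():
    print(path, needs_slash(path), expected)
```

The last entry is the stated assumption at work: anything without a period is treated as a directory, so an extensionless file would be forbidden as well.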

Jim

Pfui

1:08 am on Jun 1, 2006 (gmt 0)




Thanks for your help, Jim! I only wish I could say I got something to work. :(

(Sorry for the debugging-worthless statement. I did start out keeping track of what I was testing but after what seemed like 20 permutations, I yielded to fatigue and frustration.)

I finally ended up using this to stop double-slashes (have been trying to mod_rewrite it for too long):

SetEnvIfNoCase Request_URI "//" no-way

Alas, I've yet to hit on a SetEnv solution for no-slashes (can you even do that with SetEnv?), ditto the dot-dots:

## DID NOT WORK: SetEnvIfNoCase Request_URI "\.\." no-way
## DID NOT WORK: SetEnvIfNoCase Request_URI "/\.\./" no-way

FWIW, I did try these --

RewriteRule \.\. - [F]
RewriteRule /\.\. - [F]
RewriteRule /\.\./ - [F]

-- and this --

RewriteCond %{REQUEST_URI} ^/\.\./$
RewriteRule ^.*$ - [F,L]

-- and this --

RewriteCond %{HTTP_REFERER} ^http://(www\.)?example.com/.*$ [NC]
RewriteRule \.\. - [F]
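
As a regex-only check of the attempts listed above, the sketch below runs each pattern against a few paths written the way `%{REQUEST_URI}` would present them, with a leading slash. Python's `re` is a stand-in for Apache's regex engine, and the labels and the `matches` helper are made up for illustration. It says nothing about what the server actually receives, only about what each pattern would match if the dots arrived intact.

```python
import re

# Patterns from the attempts above, labelled for display.
ATTEMPTS = {
    "unanchored": r"\.\.",
    "leading slash": r"/\.\.",
    "both slashes": r"/\.\./",
    "fully anchored": r"^/\.\./$",
}

SAMPLES = ["/dir1/../dir2/file.gif", "/dir1/dirA/..", "/../"]

def matches(pattern, path):
    """Return True if pattern is found anywhere in path."""
    return re.search(pattern, path) is not None

for label, pattern in ATTEMPTS.items():
    # One True/False per sample, in the same order as SAMPLES.
    print(label, [matches(pattern, s) for s in SAMPLES])
```

The first two patterns match all three samples. The one with both slashes misses a path that ends in "/..", and the fully anchored condition only matches a request for exactly "/../".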

(Stop laughing. I was getting desperate, dazed 'n' confused.)

And I emptied my cache and rebooted and uploaded and tested and lathered and rinsed and repeated umpteen times, too.

Shoot.

When it comes to access control configs, I sure hope a light bulb will shine brightly above my head one of these days. If time spent = knowledge acquired, I'd be the Albert Einstein of mod_rewrite by now!

All I can say is thank goodness you already are. :)

jdMorgan

2:03 am on Jun 1, 2006 (gmt 0)




Actually, it looks like most of your several attempts should have worked. So the question becomes, "How did you test?"

If you used a browser, be aware that the browser will resolve /abc/../def to /def before sending the request to your server. So you will likely get a 404 or some other unexpected result, rather than the desired 403. You might want to try a few on-line server header checkers and UA spoofers -- you'll likely be able to find one that will actually send the 'raw' "../" and "//" to your server, instead of resolving or 'correcting' them before transmission.
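
If no on-line checker cooperates, a few lines of script can send the request line untouched. The following is a minimal sketch, assuming plain HTTP on port 80 and a server that closes the connection after replying. The function names are made up for illustration, and www.example.com is just the placeholder host used in this thread.

```python
import socket

def build_raw_request(host, path):
    """Build an HTTP/1.1 GET request whose path is sent exactly as given."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")

def status_line(response):
    """Return the first line of a raw HTTP response as text."""
    return response.split(b"\r\n", 1)[0].decode("ascii", errors="replace")

def fetch_status(host, path, port=80, timeout=10):
    """Send the raw request and return the status line of the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_raw_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return status_line(b"".join(chunks))

# Example call (not run here, since it needs network access):
# print(fetch_status("www.example.com", "/dir1/../dir2/file.gif"))
```

Because the path is written straight into the request line, nothing on the client side gets a chance to collapse the "../" or "//" first. An HTTPS site would need the socket wrapped with the `ssl` module, which this sketch leaves out.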

The only 'flaw' (and it's minor) that stood out in your code was that it's rather a waste of CPU to use SetEnvIfNoCase with a pattern that contains no letters... :) So in this regard, and to use a quote from the famous scientist himself, "Make everything as simple as possible, but no simpler," and use SetEnvIf in those cases instead.

Jim