Disabling rewrite for ALL URLs for a given path?

Setting a custom 404 page for a subdirectory that should not rewrite.

         

JAB Creations

8:55 am on Feb 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My site uses an Apache rewrite; however, I also host a friend's site.

So...

example.com/
example.com/friend/

My Apache rewrite makes an exception for their directory and sub-directories; however, when I go to a URL that returns a 404 error I see my main site's rewritten URL. I'd like to instead see a custom 404 page defined in the .htaccess file in their directory, like so...

example.com/friend/
example.com/friend/.htaccess
example.com/friend/404.html

...that is not the 404 or rewrite page from my main site.

example.com/.htaccess
RewriteEngine on
RewriteRule ^(friend1\/|friend2\/) - [L]
RewriteRule !\.(css|html|mp3)$ index.php


example.com/friend1/.htaccess
RewriteEngine off
ErrorDocument 404 /friend1/404.html


I tried using * in my main .htaccess, which seemed to have no effect. I also tried the following, which borked the entire domain...

example.com/.htaccess
<Directory "/friend1">
AllowOverride All
</Directory>


I'm testing with a missing mp3 file, if that helps. I just want all the missing files handled by their directory-specific 404 page, with my rewrite completely ignoring that directory regardless of HTTP status code/number. Thoughts please?

- John

jboy

2:10 pm on Feb 6, 2011 (gmt 0)

10+ Year Member



Just to say, I don't think a <Directory "/friend1"> block is OK in .htaccess files; as far as I know it is only valid in the main server configuration (httpd.conf or a virtual host). It's not allowed in .htaccess files, hence "borked the entire domain".
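
For reference, a minimal sketch of where a <Directory> block would normally live, assuming access to the main server configuration. The filesystem path is hypothetical and not from the thread; note that <Directory> takes a filesystem path rather than a URL path such as "/friend1".

```apache
# Goes in httpd.conf or inside a <VirtualHost> block, NOT in .htaccess.
# /var/www/example.com/friend1 is a hypothetical filesystem path.
<Directory "/var/www/example.com/friend1">
    # Allow the .htaccess file in this directory to override settings
    AllowOverride All
</Directory>
```

On shared hosting the main configuration is often not editable, in which case this option is not available.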

Just an alternative suggestion, not an answer to your question: it sounds like a subdomain for your friend's site would be possibly a better way to organise it, would avoid the kind of problem you're asking about, but you might have a reason not to do that.

jboy

2:52 pm on Feb 6, 2011 (gmt 0)

10+ Year Member



I'm not very good at rewrite stuff but I've just thought: because you're talking about files which don't exist, could a condition which tests whether the file exists work to filter out non-existent file requests from being affected by the rewrite? So something along the lines of:
RewriteCond %{REQUEST_FILENAME} !-f
before your rewrite rules ? Something along those lines maybe. Worth a go to see.
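
A minimal sketch of how that condition would sit in the root .htaccess from the first post, offered as an illustration rather than a tested fix. `!-f` is true when the request does not map to an existing regular file, while `-f` without the exclamation mark is true when it does. A RewriteCond only applies to the single RewriteRule that follows it.

```apache
RewriteEngine on
# Leave the friends' directories alone (rule from the first post)
RewriteRule ^(friend1/|friend2/) - [L]
# Rewrite to index.php only when the request does NOT map to an existing file;
# swap "!-f" for "-f" to invert the test
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule !\.(css|html|mp3)$ index.php
```

Whether `-f` or `!-f` is the right test depends on what the CMS expects to handle, so treat the choice here as an assumption.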

JAB Creations

11:08 am on Feb 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I could try using a subdomain though that would be like trying a different car because your first one ran out of gas. I'm also not sure if an Apache redirect would endlessly loop?

I'm still looking around online to see if others have figured this out, I can understand regex to a modest degree though it's not like you can echo or alert things with Apache.

- John

jdMorgan

11:09 pm on Feb 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't understand the question -- and therefore, the problem.

> when I go to an error 404 URL I see my main site's rewritten URL.

Define and post specific examples of both, please.

For example, I'm trying to determine if, when you request a non-existent /friend1 URL, you're seeing contents generated by your main index.php script (to which only non-friend1 URLs should be rewritten), or if you are actually seeing a client redirect taking place...

Also, be sure to delete your browser cache before testing any new server-side code.

Jim

jboy

11:23 pm on Feb 7, 2011 (gmt 0)

10+ Year Member



> I could try using a subdomain though that would be like trying a different car because your first one ran out of gas. I'm also not sure if an Apache redirect would endlessly loop?

can't resist a car analogy. no, it'd be like changing to an electric or hybrid or small car having bought a ridiculously oversized suv -- you'd do that because in the long term it's a much better idea :) just seems to me a subdomain is designed exactly for this kind of thing. i was told, and i don't know how true this is, that turning off apache settings for further down the directory hierarchy is very hard. so once set in example.com/.htaccess, hard/impossible? to shield for files in example.com/dir.

i don't fully understand the question either. to start with i thought it was just wanting a 404 page for one "site", and a different 404 page for the other "site". but i'm not so sure now.

JAB Creations

6:14 am on Feb 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Jim, a friend I gave a free hosting account moved a few MP3s of songs he made, and requests for them are generating a lot of noise in my rejection logs. I use the rejection logs to make sure I don't deny access to human visitors (with my human/bot detection) or to good bots (I love seeing what legitimate though very miscellaneous robots come for a visit), and they also help me understand the patterns of various spammers/scrapers/etc.

So I've been trying to set the 403/404 error pages for his subdirectory to be different from what I use. I also plan on making separate logs for 403 and 404 errors for further analysis, as the logs I've already created have helped me fix a lot of issues with the latest version of my site, though again the errors from his subdirectory would add a lot of noise to those logs.

I've tried to set overrides in both the root .htaccess file and the copy in his subdirectory, and used the (.*) regex after the friend1\/ path in the exception as well as various other regex patterns, both in the same RewriteRule list of exceptions and as a standalone RewriteRule, without any success. I figured there would be some sort of regex that would include any and all paths that start with that directory, and I've even come across a few threads here on WebmasterWorld, though nothing so far has worked. Apache only seems to match the directory alone, not anything in the directory.

The server is Apache 2.2 if that helps any. I spent some time looking through the Apache documentation for rewrites, though I couldn't find the pages for the flags (e.g. [L]) to check whether the flag is tampering with the rule or whether I'm missing a flag that I need.

Also yes, if I make a request that I know will result in an HTTP 404 response I'll see my site's 404 handler page. I'd like to make sure the 404 page set in his subdirectory's .htaccess file is used instead. Another thing I've tried is setting the 404 page for his directory in my root .htaccess file without success.

jboy, actually I've long hated cars though figured it would be a good analogy. Over the years I've tried to make as few easy work-arounds to achieve my goals because it's forced me to learn things as I haven't been able to depend on other people and third party software doesn't go the distance I need and want to go. There's also the issue of whether switching an established subdirectory to a subdomain is beneficial or not though that's a different topic (and actually an interesting one). It's ultimately about self-correcting my code while also making sure humans see content, good bots see the same content and everything else doesn't.

- John

g1smd

3:33 pm on Feb 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I still don't understand the question.

If user requests non-existent URL like "X", the user should see the 404 page found in the server filesystem at "Z".

If user requests non-existent URL like "P", the user should see the 404 page found in the server filesystem at "Q".

Things like "his folder" and "this file" might make sense to you, but as we don't know how your sites are configured, you need to clearly explain these details, with examples.

Fill in the blanks with real URL paths and real server paths as jd requested above.

JAB Creations

4:59 pm on Feb 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



**This post has two parts**
-- Part 1 --
example.com/friend1/*.*

Like DOS, dir *.* in the directory "friend1", that kind of everything and anything.

So no matter what the path is if it is or contains "friend1" I don't want Apache to apply the rewrite.

example.com/friend1/ <-- file exists with file extension match so not rewritten
example.com/friend1/1 <-- if 404, is rewritten, don't want that
example.com/friend1/1.gif <-- if 404, is rewritten, don't want that
example.com/friend1/a <-- if 404, is rewritten, don't want that
example.com/friend1/z.zip <-- if 404, is rewritten, don't want that
example.com/friend1/404-file.html <-- if 404, is rewritten, don't want that
example.com/friend1/wt24t42t42t <-- if 404, is rewritten, don't want that

I don't want any URL in that directory to be rewritten period.

Instead I want that directory to use its own designated HTTP 404 file.

In example.com/friend1/.htaccess I set...

ErrorDocument 404 /friend1/404.html


I want his 404 handler to kick in because then the errors from him moving/renaming/deleting files won't show up in my logs. Since Apache won't stop rewriting his URLs that aren't found, every single 404 error in his directory shows up in my log because the rewrite is catching those errors. I want to prevent that and have his own 404 handler deal with the errors. HTTP 200 files (e.g. his images, MP3s, etc.) aren't rewritten as they match the file extension rules, so there's no problem there unless he uploads a file with an extension that isn't given an exception. So the problem is files that aren't found on the file system: Apache rewrites the URL to my own site's index.php at the public root, which pulls in my CMS and logs the request as an error.

Maybe a linear example...

Time line...
1.) REQUEST: example.com/friend1/a-404-file.mp3.

2.) Server does not find a file, rewrites to example.com/index.php because we don't know how to prevent *.* (DOS style) from being rewritten for the entire directory.

3.) My CMS is called when example.com/index.php is accessed in rewrite.

4.) My CMS does not find the request path in the database.

5.) My CMS logs the request as a reject (MP3 files are only stored on my domain in the file system, since nothing would be found in the database it's automatically rejected).

6.) Media player user agent shows up in my reject log.

So my reject log is being flooded with his 404 errors.

Perhaps asking my question in a different context?

Can I instead of adding a *.* exception to the directory somehow simply override the Error 404.html file for his directory?

For example, I have a few cases where I use a subdirectory (e.g. blog and forums) to handle requests differently. Now that I think of it I could use that, however it would be a weird/unintended context...

A) example.com/.htaccess
B) example.com/friend1/.htaccess

-- Part 2 --

So I gained some insight after a couple of things fell into place... I still have the problem, though now I have something I can focus on.

So I used the same rewrite from the main .htaccess inside the subdirectory .htaccess, with some success after testing a 404 URL. However, when I tested it with the test 404 URL that I've been using, it was still using my rewrite, and after a moment I realized some of the characters don't quite look right.

example.com/friend1/some-404.mp3 <-- uses the desired 404

example.com/friend1/Friend1%25mp20-%25mp20Mother.mp3 <-- pulls my main site's rewrite

So I removed the % characters from that second URL and the 404 error page I set in the subdirectory .htaccess file to 404.html worked!

The problem for me is that, after looking at my reject logs in greater depth and at the origins of the IP addresses making these oddball requests, they turn out to be bots based out of China. *sigh*

So I know if the character % appears in the URL it throws off the exception and my rewrite rule kicks in. Is there something I can do to force the exception to work for his subdirectory when odd characters are still present? Thanks for all the replies.

- John

g1smd

5:20 pm on Feb 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



To stop certain URLs being rewritten to your CMS use a negative match RewriteCond excluding those requests. The content will then attempt to be pulled from the applicable folder, and if not there, the 404 page defined in .htaccess will be used.
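
A minimal sketch of what such a negative-match RewriteCond could look like in the root .htaccess, assuming the friend1 and friend2 directory names used earlier in the thread. The leading "!" negates the pattern, so the rule only fires when the requested path does not start with one of those directories.

```apache
RewriteEngine on
# Skip the CMS rewrite for anything under /friend1/ or /friend2/
RewriteCond %{REQUEST_URI} !^/(friend1|friend2)/
# Do not rewrite index.php to itself
RewriteCond %{REQUEST_URI} !^/index\.php$
# Everything else that is not css, html or mp3 goes to the CMS
RewriteRule !\.(css|html|mp3)$ /index.php [L]
```

Note that %{REQUEST_URI} includes the leading slash, whereas a RewriteRule pattern in .htaccess is matched against the path without it.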

JAB Creations

5:38 pm on Feb 8, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This doesn't work though am I on the right track?

RewriteCond %{REQUEST_URI} !%


- John

JAB Creations

2:09 pm on Feb 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I've been spending hours trying to decrypt Apache and I've learned a few things...

[L] flag indicates that the end of the rewriting process has been reached; are flags only used with RewriteRule? When using [L] on first RewriteRule how does it relate to the second RewriteRule? Does the RewriteCond command ignore those flags?

I can do some regex though I'm not entirely sure about how to match the % character?

RewriteCond %{REQUEST_URI} !^[\%]$

! = does not, so we wouldn't apply this if we found the following match.

^ = Starts with
[ = anything that matches the following?
\% = Do we have to escape the percentage sign in Apache? I haven't seen it listed as a literal so I imagine it would be [%]...how would we say match the string in any condition with [%]?
$ = anything ending.

Do we need to use ^ or $?

I'm not sure about the first % before {REQUEST_URI}.

It's the structure I'm having difficulty understanding as well as how each line relates. If I can grasp that I think I might be able to answer more questions than I ask.
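
A short annotated sketch of how these pieces fit together, offered as an illustration of the syntax rather than a tested fix. `%{NAME}` is how mod_rewrite reads a server variable, a pattern without `^` or `$` matches anywhere in the tested string, and `%` is not a regex metacharacter, so escaping it is optional. [L] is a RewriteRule flag; RewriteCond has its own flags, such as [NC] and [OR].

```apache
RewriteEngine on
# %{THE_REQUEST} is the raw request line as sent by the client,
# e.g. "GET /friend1/a%20b.mp3 HTTP/1.1", so encoded characters are still visible.
# %{REQUEST_URI} holds the path after URL-decoding, where "%20" has become a space.
# No ^ or $ here: the pattern matches a percent sign anywhere in the line.
RewriteCond %{THE_REQUEST} \%
# The condition above applies only to this one rule.
# "-" means "no substitution"; [L] stops processing the remaining rules.
RewriteRule ^ - [L]
```

With `^[\%]$` the pattern would only match a string consisting of exactly one percent sign, which is why anchoring both ends does not help here.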

- John

jdMorgan

7:05 pm on Feb 17, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




Options -MultiViews
#
RewriteEngine on
#
# Deny access if un-decoded client request line contains any "%" character
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^%]*\%
RewriteRule ^ - [F]
#
# If the requested URL-path starts with "friend1/" or "friend2/" skip all following rules
RewriteRule ^(friend1|friend2)/ - [L]
#
# Rewrite all requests to /index.php, except for css, html, mp3 filetypes
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule !\.(css|html|mp3)$ /index.php [L]

I'd also advise looking into whether you really want to rewrite image, robots.txt, and .xml file requests to your index.php file...

Jim
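
A possible sketch of that last suggestion, assuming the same root .htaccess. The extra extensions and the robots.txt condition are illustrative choices, not something specified in the thread, and would replace the final condition and rule above.

```apache
# Do not rewrite index.php to itself
RewriteCond %{REQUEST_URI} !^/index\.php$
# Serve robots.txt directly instead of passing it to the CMS
RewriteCond %{REQUEST_URI} !^/robots\.txt$
# Rewrite to /index.php, except for static files, images and XML
RewriteRule !\.(css|html|mp3|gif|jpe?g|png|ico|xml)$ /index.php [L]
```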