We have one million files in a directory.
The URL domain.com/ is slower to resolve (1 minute) than domain.com/index.html (20 seconds).
I guess this is because Apache has to parse the entire contents of the directory.
Could someone please suggest ways to improve performance so that domain.com/ is served as fast as domain.com/index.html?
We're running Apache v2 on a Linux VPS/VDS.
TIA
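# mod_rewrite in a per-directory context needs FollowSymLinks enabled.
Options +FollowSymLinks
RewriteEngine On
# Internally map /example.html to /e/x/example.html: the first two characters
# of the name (neither a slash nor a dot) become two directory levels.
RewriteRule ^([^/.])([^/.])([^/]*)$ /$1/$2/$1$2$3 [L]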
This will confuse Apache's default error handling a bit, but it will still manage. You could add an extra RewriteCond above the RewriteRule to get around that, but I'd only suggest doing so if the website does not get many hits. If you want to keep some files out of this directory nesting, you will also need one or more RewriteConds above the RewriteRule. See the mod_rewrite documentation [httpd.apache.org] for more information.
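For anyone following along, here is a minimal sketch of what those extra RewriteConds might look like. It assumes the rules sit in the .htaccess file of the document root, that the "confusion" is about requests for missing files being rewritten anyway, and that index.html, robots.txt and favicon.ico are the files to keep out of the nesting (those three names are only examples).

```apache
Options +FollowSymLinks
RewriteEngine On

# Keep selected files out of the nesting (example names, adjust as needed).
RewriteCond %{REQUEST_URI} !^/(index\.html|robots\.txt|favicon\.ico)$
# Only rewrite when the nested target exists as a regular file, so requests
# for missing files are left alone and get the usual 404 for the original URL.
# This adds one filesystem check per matching request.
RewriteCond %{DOCUMENT_ROOT}/$1/$2/$1$2$3 -f
RewriteRule ^([^/.])([^/.])([^/]*)$ /$1/$2/$1$2$3 [L]
```

The file-exists check is the part that costs something on every hit, which is presumably why it is only worth adding on a low-traffic site.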
By the way, if you are running this on a dedicated server, then consider reviewing your Apache configuration: trim down DirectoryIndex, disable .htaccess, and do some further fine-tuning, all of which will make your website run faster in general.
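As a rough illustration of that tuning, the fragment below could go in httpd.conf or a virtual host file. The path /var/www/example is a made-up document root, and listing only index.html assumes that is the one index file the site actually uses.

```apache
# /var/www/example is a hypothetical DocumentRoot; substitute your own.
<Directory "/var/www/example">
    # Stop Apache from looking for .htaccess files on every request.
    AllowOverride None
    # List only the index file that actually exists.
    DirectoryIndex index.html
    Options +FollowSymLinks

    # With .htaccess disabled, the rewrite rule has to live here instead.
    RewriteEngine On
    RewriteRule ^([^/.])([^/.])([^/]*)$ /$1/$2/$1$2$3 [L]
</Directory>
```

Changes to the server configuration only take effect after Apache is restarted or reloaded, which is one reason this is easier on a server you control.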
Also, the given example shows 'splitting' the URLs into two levels of file directories. With a million files, you might want to consider splitting them further -- into three or four levels. At three levels, assuming an even distribution of filenames among the letters of the alphabet, you'd be down to well under 400 files per directory (26^3 = 17,576 directories, or roughly 57 files each), which Apache can handle with aplomb. However, if the number of files is expected to grow, then four or even five levels would be recommended.
In each case, it will also be necessary to take the length of the URL-path into consideration. That is, if a one-character URL were requested with the above code in place, it would not be rewritten at all, since the rule pattern (which requires a minimum of two characters) would not match. So, you may also want additional rules to support shorter URLs.
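A sketch of how a three-level split with fallback rules for shorter names might look. It assumes that names with fewer than three characters before the first dot are stored at a shallower level (ab.html under /a/b/, a.html under /a/), which is only one possible layout.

```apache
RewriteEngine On
# Three or more leading characters: /example.html -> /e/x/a/example.html
RewriteRule ^([^/.])([^/.])([^/.])([^/]*)$ /$1/$2/$3/$1$2$3$4 [L]
# Only two leading characters: /ab.html -> /a/b/ab.html
RewriteRule ^([^/.])([^/.])([^/]*)$ /$1/$2/$1$2$3 [L]
# Only one leading character: /a.html -> /a/a.html
RewriteRule ^([^/.])([^/]*)$ /$1/$1$2 [L]
```

The order matters: the longest pattern goes first, and each shorter rule only catches what the one above it could not match. A bare one-character name such as /a may collide with the directory of the same name, so that case is worth testing separately.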
Jim
And besides, what needs to be avoided is someone looking at this thread, and saying, "Great, I'll split my files into five directory levels using this (one) rule... Hey, why don't these shorter URLs resolve to the proper directory level?"
Jim
[edited by: jdMorgan at 10:42 pm (utc) on Mar. 6, 2008]
Since the URLs remain the same, I am curious about your method.
I guess it would be possible to store example.html in the directory domain.com/e/example.html, correct?
Btw, does this mean that example.html can be accessed in two ways?
domain.com/example.html and domain.com/e/example.html?
TIA
Yes, you could indeed access the files in the 'structured' way, but as long as you do not expose this to the world (i.e., you do not start using it yourself in your links), it will remain behind the scenes. Besides, you can easily stop anyone from doing that with mod_rewrite, either by denying such a request or by redirecting it to the proper address.
The example I posted earlier is for two directory levels; if you go for more levels, the rule needs to be adjusted slightly.
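A minimal sketch of the redirect and deny options for the two-level layout, assuming these lines are placed above the internal rewrite rule in the same .htaccess file. It relies on THE_REQUEST, which holds the request line exactly as the client sent it, so requests that were only rewritten internally are not affected.

```apache
RewriteEngine On
# Direct client request for /e/x/example.html: redirect to /example.html.
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^/.]/[^/.]/([^/?\ ]+)[?\ ]
RewriteRule ^[^/.]/[^/.]/[^/]+$ /%1 [R=301,L]

# Alternative: refuse such requests with 403 Forbidden instead.
# RewriteCond %{THE_REQUEST} ^[A-Z]+\ /[^/.]/[^/.]/[^/?\ ]+[?\ ]
# RewriteRule ^[^/.]/[^/.]/[^/]+$ - [F]
```

Use one variant or the other, not both. For three or more levels, the patterns would need one extra /[^/.] segment per level.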