mod_rewrite performance issues?
Need to decide
I'm new here and first want to say this forum has already been a great resource for me, so thanks in advance!
I am redesigning my site and have an entirely new directory structure, yet have thousands of inbound links from my 7 years of old URLs. I am looking to map the old URLs using mod_rewrite. Since I've discovered mod_rewrite, it also looks like it would be a better idea to handle all my newly formatted URLs with it as well.
So my question is: are there any performance issues, like server load, that I should be aware of before turning every URL on my site (maybe 10,000) over to mod_rewrite?
Welcome to WebmasterWorld!
The answer to your question depends on many factors, such as how much traffic your site gets, whether it shares server resources with other sites (and how many), the number of RewriteRules required to re-map your old URL/directory structure to the new one, whether you use PHP, Perl, or SSI/CGI scripts to serve content, how many images each of your pages serves up, and whether they'll also be rewritten, etc.
On a site getting 10-20 thousand hits (and yes, I mean hits, not unique visitors) per day, on a shared mid-priced server, using several hundred RewriteRules, and using a limited amount of Perl, PHP, and SSI scripting, I find the impact to be utterly negligible, but your mileage may vary.
You might consider a "trial" implementation if you're really worried about it: rewrite just a few of your popular pages from the current URL structure to a slightly different structure. Say you've got most of your pages in a subdirectory called "content." Rename that directory to "contents" and add a few RewriteRules to rewrite the requests so that old URLs get the same content as before. By making these test RewriteRules intentionally more specific than they need to be (say, by rewriting each URL individually instead of taking advantage of mod_rewrite's ability to do the whole lot with one rule), you can gauge the performance impact.
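As a rough sketch of that trial, assuming the rules live in an .htaccess file in the document root (with AllowOverride FileInfo and FollowSymLinks enabled), it might look something like this. The page names are made up for illustration; only the "content" and "contents" directory names come from the example above.

```apache
# Hypothetical .htaccess in the document root; page1.html etc. are invented names.
RewriteEngine On

# Test variant: one deliberately over-specific rule per old URL.
# These are internal rewrites, so the visitor still sees the old URL.
RewriteRule ^content/page1\.html$ /contents/page1.html [L]
RewriteRule ^content/page2\.html$ /contents/page2.html [L]
RewriteRule ^content/page3\.html$ /contents/page3.html [L]

# Production variant: a single rule covering the whole directory.
# Enable this instead of the per-URL rules once the test is finished.
# RewriteRule ^content/(.*)$ /contents/$1 [L]
```

If the per-URL variant shows no measurable load, the single-rule variant should be cheaper still.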
Writing the code is easy; it's the design of preliminary experiments and final tests that is hard... :o
Unless you've got a very busy site, I think you'll find that there's little or no impact. Sites which use unoptimized PHP scripting to serve all content use a lot more CPU than a bit of mod_rewrite code -- and remember that mod_rewrite is basically 'native' to the Apache server, so it's fairly efficient.
All this to say, "You'll have to test for yourself and find out." ;)
Thank you very much for your response. Let me provide you a bit more information about my setup if it can shed any light.
We get about 10,000 - 15,000 users a day. All pages are served dynamically with PHP (though we cache what we can, and are in the process of trying out Zend Optimizer and eAccelerator for additional caching).
P4, 3 GHz, 3 GB RAM, dedicated server. Mail is handled on a separate server. Apache2, PHP 5, Gentoo, MySQL 4.
I've been running some tests on our other stuff just to see Apache performance and noticed that it sometimes really varies. Some pages go from 2% CPU to 7% CPU just on a refresh (with only me on the box)... so I'm trying to get as much advice as possible to plan this thing correctly :)
I suspect that PHP/SQL will "swamp" the effects of adding mod_rewrite rules on your server. Plus if you've got the whole box, you can do your rewriting in httpd.conf. mod_rewrite code in httpd.conf is compiled at server restart and is a *lot* more efficient than mod_rewrite code in .htaccess (per-directory) context where it is interpreted on a per-HTTP-request basis. Also, in httpd.conf you have the option to use RewriteMap if your rewriting is highly complex.
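For what it's worth, here is a minimal sketch of the RewriteMap approach in httpd.conf, assuming a plain-text map file. The map name "legacy", the file path, and the sample entries are all hypothetical.

```apache
# httpd.conf, server or <VirtualHost> context.
# RewriteMap cannot be declared in .htaccess.
RewriteEngine On

# Hypothetical map file, one "old-path new-path" pair per line, e.g.:
#   widgets/blue.html   /products/blue-widget
#   about_us.htm        /about
RewriteMap legacy txt:/etc/apache2/legacy-urls.map

# Redirect only when the requested path has an entry in the map;
# NOT_FOUND is the default value returned for unknown keys.
RewriteCond ${legacy:$1|NOT_FOUND} !=NOT_FOUND
RewriteRule ^/(.+)$ ${legacy:$1} [R=301,L]
```

One rule plus a lookup table replaces thousands of individual RewriteRules, which tends to be easier to maintain for a 10,000-URL mapping. Whether it is also faster on your box is something to measure.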
Since it sounds like you've got a "big boy" setup there, you might also consider setting up a second server with the new URL scheme and mod_rewrite code on it. Set your DNS Time-To-Live to a short period, wait for the current (probably long) TTL to expire, and then switch the DNS over to the test server's IP address during off-peak hours. At the first sign of trouble or when your most-business-critical time of day approaches, switch the DNS back to your main server. This way you can gather "real-world" data on the impact with little work and little risk to your business.
For best results, just make sure you understand the difference between efficient and inefficient mod_rewrite and regular-expressions coding techniques, put your RewriteRules in most-frequently-accessed-first order, use the [L] flag unless you have a specific reason not to, and use internal rewrites as opposed to external redirects when it makes sense to do so -- for example on images and other resources whose URLs aren't listed in major search engines' search results.
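To illustrate those points, here is a hedged sketch for httpd.conf (server context, so patterns start with a slash). All of the paths are invented, and the ordering assumes images are requested more often than pages.

```apache
# Server context (httpd.conf); every path below is hypothetical.
RewriteEngine On

# Less efficient: unanchored, with leading and trailing ".*" that force
# the regex engine to scan and backtrack, and no [L] to stop processing.
# RewriteRule .*articles/(.*)\.html.* /news/$1 [R=301]

# More efficient: anchored, specific character class, [L] on every rule.
# Busiest URLs go first so most requests match early.
RewriteRule ^/images/(.+)$ /static/img/$1 [L]
RewriteRule ^/articles/([0-9]+)\.html$ /news/$1 [R=301,L]

# Rarely requested legacy URLs go last.
RewriteRule ^/press/archive/(.+)$ /about/press/$1 [R=301,L]
```

The image rule has no R flag, so it is an internal rewrite and costs the client no extra round trip. The page rules use R=301, so they are external redirects that tell search engines about the new URL.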
This technique presents the opportunity to do a "phased" roll-out. Since most people's main concern is the ranking of their "page" URLs in search results, only those URLs need to be externally redirected. Once the search engines have picked up the new page URLs and the number of page redirects starts to taper off, you can then add redirects for images and other resources that are typically not listed in search results, just to accelerate the complete switch-over. This introduces some planning and complexity, but can sometimes be useful for 'borderline' server-load situations on busy sites.
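A phased roll-out along those lines might be sketched like this in httpd.conf. The URL patterns are hypothetical, and the timing of the second phase is something you'd judge from your own logs.

```apache
# Server context (httpd.conf); all paths are hypothetical.
RewriteEngine On

# Phase 1: pages get an external 301 redirect so search engines learn the
# new URLs; old image URLs are served via a silent internal rewrite.
RewriteRule ^/articles/([0-9]+)\.html$ /news/$1 [R=301,L]
RewriteRule ^/images/(.+)$ /static/img/$1 [L]

# Phase 2 (once page redirects taper off): replace the internal image
# rewrite above with an external redirect to finish the switch-over.
# RewriteRule ^/images/(.+)$ /static/img/$1 [R=301,L]
```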
As you can see there are many ways to hedge your bets. Or you can just go for broke and switch everything all at once. You'll have to decide based on a bit of preliminary testing and your personal/corporate risk tolerance.
Thanks so much. Is there a good tutorial on mod_rewrite techniques, or is the manual enough?