I am trying to set up a PHP/MySQL-based lyrics site. At first (before I knew about mod_rewrite) I was just going to make a '/show_lyric.php?id=12345' type PHP template page for retrieving lyrics from the MySQL db. But I heard that search engines don't like these types of URLs and don't cache them very well. After some investigation I came up with a solution: having pages like '/artist_name/album_name/name_of_the_song.htm'. When constructing anchors for the actual pages I can generate these fake URLs, but on the mod_rewrite side I am a total newbie to this module (actually to the Apache server :( ).
I have just prepared my huge lyrics MySQL db and made the graphic design. Now I must do the real work of solving the problem I described above. Thanks for any suggestion (even any word about this).
There's nothing wrong with using database record numbers to ease your job at db maintenance if you need them, but set up your database so that you can also pull records using the /artist/album/song format. If you do that, you will need only one or a very few simple rewrite rules, and you will have to maintain only one database.
For example, the following rewrite will translate a URL in the form "example.com/lyrics/U2/The_Joshua_Tree/In_the_Name_of_Love" to
"example.com/find_lyrics.php?artist=U2&album=The_Joshua_Tree&song=In_the_Name_of_Love"
RewriteRule ^lyric/([^/]+)/([^/]+)/([.+])$ /find_lyrics.php?artist+$1&album=$2&song=$3 [L]
The critical thing is that you don't want to have to re-invent the wheel. You want to discuss songs with your friends using artist/album/song. You want to link on your pages with text relevant to artist/album/song. It would not be natural for you to discuss "database entry 1234" or "database entry 1234565678999" with your friends. These would also make lousy links if you want to get your pages indexed in search engines.
At the same time, mod_rewrite and scripts are only good for re-formatting data. They are not good at doing "lookups" -- in this case taking "/lyrics/U2/The_Joshua_Tree/In_the_Name_of_Love" and 250,000 other titles and translating them to "id-1234" and other record numbers. That's a database job.
So that's the number one thing to do to make your life easier - make the database do the work. Same goes for any dynamic site database, whether it's the product catalog for "Wide World of Widgets" or for a lyrics site. Let the database do the work, not you.
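As a rough illustration of letting the database do the lookup, here is a minimal sketch in Python with SQLite (the site in this thread uses PHP and MySQL, so treat it as an analogy only). The table layout, the column names and the find_lyrics function are assumptions made up for this example; the thread never shows the real schema.

```python
import sqlite3

# Hypothetical schema: the real table layout is not shown in the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lyrics ("
    " id INTEGER PRIMARY KEY,"          # record number, still handy for maintenance
    " artist TEXT NOT NULL,"
    " album TEXT NOT NULL,"
    " song TEXT NOT NULL,"
    " body TEXT NOT NULL,"
    " UNIQUE (artist, album, song))"    # the unique index also serves the lookup
)
conn.execute(
    "INSERT INTO lyrics (artist, album, song, body) VALUES (?, ?, ?, ?)",
    ("U2", "The_Joshua_Tree", "In_the_Name_of_Love", "(lyrics text)"),
)


def find_lyrics(artist, album, song):
    """Return (id, body) for the URL-style names, or None if nothing matches."""
    return conn.execute(
        "SELECT id, body FROM lyrics WHERE artist = ? AND album = ? AND song = ?",
        (artist, album, song),  # bound parameters, never string concatenation
    ).fetchone()


print(find_lyrics("U2", "The_Joshua_Tree", "In_the_Name_of_Love"))
print(find_lyrics("U2", "The_Joshua_Tree", "No_Such_Song"))
```

The rewrite rule only has to hand the three names over; the record number stays an internal detail that never appears in a URL.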
Jim
Then I tried the RewriteRule you gave but couldn't get it to work properly:
RewriteRule ^lyric/([^/]+)/([^/]+)/([.+])$ /find_lyrics.php?artist+$1&album=$2&song=$3 [L]
(I changed ...artist+$1... to ...artist=$1...; the "+" was perhaps a typo.)
But I put together a new one which works perfectly:
RewriteRule ^lyrics/([^/.]+)/([^/.]+)/([^/.]+)\.(htm|html)$ /find_lyrics.php?artist=$1&album=$2&song=$3 [L]
I want to ask whether this rule is good enough to use. Is there any mistake, or something that will cause an error in the future?
By the way, what is the difference between these:
([^/.]+) and ([^/]+) and ([.+])
thanks.
If you use a subpattern like ".+" or ".*" at the beginning or in the middle of a complex pattern, then it will initially grab (match) all the characters in the string. The regex parser will then realize that it cannot match the remaining subpatterns without "backing up" through the previously-matched string. After many iterations, it will finally figure out that the slashes denote the boundaries between subpatterns. It *will* work, but it's much faster if you simply *tell it* where those boundaries are. If you do, pattern-matching can proceed in one pass from left to right in the string.
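The difference between the subpatterns can be checked outside Apache. The sketch below uses Python's re module purely as a stand-in; mod_rewrite uses PCRE, which, as far as I know, treats these simple patterns the same way. The sample path and the variable names are assumptions for illustration, and the prefix is written as "lyrics/" in all three patterns so that only the subpatterns differ.

```python
import re

path = "lyrics/U2/The_Joshua_Tree/In_the_Name_of_Love"

# [.+] is a character class: exactly ONE character, a literal "." or "+".
char_class = re.compile(r"^lyrics/([^/]+)/([^/]+)/([.+])$")
# [^/]+ is one or more characters that are not a slash.
no_slash = re.compile(r"^lyrics/([^/]+)/([^/]+)/([^/]+)$")
# .+ is one or more of ANY character, slashes included.
greedy = re.compile(r"^lyrics/(.+)/(.+)/(.+)$")

print(char_class.match(path))          # no match: a song title is not a single "." or "+"
print(no_slash.match(path).groups())   # three clean path segments
print(greedy.match(path).groups())     # same result here, found by backtracking

# With one extra path segment the two behave differently:
deeper = path + "/extra"
print(no_slash.match(deeper))          # no match: each group stops at a slash
print(greedy.match(deeper).groups())   # first group swallows "U2/The_Joshua_Tree"

# [^/.]+ additionally refuses dots, so a name containing "." does not fit:
print(re.fullmatch(r"[^/.]+", "R.E.M"))  # no match
print(re.fullmatch(r"[^/]+", "R.E.M"))   # matches
```

So ([.+]) matches a single dot or plus sign, ([^/]+) matches one path segment, and ([^/.]+) matches one path segment that contains no dots.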
<soapbox>As you may notice from my other posts in this forum, I have declared war against the unnecessary use of ".*", because it is the most ambiguous, least efficient possible pattern, and should be used only where actually required. Its sole virtue is that it is easy to learn and use, but this comes at the cost of very bad regex processing performance when used in complex patterns. I therefore promote the use of negated character classes like [^/]+ to increase efficiency.</soapbox>
Your new pattern looks OK -- I see no need to include the "." in "[^/.]". Just be aware that there is no rule that says you must name your virtual pages with an ".html" extension -- or any other extension. As a matter of fact, the W3C is promoting the use of URLs with no filetype to allow for seamless upgrades in technology, for example from html to php. Certainly, if you are going to include file extensions in your new plan, you should pick one and only one -- .htm or .html -- and stick with it.
Jim
My project is nearly finished, but I recently noticed that some of the artist names in my db contain dots (.). The rule then passes the wrong parameters to the .php URL. I am using this sort of rule:
RewriteRule ^([^/.]+)\.htm$ /artist.php?artist=$1 [L]
What can I do to avoid this error? Thanks in advance.
RewriteRule ^(.+/)?([^/]+)\.htm$ /artist.php?artist=$2 [L]
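The pattern ^(.+/)?([^/]+)\.htm$ has two capture groups: the optional directory prefix is group 1 and the file name without its extension is group 2, so the artist name sits in the second group. A quick check in Python's re module (a stand-in for Apache's PCRE; the sample paths are made up, and the leading slash is omitted as it would be in a per-directory .htaccess context) shows how it copes with dotted names where the earlier pattern does not:

```python
import re

old_rule = re.compile(r"^([^/.]+)\.htm$")       # dots forbidden inside the name
new_rule = re.compile(r"^(.+/)?([^/]+)\.htm$")  # only slashes forbidden

# Hypothetical request paths, as seen in .htaccess (no leading slash).
for path in ["Madonna.htm", "R.E.M..htm", "artists/N.W.A.htm"]:
    old = old_rule.match(path)
    new = new_rule.match(path)
    print(path,
          "| old:", old.group(1) if old else None,
          "| new:", new.group(2) if new else None)
```

Because [^/]+ is followed by a literal \.htm at the end of the string, only the final ".htm" is treated as the extension; any earlier dots stay in the artist name.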
Jim