Custom Rewrite Rule

A different way to handle directories is needed.

         

movieclub

11:03 pm on Jun 28, 2004 (gmt 0)

10+ Year Member



We currently have 50,000 users at our website, and it is growing by 500+ each day. A couple of months ago I set up half of them in a user directory and put in rewrite rules to handle this. This was done because ext3 limits a directory to about 32,000 subdirectories.

Each user has a directory like

./bb0003190

where bb is the user's initials and the id is a sequential number.

And the directory has an index.php that sets a variable to bb0003190.

This allows the user to have a url like

http://www.example.com/bb0003190

We want to leave the URL alone, and I want to automate this and drop the users into directories like...

./aa/
./ab/
./ac/

./bb/

./zz/

This could end up in the example as

./bb/bb0003190/

or something like that, repointing them based on the first two characters of the user's id (bb0003190).
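The mapping from a user id to its two-letter bucket can be sketched in a few lines of Python. This is only an illustration of the layout, not part of the site: the helper name `bucket_path` is made up here, and it assumes every id is two lowercase letters followed by seven digits.

```python
import re

# Assumed id format: two lowercase initials followed by a 7-digit sequence number.
USER_ID = re.compile(r"[a-z]{2}[0-9]{7}")

def bucket_path(user_id: str) -> str:
    """Return the relative path 'xx/xx0000000' for a user id (hypothetical helper)."""
    if not USER_ID.fullmatch(user_id):
        raise ValueError(f"unexpected user id: {user_id!r}")
    # The bucket is simply the first two characters of the id.
    return f"{user_id[:2]}/{user_id}"

print(bucket_path("bb0003190"))  # bb/bb0003190
```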

This will make the 32k limit a non-issue for the next 50 years or so.

How would I do this with rewrite rules, or do you have any better suggestions? We are currently adding 500+ users a day and I will need to get this fixed shortly.

Thanks for the help.

Bob Bowen

[edited by: jdMorgan at 11:18 pm (utc) on June 28, 2004]
[edit reason] Removed specifics and sig per TOS [/edit]

jdMorgan

11:26 pm on Jun 28, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Bob,

Welcome to WebmasterWorld [webmasterworld.com]!

This is a fairly simple problem, but the exact solution depends on various details, such as:

  • Are the user initials always two letters, or could there be more or fewer?
  • How do you handle users with identical initials?
  • Is this what you use the id for? Or is the id simply a session variable?

    Please post the code you've tried to use, and we'll try to help, but in accordance with our charter [webmasterworld.com], we can't write it for you.

    Here are some mod_rewrite-related links to browse if you end up waiting for a reply.

    Apache mod_rewrite documentation [httpd.apache.org]
    Apache URL Rewriting Guide [httpd.apache.org]
    Regular Expressions Tutorial [etext.lib.virginia.edu]

    Jim

    movieclub

    2:29 am on Jun 29, 2004 (gmt 0)

    10+ Year Member



    Jim,

    Thanks for the prompt reply.

    q) Are the user initials always two letters, or could there be more or less?
    a) Yes, they are always two letters. They are not unique.

    q) How do you handle users with identical initials?
    a) The ID is always unique.

    q) Is this what you use the id for?
    a) The ID is always unique.

    q) Or is the id simply a session variable?
    a) No.

    The ID is a sequential number: bb0003190 is the 3190th person who joined, and that person has the initials bb.

    The next user ids might be ty0003191, dd0003192 and so on.

    My thought was to make the 676 directories (26 letters * 26 letters) and then use rewrite to point the users to their own directories underneath those directories. I expect it will be a long time before I have more than 32,000 of any two-initial combination.
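One way the directory side of this plan might be automated is sketched below in Python. It is a rough sketch under stated assumptions, not a tested migration tool: the function names `make_buckets` and `move_users` are invented for illustration, user directories are assumed to be named with two lowercase letters plus seven digits, and nothing here handles permissions, symlinks, or requests arriving during the move.

```python
import re
import shutil
import string
from pathlib import Path

# Assumed directory-name format for users: two lowercase letters + 7 digits.
USER_DIR = re.compile(r"[a-z]{2}[0-9]{7}")

def make_buckets(root: Path) -> None:
    """Create the 676 two-letter bucket directories aa..zz under root."""
    for first in string.ascii_lowercase:
        for second in string.ascii_lowercase:
            (root / (first + second)).mkdir(exist_ok=True)

def move_users(old_root: Path, new_root: Path) -> int:
    """Move each user directory in old_root into its bucket under new_root."""
    moved = 0
    # sorted() takes a snapshot, so moving entries does not disturb the loop.
    for entry in sorted(old_root.iterdir()):
        if entry.is_dir() and USER_DIR.fullmatch(entry.name):
            target = new_root / entry.name[:2] / entry.name
            shutil.move(str(entry), str(target))
            moved += 1
    return moved
```

Calling `make_buckets(root)` followed by `move_users(root, root)` would leave `root/bb/bb0003190/` in place of `root/bb0003190/`. Trying it on a copy of the tree first seems prudent.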

    Thanks for the assist. I have reviewed the rewrite documentation, and nothing has jumped out at me about where to even start on this one.

    Bob

    jdMorgan

    4:00 am on Jun 29, 2004 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    Well,

    Here's a start:

    The first thing you need is to recognize a URL with two letters a through z followed by 7 digits 0 through 9. Using regular expressions, you create a pattern to match this sequence:

    [a-z]{2}[0-9]{7}
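A quick way to get a feel for this pattern is to try it outside Apache. The sketch below uses Python's `re` module, which treats this particular syntax the same way as the regex engine behind mod_rewrite as far as I know; the sample strings and the helper name `looks_like_user` are made up for illustration.

```python
import re

# Two lowercase letters followed by exactly seven digits, matched anywhere.
PATTERN = re.compile(r"[a-z]{2}[0-9]{7}")

def looks_like_user(text: str) -> bool:
    """True if the unanchored pattern is found somewhere in text."""
    return PATTERN.search(text) is not None

for sample in ["bb0003190", "ty0003191", "b0003190", "bb003190", "BB0003190"]:
    print(sample, looks_like_user(sample))
```

Only the first two samples match: the third has a single letter, the fourth has six digits, and the fifth is uppercase.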

    But you also want to create "back-references" to these parts of the requested URL, so that you can reference them in (copy them into) the substitution (new) URL. In mod_rewrite, this is done by using parentheses to group the parts and to create the back-references, designated as $1 through $9 for use in the substitution URL. Since you want the user initials to appear twice in the new URL, you must isolate those two letters to create one back-reference, and then capture the second part, the 7-digit id, into a separate group for a second back-reference:

    ([a-z]{2})([0-9]{7})
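To see what the two groups capture, the same pattern can be run in Python, where the groups are numbered 1 and 2 just as `$1` and `$2` are in a RewriteRule. This is only a sketch; the helper name `split_id` is hypothetical.

```python
import re

# Group 1 captures the initials, group 2 the 7-digit number.
PATTERN = re.compile(r"([a-z]{2})([0-9]{7})")

def split_id(text: str):
    """Return (initials, number) for the first match, or None if no match."""
    match = PATTERN.search(text)
    return match.groups() if match else None

print(split_id("bb0003190"))  # ('bb', '0003190')
```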

    Now, I've got to make a couple of assumptions. First, that your code is going to go into your web-root directory .htaccess file, and second, that the users' initials will always be lowercase.

    To complete the pattern, you'll probably want to anchor it. Anchoring speeds up regular-expression processing and eliminates positional ambiguities in the pattern matching. You'll probably want to start-anchor the pattern, to make sure the requested URL *starts* with the two initials followed by the 7-digit number. This means the pattern won't match unless that is true:

    ^([a-z]{2})([0-9]{7})
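The effect of the start anchor can be checked with a small Python sketch. It assumes, as in a web-root .htaccess file, that the path is matched without its leading slash; the sample paths and the helper name `matches` are made up.

```python
import re

UNANCHORED = re.compile(r"([a-z]{2})([0-9]{7})")
START_ANCHORED = re.compile(r"^([a-z]{2})([0-9]{7})")

def matches(pattern: re.Pattern, path: str) -> bool:
    """True if pattern is found in path (search honours the ^ anchor)."""
    return pattern.search(path) is not None

# The unanchored pattern also fires on an id buried deeper in the path.
print(matches(UNANCHORED, "images/bb0003190"))      # True
print(matches(START_ANCHORED, "images/bb0003190"))  # False
print(matches(START_ANCHORED, "bb0003190"))         # True
```

A start anchor alone still accepts extra characters after the seven digits (for example `bb00031909999`), so an end anchor `$` is worth considering when the match should be exact.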

    So, now all you have to do is create a RewriteRule with that pattern, and back-reference $1 twice and $2 once, with a slash between the two $1 references to create the substitution URL.

    You didn't mention it, but if the user directories contain files, then you'll need to copy those filepaths into the substitution URL as well, using $2. In that case, the pattern becomes:

    ^([a-z]{2})([0-9]{7}.*)
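With the trailing `.*` inside the second group, everything after the digits rides along in the second back-reference. A small Python sketch shows what ends up in each group; the file path used is invented for illustration.

```python
import re

# Group 2 now holds the 7 digits plus whatever follows them.
PATTERN = re.compile(r"^([a-z]{2})([0-9]{7}.*)")

def capture(path: str):
    """Return (initials, rest) for a matching path, or None if no match."""
    match = PATTERN.search(path)
    return match.groups() if match else None

print(capture("bb0003190/photos/cat.jpg"))  # ('bb', '0003190/photos/cat.jpg')
print(capture("bb0003190"))                 # ('bb', '0003190')
```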

    Jim

    movieclub

    4:38 am on Jul 11, 2004 (gmt 0)

    10+ Year Member



    Jim,

    Sorry for the delay, I was off on other issues.

    So what you're telling me is that all I really need is the following two rewrite rules to accomplish this?

    RewriteRule ^([a-z]{2})([0-9]{7}) /home/www/$1/$1$2/
    RewriteRule ^([a-z]{2})([0-9]{7}.*) /home/www/$1/$1$2/*

    It can't be that easy.

    Bob

    jdMorgan

    6:15 am on Jul 11, 2004 (gmt 0)

    WebmasterWorld Senior Member 10+ Year Member



    Bob,

    Yes, probably just that easy -- at least, if I understand what you need to do. Try this:


    RewriteRule ^([a-z]{2})([0-9]{7}/.*)$ /home/www/$1/$1$2 [L]
    RewriteRule ^([a-z]{2})([0-9]{7})$ /home/www/$1/$1$2/ [L]
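Before touching the server, the intended behaviour of a two-rule setup like this can be rehearsed in Python. The sketch below is a simulation, not mod_rewrite itself: it assumes the target layout is /home/www/<initials>/<initials><id>/, tries the rules in order, and stops at the first match (roughly what the [L] flag does). The `rewrite` function and the sample paths are made up for illustration.

```python
import re

# (pattern, template) pairs; \1 and \2 play the role of $1 and $2.
RULES = [
    # User directory followed by a slash and an optional file path.
    (re.compile(r"^([a-z]{2})([0-9]{7}/.*)$"), r"/home/www/\1/\1\2"),
    # Bare user directory with no trailing slash.
    (re.compile(r"^([a-z]{2})([0-9]{7})$"), r"/home/www/\1/\1\2/"),
]

def rewrite(path: str) -> str:
    """Return the rewritten path, or the path unchanged if no rule matches."""
    for pattern, template in RULES:
        match = pattern.match(path)
        if match:
            return match.expand(template)  # first matching rule wins
    return path

for sample in ["bb0003190", "bb0003190/", "bb0003190/photos/cat.jpg", "about.html"]:
    print(sample, "->", rewrite(sample))
```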

    Test it and see.

    mod_rewrite is very powerful, partly because it uses regular-expression pattern matching, which is itself very powerful. You can do just about anything you want with it, but the key is having a precise understanding and definition of what you want to do. Otherwise, mod_rewrite will do precisely what you told it to do, which is not necessarily what you wanted. ;)

    Jim

    movieclub

    4:51 pm on Jul 13, 2004 (gmt 0)

    10+ Year Member



    The final version of the .htaccess file is:

    RewriteEngine on
    RewriteRule ^([a-z]{2})([0-9]{7}/.*)$ /home/movieclu/public_html/1users/$1/$1$2/ [L]
    RewriteRule ^([a-z]{2})([0-9]{7})$ /home/movieclu/public_html/1users/$1/$1$2/ [L]

    It works great, and thanks for the help!

    Bob