Forum Moderators: coopster

Message Too Old, No Replies

how random is rand()?

         

neophyte

11:56 am on Sep 10, 2007 (gmt 0)

10+ Year Member



Hello All -

I'm generating a six-digit random number (using mt_rand()) to be used in a db tables primary key column for my table records. Six digits would certainly seem long enough to ensure that no two primary keys would be the same, but depending upon the number of records in that table, I'm starting to have my doubts.

What do the good folks here say about this?

Is a six digit mt_rand() number random enough? Or, once the number is generated, should I query the table and compare the number just generated against the existing PKs to ensure uniqueness before inserting a new record into the table?

Neophyte

vincevincevince

12:01 pm on Sep 10, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Why use rand() for a key when you can use auto_increment? Not only do auto_increment-ed keys avoid the possibility of duplication, they are more efficiently searched.

neophyte

11:40 pm on Sep 10, 2007 (gmt 0)

10+ Year Member



vincevincevince -

Thanks for you input. The client specification for this project dictates that a random and unique six-digit number be generated for each record which would be used as a "client ID number" - this is why I'm not using the auto-increment feature.

Neophyte

cameraman

12:10 am on Sep 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Instead of doing a separate query, I think I'd generate the number, try the insert, and check to see if the insert failed because of duplicate key - if so, generate another one (or add 1 to the one you generated) & try again.

vincevincevince

2:43 am on Sep 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The client specification for this project dictates that a random and unique six-digit number be generated for each record which would be used as a "client ID number"

I'd still use auto_increment for the real client_id number for the index benefits - then have another field for 'public-facing ID'. cameraman's idea of checking if the INSERT worked is a simple solution, good thinking there. If speed is going to be a major problem, generate your random number list in advance and store it in a table `random_numbers` with fields `number_id` and `random_number` - that way any auto_increment field of another table can be translated into a unique random_number easily.

neophyte

7:48 am on Sep 11, 2007 (gmt 0)

10+ Year Member



Thank you both for your help!

Neophyte

henry0

1:02 pm on Sep 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



vincevincevince

generate your random number list in advance and store it in a table `random_numbers` with fields `number_id` and `random_number` - that way any auto_increment field of another table can be translated into a unique random_number easily.

This a good idea!
How do you foresee its implementation
will you grab the top one, insert it where it's needed and del it from the rand repository?

jatar_k

1:06 pm on Sep 11, 2007 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



or you can have a column for 'used' and select id from table where somecol!='used'

henry0

1:21 pm on Sep 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That's doable too
so you will insert the rand twice once as an ID and once in "used"
I suppose it should be less MySQL work.

vincevincevince

1:25 pm on Sep 11, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How do you foresee its implementation

Either as you said (delete the first row after using it) - which would seem to be the easiest but requiring an extra SELECT and DELETE with each assignment... or as a direct mapping.

Direct mapping would mean:
main table named `clients` contains `rowid` (auto_increment)

To get client_id:
SELECT * FROM `clients`,`random_numbers` WHERE `random_numbers`.`rowid`=`clients`.`rowid` AND ....

i.e. don't actually store the random number in the `clients` table at all but 'translate' from the auto_incremented number to the respective random number as required.

This shifts the database load to the output/display end rather than the assignment/data end and so keeps your `clients` table itself with a good efficient index which can be used internally for joins to other client-related tables (rather than using the random number for the join).