Forum Moderators: open
Lets say I have a Teams table with, TeamName, Mascot, Wins, Losses. I include an ID at the beginning so I can link on a team by their ID. Should I not include this and just link using the TeamName? I do this sort of thing all over the place. Then I even have certain linking tables that pretty much contain all ID's... like TeamsConferences has ID, TeamID, ConferenceID.
I'd love to hear your thoughts on this kind of setup. And how performance might be affected by going either way here.
Using that ID as the foreign key (to link tables, etc.) is OK also, but a bit more dangerous. If ever you need to reformat that table, it's possible that those keys could indeed change, rendering your joins a mess.
I think the best solution is to still use the ID as the unique key in the table, but to create another id number for that item throughout your database.
A bit more programming, but safer in the long run.
CREATE TABLE `your_tablename` (
`id` int(10) NOT NULL auto_increment,
`username` varchar(40) default NULL,
`password` varchar(40) default NULL,
`lastupdate` timestamp(14) NOT NULL,
`foreign_key` int(6) default NULL,
PRIMARY KEY (`id`)
)
... so the id field is autoincrementing and you can have as many as you like. The foreign_key field is used to link into a related record in another table.
I think the best solution is to still use the ID as the unique key in the table, but to create another id number for that item throughout your database.
You mean have like
autoid,id,Team,...
34234,5,My team,...
Where autoid is auto-increment and id is say created by your script when the team is entered?
Under what conditions do the auto increment field values change when using, say MySQL?
Where they would change is if your MYSQL dumps look like this for the table above
INSERT INTO `your_tablemname` (`id`, `username`, `password`, `lastupdate`, `foreign_key`) VALUES (NULL, 'me', 'mypass', NOW(), 'fk');
OR
INSERT INTO `your_tablemname` (`username`, `password`, `lastupdate`, `foreign_key`) VALUES ('me', 'mypass', NOW(), 'fk');
In this case you will get all new consecutive autoincrement IDs. If you do full inserts like
INSERT INTO `your_tablemname` (`id`, `username`, `password`, `lastupdate`, `foreign_key`) VALUES ('21', 'me', 'mypass', NOW(), 'fk');
Your autmo-increment value will stay the same.
Then, within the data, I have different codes to identify users. So if one user creates four "teams", I would also write the code to give each team a unique ID number, which would be part of the data, NOT dependent on the row.
Yes, it's a bit more work. But it serves two purposes, first, it gives a data-centric ID, which serves as the identified for that TEAM throughout the database, regadless of the row it's on. And TWO - it keeps the numbers a reasonable length.
I kind of regret using the RRN for entry ID now. In one table I'm up over 100,000 entries, so the resulting "entry number" is indeed in 6 figures. So for another 6 years I'll be OK with a 6 digit number, but it didn't have to be that way, if I made the number unique for the USER, not the DATABASE.
Hope that clears it up a bit. All you said was true, it is a bit redundant, and does require extra coding, but if ever I need to reformat the tables the DATA identifier will be there while the ROW identifier might change.
Also, not sure I see the benefit of having a RID too? I mean shouldn't the DID be enough? I still don't see a reason I'd ever use a RID if I had a DID (which is unique to each row right).
Thanks.
If your table had 10 columns, and you didn't have a unique RID, you would have to use all 10 fields to get a unique row (if you could) to update/delete that row.
If you had a unique ID for the Row, you could just reference that.
To make a unique DID, you would have to check the MAX(DID) and add 1.
The other advantage to do ing that is that the identifiers for each team seem to make sense. Why would a new team start out with team ID 646?
Still, for all my pontificating, I use both. it's more work to do the unique DID, I think it's better programming, but when I don't have much time I rely on the unique RID.
On MSSQL you have the option of turning IDENTITY INSERT on and off which helps when importing data.
You can also add a compurted column to the table to create a more user friendly key, generated from the system key e.g (T-SQL)
[Sales ID] AS (right((replicate('0',6) + convert(varchar(6),[ID])),6))
The main arguments against that I have heard are for distributed servers where maintaining an increment over several servers is problematic.
The main benefit as stated by txt is it removes the need for composite keys to locate a single record. But you should still make those keys as unique constraints, just not primary keys.
I dont really see the reformat problem. How can you delete records when theres a foreign key constraint?
build in some sort of 'rebuild' function into my script that will rebuild the relationships between the tables with the new unique ids should a situation arise where the ids got messed up.
Devil's advocate asks: how are you going to rebuild your relationships if your ids get messed up? (sounds like a question about identity theft!)
it's a shame really that it's possible for auto-increment fields to get changed on a table reformat.
Ahh, but what would you say if it were impossible to change auto-inc values? That would be a shame as well.
how do you go about generating a DID?
Is this going to be merely a foreign key or also a reference number (i.e. "your reservation number is" doesn't really want a 64 character string)? In PHP you can use a timestamp, the uniqid() function with added entropy and some hash function and virtually guarantee a unique id. Of course you still have to check, so it's a query to the server either way (i.e. check for max value and guarantee uniqueness or check for uniqueness and possibly regenerate).
The main arguments against that I have heard are for distributed servers where maintaining an increment over several servers is problematic.
That's the only time I've bothered myself - when you need to ensure uniqueness across different machines. E.g. I had an Access database where independent replicas got synchronized and I wanted the ids to stay the same, so I avoided collisions by essentially using an id that included a reference to the data enterer so say I could never end up inputting a record with the same number as a number that you input, because the last digits always keyed to me.
I dont really see the reformat problem. How can you delete records when theres a foreign key constraint?
I don't really see the reformat problem either. Or rather, I don't see how having a second ID helps you there. To answer the question though, MySQL does not enforce referential integrity (yet), so you can delete anything anywhere.
Which means a little more work for a programmer. I'm in with the RID group. I only use them to update or delete a row, knowing that no other row was affected. There is no relationship between any tables using auto-index. Enforcing data relationships is my job. Processing rows is the job of MySQL.
Enforcing data relationships is my job.
That's sort of like comparing programming languages and saying "type juggling is my job". It is in C, but it isn't in Scheme. The folks who write low-level embedded software naturally have no interest in Scheme and want to type-juggle. The AI crowd want the computer to do as much as possible so they can focus on a higher level of abstraction.
The only RDMS I've worked with is MySQL, so I've always had to enforce referential integrity myself, but it's only my job because I happen to work with a given platform.
Honestly, I think all that should be the RDMS's job, not the programmer's. But then again, I like Scheme. So there you have it.
Certainly would be a lot easier to issue one delete command rather than three.
Not so. If I delete an Invoice on my databas it automatically deletes all the invoice lines for me. But if I try to delete a Product it moans because there are related sales and I dont allow cascades on that table.
If a database doesnt support foreign key constraints, cascades and triggers then its not much better than a multi-user spreadsheet is it?
Honestly, I think all that should be the RDMS's job, not the programmer's.