Welcome to WebmasterWorld Guest from 35.172.111.215

Forum Moderators: open

Message Too Old, No Replies

MySQL: thoughts on setting up the structure for PMs

     
2:24 am on Oct 17, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 5+ Year Member Top Contributors Of The Month

joined:Mar 15, 2013
posts: 1205
votes: 120


I originally set up this structure about 8 years ago, moving PMs from flat-text to MySQL. That setup had 2 rows for every PM sent:

id
username
sender
folder
status ('New' or null)
postdate
subject
message

So if "Joe" sent a message to "Angie" then it would insert like this:

INSERT INTO pm VALUES
(NULL, 'Joe', 'Angie', 'sent', NULL, 20181018213400, 'Test subject', 'Test message'),
(NULL, 'Angie', 'Joe', 'inbox', 'New', 20181018213400, 'Test subject', 'Test message')


The reason for doing 2 rows like that was because, every once in awhile, I'll have a suspected scammer or spam bot send a bunch of messages that I would want to remove from the Inbox, but wanted to leave in the "sent" folder until I was sure.

There's also another table for "usernames", of course.

After some discussion on here, I'm trying to change this system up a bit. I want to group messages between two users together, but still give both people the ability to create a new thread (Gmail style), as well as be able to move individual messages to different folders.

So the new structure will be to have two tables: "pm" and "pm_data". Table "pm" will have:

id (Auto Increment)
refid (reference id to pm_data)
username
sender
folder
status (0 or 1)
postdate
subject

Then "pm_data" will have:

refid (Auto Increment)
hash (UNIQUE, an MD5 of message)
message

Then when they view the message, the query would look like this:

SELECT * FROM pm 
JOIN pm_data ON pm.id = pm_data.refid
WHERE username='%s' AND sender='%s' AND subject='%s'
ORDER BY postdate DESC


This method should cut the database size in about half, because there would only be one copy of the message instead of 2. Then the indexes should be faster, and in theory the whole system should load faster.

So. Beyond that, I'm considering 2 other modifications:

1. Move "subject" to a third table, and instead use a "group_id" column. I don't think I would need to make it UNIQUE like I do "pm_data", but then "pm" would have an INT column to reference the subject instead of having the subject in each row.

2. Someone else suggested giving each username an ID#, too, and then in "pm" I would store the numbers instead of the names.

If I did that then the query would look something like this:

SELECT * FROM pm 
JOIN pm_data ON pm.id = pm_data.refid
JOIN pm_subject ON pm.subjectid = pm_subject.group_id
JOIN users_id ####
WHERE pm.userid='%s' AND pm.senderid='%s' AND pm.subjectid='%s'
ORDER BY postdate DESC


I'm not quite sure how to join the users_id table there, by the way, because I would need to get the username for both pm.userid and pm.senderid...

So the question here is, do you guys think that doing 3 JOINs like that would be inevitably faster than just the one? It would be marginally harder to maintain, but if it makes the index smaller and each page load faster then it could / should result in more pageviews per session, which would make it worthwhile.
8:13 am on Oct 17, 2018 (gmt 0)

Preferred Member

Top Contributors Of The Month

joined:Sept 13, 2018
posts:355
votes: 71


2. Someone else suggested giving each username an ID#, too, and then in "pm" I would store the numbers instead of the names.

You should because:

- numeric lookup is much faster than string lookup
- numeric ID takes less space than string (obvious)
- if username can be changed, the user numeric IdDremains the same.
5:32 pm on Oct 17, 2018 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Dec 15, 2003
posts:2645
votes: 7


If it were me, I would structure my tables so that you can have more than 2 participants in a PM, and so that each "message" can have many posts.

Take a look at this basic structure. When getting messages for a user, you select from tblMessageHasParticipants where user_id = [id]

When the user views a message you select from tblMessageHasPost where message_id = [id of message thread they clicked on]. Sort it based on id to get a "thread" of posts and responses.

This will give you a good base structure and will allow for features such as (post "liked") or (post "deleted"). User's could also "leave" a message thread leaving others to continue a discussion. It will also give you a lot more flexibility for things like the user's having the message live different folders and/or have different statuses without duplicating data, as your structure requires.

tblMessage
********
id (auto inc)
subject
created_by (ref user.id)
created_at (datetime)
updated_at (datetime)

tblMessageHasParticipants
*******
(complex primary key message_id + user_id)
message_id (ref tblMessage.id)
user_id (ref user.id)
folder
status
added_at (datetime)
left_at (datetime)
active (boolean)

tblMessageHasPost
********
id (auto inc)
message_id (ref tblMessage.id)
posted_by (ref user.id)
post (text)
posted_at (datetime)