Forum Moderators: coopster

Message Too Old, No Replies

Huge MySQL Database with PHP

I have one :)

         

Skeleton

8:33 pm on Mar 3, 2005 (gmt 0)

10+ Year Member



First i don't have lots of expreince in sql statements, the worst i have less experience in performance issues in sql databases.

I have a mysql db as stated in title. it has a table and 4 fields like these:

artist VARCHAR(255)
album VARCHAR(255)
song VARCHAR(255)
lyrics LONGTEXT // :(

as you understood first three fields is not problam at all but the last field contains song lyrics. i am about to put 60.000 records to this table and it will be about 100MB in size.

on my localhost i have put around 30.000 records and doing some speed test on it. my statements are like this:

"Select artist From lyrics Group By artist Having artist Like 'a%' Order By artist Limit 0, 30"

it seems a little slow to me. i think it may change litle bit in my internet hosting cos they will probably have more stable servers running linux machines then me running mysql on a windows machine.

do you have any suggestion for me to make this more proffesional looking, stable and faster? i thought myself to seperate this last field to a brand new table and make a relation with the old one, so i can make searches via old one and map to new one. does this differs?

ok, anyway please help me :)

ironik

9:06 pm on Mar 3, 2005 (gmt 0)

10+ Year Member



You should probably model your database structure as a pseudo 'relational database'. MySQL doesn't have true relational capabilities, but it's good practice to create your tables like it does. Your structure would look something like this:

table: artists
artist_id
artist_name

table: albums
album_id
artist_id
album_name

table: songs
song_id
artist_id
album_id
lyric_id
song_name

table: lyrics
lyric_id
lyric_text

Each id field should have an index built on it and you should build your queries to take advantage of the structure:

(This statement should get the artist name and album name from the database where the artist id fields match in both tables and the artist name matches your search)
"SELECT artists.artist_name, albums.album_name FROM artists, albums WHERE artists.artists_id = albums.artist_id AND artists.artist_name LIKE 'a%' ORDER BY artists.artist_name ASC LIMIT 0, 30"

I'm not sure if it would be best practice to build an index off the artist name field... I guess it depends on how much it is being searched on. Someone else might be able to comment on that?

SeanW

10:27 pm on Mar 3, 2005 (gmt 0)

10+ Year Member



It largely depends on how the table is indexed.

Without an index on the artist column, the database has to go through every row to see if the artist field starts with a. Since you've grouped it, it has to search every row before even looking at your limit.

Someone else brought up the idea of putting your data into third normal form (ie relationships and linkages) which will certainly make things easier.

Best way to find out why a query is taking so long is to prepend the query with EXPLAIN and see how mysql thinks it will process it. If there are no indexes used, you're wasting valuable IO.

As a start, try something like

alter table lyrics add index (artist);

and see how it alters performance.

Sean

coopster

11:41 pm on Mar 3, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member




MySQL doesn't have true relational capabilities [dev.mysql.com] ...?

Sure it does. The MyISAM tables leave something to be desired if you are trying to use advanced database features such as transactional processing but that's one of the best things about the MyISAM engine, it is much faster, requires lower disk space requirements and less memory. All of which makes it very, very speedy. Most folks might only brush with transactional processing, but if you choose to do so, MySQL indeed has the capability [dev.mysql.com].

Perhaps the reference was to something else, and if I misunderstood, forgive me. Just wanted to shed some light here.


Good advice from both posts here, Skeleton. I would probably create an index on the artist name as well. But, as stated, use that EXPLAIN statement, it's going to be one of your best tools.

60,000 records will be a snap, especially after a bit of fine-tuning ;)

ironik, I like the way you normalized the tables, but I'm wondering why not incorporate the lyrics into the song table? Seems as though it "fits" better there rather than in a separate table but I may be missing something...

Skeleton

12:21 am on Mar 4, 2005 (gmt 0)

10+ Year Member



thanks guys for your replies.

before sending this thread i have already created an index for artist column after a few tests, and it extermely worked out and then i said myself "Yeah, this is it!" :) but after some more trials i wanted it to be much faster (we, humans are greedy!)

"ironik, I like the way you normalized the tables, but I'm wondering why not incorporate the lyrics into the song table? Seems as though it "fits" better there rather than in a separate table but I may be missing something... "

yeah you are right because i am just having performance problems while browsing first letters of artist. after the artist selection till to the last step of showing lyrics things are going very smooth. hence seperating tables is not a solution to my problem, more code but the same result.

i must search for 'mysql tuning'

Skeleton

12:29 am on Mar 4, 2005 (gmt 0)

10+ Year Member



i have got an idea!

how about putting a new char field just for holding the first character of the artist like 'a', 'b', 'c',....,'z'. then i may add an index to this.

you may say: "Try and see, why are you bothering us?" but it will take lots of time for me to cope with this kind of db.

does this make my artist browsing faster? if so can you supply me a sql statement for making this change? thanks

ironik

12:32 am on Mar 4, 2005 (gmt 0)

10+ Year Member



I've only scratched the surface of MySQL it seems. I was under the impression it didn't have the relational capabilities of it's bigger brothers. I stand corrected! Just adds to the reasons to keep using MySQL!

I normally put large blocks of data into seperate tables, I don't know if it's a good/bad practice and my databases thus far haven't been large enough to test whether this is faster, slower or doesn't make a difference.

In my current project I have articles sorted into categories, and then the article text is stored seperately with different translations, rollback versions etc... so I might have 1 article, but in my article text table I might have 8 different translations, or previously stored versions I can rollback to if I'm not happy with the current version I have just edited.

I wasn't aware of the EXPLAIN statement either... time to do some reading ;)

coopster

1:59 am on Mar 4, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Don't get me wrong, we are all here to help and learn and that's why I asked the question, to learn. I've done rollbacks based on date sensitive information before and it's a lot of work. Not so fun at times, but rewarding when complete. I commend you ;)

Skeleton, you don't need another column to do that...


For CHAR and VARCHAR columns, indexes can be created that use only part of a column, using col_name(length) syntax to index a prefix consisting of the first length characters of each column value. BLOB and TEXT columns also can be indexed, but a prefix length must be given.

The statement shown here creates an index using the first 10 characters of the name column:

CREATE INDEX part_of_name ON customer (name(10));

Because most names usually differ in the first 10 characters, this index should not be much slower than an index created from the entire name column. Also, using partial columns for indexes can make the index file much smaller, which could save a lot of disk space and might also speed up INSERT operations!

Resource:
[dev.mysql.com...]

SeanW

2:03 am on Mar 4, 2005 (gmt 0)

10+ Year Member




how about putting a new char field just for holding the first character of the artist like 'a', 'b', 'c',....,'z'. then i may add an index to this.

Please don't do that :) It's redundant, and the index already handles it behind the scenes. Use the EXPLAIN command and you should see that it's being used.

For example, from a similar table (urlcomponent is varchar(255) with an index)


mysql> EXPLAIN SELECT author FROM item WHERE urlcomponent like 'a%' \G;
*************************** 1. row ***************************
table: item
type: range
possible_keys: urlcomponent
key: urlcomponent
key_len: 255
ref: NULL
rows: 3106
Extra: where used
1 row in set (0.00 sec)

Even though I only asked for stuff beginning with a, mysql knew to use the index.

Sean

ergophobe

7:23 pm on Mar 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think ironik means to say that MySQL does not allow you to declare relations that enforce referential integrity. That doesn't mean the DB is not relational, it means it is missing one feature commonly associated with relational DBs

SeanW

8:21 pm on Mar 4, 2005 (gmt 0)

10+ Year Member



Foreign keys are possible on InnoDB tables.

[dev.mysql.com...]

Sean

ergophobe

9:36 pm on Mar 4, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



So they are. I have to say that I've just stuck with the MyISAM tables. I've thought about changing, but haven't felt the need.