Forum Moderators: open

Message Too Old, No Replies

Temp Tables vs Consecutive Selects

         

Frank_Rizzo

10:39 pm on Mar 3, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I use temp tables to temporarily hold data whilst calculations are performed on it using php.

"create temporary table $temp_table type=heap select * from bigtable where $where_clause";

This can extract, say, 20,000 records out of 1,000,000 and thus it is now possible to perform calcs on this smaller dataset:

"select min(xyzdate),max(xyzdate) from $temp_table"

code ...

"select * from $temp_table";

code ...

"select count(distinct state) from $temp_table");

code ...

"select sum(if(abc=1,1,0)),count(*),sum(1/(nnn+1)),sum(if(abc<>1 and hhh=1,xyz,0)) from $temp_table where widget = 'blue'"

code ...

This works great and is quite fast but one draw back is that a lot of temporary tables are created (in heap not disk). This has not been a problem so far but the table is increasing in size and the size of the heap files is increasing.

I have used this method for a few years but I wondering if newer techniques in 5 or 6 would be better?

Just to summarise:

1. Need to extract small dataset from 1,000,000 records. Dataset can be from 10 to, say, 100,000 records in size.

2. Different selects need to be performed on just that dataset. There are anything from 20 to 40 selects which are performed.

What is not efficient is this:

"select min(xyzdate),max(xyzdate) from bigtable where $where_clause"

code ...

"select * from bigtable where $where_clause"

code ...

"select count(distinct state) from bigtable where $where_clause"

code ...

"select sum(if(abc=1,1,0)),count(*),sum(1/(nnn+1)),sum(if(abc<>1 and hhh=1,xyz,0)) from bigtable where $where_clause and widget='blue'"

code ...

Frank_Rizzo

1:30 pm on Mar 4, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hmm. I just did a few tests and it seems that the repeating "from bigtable where $where_clause" commands is not as slow as I had thought it was.

I can now run scripts the same or faster than the temp table method.

The use of temp tables is something I have done for years - started with early mysql4. I guess since then mysql is so much better at caching.