|get count and results?|
mysql, php, row count
| 1:49 am on Dec 16, 2009 (gmt 0)|
I'm working on a web app that was working fine with few rows, but is now falling apart with a lot of data.
The main problem is that the developer wrote it so that he executes a large and gnarly multi-joined query against a table once, grabbing just the count of the rows in the query-- then executes it again with a limit, so he can pull in the paginated results to the app. This works fine with a couple hundred rows in the table, but with a couple hundred thousand or a million it fails. Optimizing the query has helped, but you're still basically running the big query twice every time.
I was thinking of using something like the php mysql_num_rows function, but obviously this won't work if there's a limit set (or it will work, but it will only show the number of rows returned, which will be equal to the limit, not the actual number of row the "naked" query would have returned).
Is there any technique or function I may not be aware of that will work in this situation? maybe create a function on the DB side that returns the count or something?
Thanks for any thoughts!
| 8:58 pm on Dec 16, 2009 (gmt 0)|
|I'm working on a web app that was working fine with few rows, but is now falling apart with a lot of data. |
Heheh, me too. Except I'm the developer, so I'm always yelling at myself. ;)
|Optimizing the query has helped, but you're still basically running the big query twice every time. |
This is a pretty common problem. The only way I know to deal with these is through query optimization and then aggressive caching of complicated join results.
In a recent case, the joins were against a few tables each with >1 million rows. Caching is the only way to really improve performance, but the first user with any particular query still gets the hit.
A possible alternative is to simply always cap the results at something reasonable (perhaps 1,000, depending on your needs) and limit the COUNT() query against that. If it's ever over 1,000 then just tell the user "only the first 1,000 results are shown." This may also help to prevent runaway spiders from deep crawling millions of paginated result pages.
| 9:23 pm on Dec 16, 2009 (gmt 0)|
Hi Gregg-- thanks for the input. I think you're right with both the caching and the hard limit on results. I think that might be the route I go, since no one is likely to page through 5000+ results manually without refining their search!