Increase PHP memory or do multiple MySQL queries?

csdude55

9:11 pm on Mar 17, 2020 (gmt 0)

My server is pretty much red-lining all the time, and I'm pushing off upgrading for as long as I can. With ad revenue getting lower and lower every month, I honestly just can't afford it.

I have one section that runs two MySQL queries on every page of the section. The first query does a COUNT to find the total results for the section (which is around 200,000), and the second query is the same query but, instead of a COUNT, it actually selects the columns. I do the second query in increments of 20, though, so the result is always 20 rows.

As I'm transitioning to infinite scroll, though, I have the potential of doing a lot more queries in a short period, so I'm trying to cut it all down and save as much stress on the server as I can. I decided to do a single query that returns all 200,000 rows and pushes them into an array, so the page would only do a single query regardless of how many pages they scroll through; e.g.:

$arr = mysqli_fetch_all($sth);
$count = count($arr);

But now I'm getting a fatal PHP error that I'm running out of memory on the mysqli_fetch_all() line:

PHP Fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 72 bytes) in ...

The bottleneck isn't the query itself; phpMyAdmin shows the query completes in 0.0032s. But even though it's only returning 200,000 rows, they combine to about 52MB, and I guess that's a lot :-(

** I'm confused that I'm exhausting the 64M memory_limit (67,108,864 bytes), though, when the tables themselves aren't that big. My query does a single LEFT JOIN; tableA is 41.6MB, and tableB is 11MB. The query looks like:

SELECT tableA.id, tableB.which, updated, subject, firstname, lastname, size, intro
FROM tableA
LEFT JOIN tableB
ON tableA.id = tableB.id
WHERE (tableB.which = '%s' OR tableB.which = '%s')
AND tableA.id > 12345
ORDER BY tableA.updated DESC

The results of EXPLAIN:

TABLE | TYPE | POSSIBLE_KEYS | KEY | KEY_LEN | REF | ROWS | EXTRA
tableA | ALL | PRIMARY | NULL | NULL | NULL | 182746 | Using where; Using filesort
tableB | ref | Unique,which | Unique | 3 | myDB.tableA.id | 1932 | Using where; Using index

I don't know of any way to make the query "better", especially since it returns in 0.0032s. But I'm confused about the memory size, too, so if you guys and gals see something that I'm missing... ?

Otherwise, I guess my options here are either to change the PHP configuration to allow more memory, or to go back to the original style with multiple MySQL queries on the page.

Which would you do?

robzilla

9:48 pm on Mar 17, 2020 (gmt 0)

Fetching the full set of records with each request seems highly inefficient. 200,000 rows is a lot of data, and most of it will be wasted.

The queries don't look particularly complicated or personalized. Why don't you employ a memory cache? (I sometimes use simple txt or json files too). You could easily keep a count of the table rows in a simple key-value pair, and update that whenever a row is added or removed. That's one query down. The other query could also be cacheable, but it depends on how often the data changes (i.e. how often the cache would be purged). Worth investigating, I'd say.
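
For instance, something along these lines for the count, assuming the APCu extension is available ($link is your mysqli connection, and the key name is just a placeholder):

// Keep the row count in APCu so most requests skip the COUNT(*) query
$count = apcu_fetch('section_row_count', $hit);

if (!$hit) {
    $res = mysqli_query($link, "SELECT COUNT(*) FROM tableA");
    $count = (int) mysqli_fetch_row($res)[0];
    apcu_store('section_row_count', $count);
}

// When a row is added or removed, keep the counter in sync:
// apcu_inc('section_row_count') / apcu_dec('section_row_count')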

csdude55

10:15 pm on Mar 17, 2020 (gmt 0)

As you can probably tell from the columns I'm selecting, this is for a pretty active message board. The actual quantity shown can change based on the variable sent to tableB.which, and I also use the same page for search results and "favorites". I could set up a text file for each possible value of tableB.which (there are about 100 possibilities), but I'd still have to do a query to count the other two cases. And I'm not sure whether it's faster for the system to open, read, and close a text file than to perform a second query.

I could realistically code a query for the count, and then limit the results query to, say, 100. I suspect that MOST people would only look at a few pages worth at a time, so 100 results would be enough for the majority of people. Then I could just have future queries in increments of 100. But that's still 2 queries for each page load, so it doesn't really help my immediate overload problem.

Either way, I think you're saying that reading all of the data and pushing it into a single array is not a good option, no matter how much memory I allocate?

csdude55

10:29 pm on Mar 17, 2020 (gmt 0)

I don't suppose there's a way in PHP to get a specific number of results of a query? Like, only read rows 0-19, or 60-79, without pushing the whole set to an array?

Dimitri

10:52 pm on Mar 17, 2020 (gmt 0)

I'm confused about the memory size, too

It's not just the size of the data, but the amount of memory PHP needs to store it (PHP has to keep extra bookkeeping information about each row and each field).

Like, only read rows 0-19, or 60-79

I guess I am misunderstanding the question. But just in case: in an SQL query, you can use LIMIT to select a range.
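
For example, tacking this onto the end of the query you posted earlier would return only rows 60-79 (the first number is the offset, the second is the row count):

ORDER BY tableA.updated DESC
LIMIT 60, 20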

csdude55

11:02 pm on Mar 17, 2020 (gmt 0)

I guess I am misunderstanding the question. But just in case: in an SQL query, you can use LIMIT to select a range.

Since the error is at mysqli_fetch_all() and the query itself completes pretty quickly, I was thinking... instead of pushing all 200,000 rows to an array when 199,900 of them are ignored, would there be a PHP way to take the result of 200,000 and only push 100 to an array?

So still do 1 query and get all of the results with no LIMIT, but only push X number to the array.

Something like:

$arr = [];
$i = 0;

while ($row = mysqli_fetch_row($sth)) {
    $arr[] = $row;
    $i++;
    if ($i == 100) break;
}

// mysqli_num_rows() still reports the full count of the buffered result
$count = mysqli_num_rows($sth);

Well, I guess that would actually work, unless there's a built-in PHP command to do it better.

And if I did it that way, would it use less memory and/or be a better option?
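
Edit: it looks like mysqli_data_seek() might actually be the built-in command I was asking about, at least for jumping to an arbitrary range like 60-79. Something like this, if I'm reading the docs right (it only works on a buffered result, though, so all 200,000 rows would still sit in memory):

// Jump straight to row 60 of the buffered result, then read 20 rows
mysqli_data_seek($sth, 60);

$arr = [];
for ($i = 0; $i < 20 && ($row = mysqli_fetch_row($sth)); $i++) {
    $arr[] = $row;
}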

robzilla

11:34 pm on Mar 17, 2020 (gmt 0)

As soon as you run mysqli_query(), the 200,000 rows are put into memory for the PHP worker. There's no getting around that. Maybe you can handle that if you have ample memory available on the server, but on a busy site it's still an awful lot of data traveling around, which could put additional strain on your CPU and hard drive. If the server is already red-lining, this could potentially choke it.

Have you confirmed that these simple queries are, in fact, significantly contributing to server load? Is this the lowest-hanging fruit?

csdude55

11:55 pm on Mar 17, 2020 (gmt 0)

It's really just 1 issue out of a million, honestly, so I'm chipping away at whatever I can.

As of 7:48pm EST, my server load is at 2.27. I have two CPUs, so that's slightly overloaded, but it gets up to 5 and 6 pretty regularly. And right now I see that MySQL is using 38% of the CPU and 61.91% of the memory. I optimize MySQL every so often, but it's still always the biggest resource hog.

I've set up a slow log, but it's pretty rare that any query lands there. The slow queries that do show up usually come from a different section that I haven't started rebuilding yet (it's gonna be super complicated), so I don't think this is the MAIN problem. It's just somewhere I was hoping I could save a query while I'm working on it.

robzilla

12:12 am on Mar 18, 2020 (gmt 0)

Xdebug can tell you exactly what's happening and where; it's a lifesaver. Or try a third-party solution like Blackfire.io (free trial; no affiliation).

Alternatively, for a rougher view of performance, pull up the Site Speed stats in Google Analytics, if that's what you use, and have a look at the Avg. Server Response Time for your most popular pages. If any jump out, that's a good place to start investigating.

Skipping MySQL altogether with a memory cache sometimes works wonders for me. I primarily use APCu.

If MySQL is using only 38% of CPU, how can it be the biggest resource hog? :-) But that's assuming this is a basic LAMP/LEMP stack.

brotherhood of LAN

1:42 am on Mar 18, 2020 (gmt 0)

The bottleneck isn't the query itself; phpMyAdmin shows the query completes in 0.0032s.

That sounds like the results were cached, hence the very quick return time.

** I'm confused that I'm exhausting the 64M memory_limit (67,108,864 bytes), though, when the tables themselves aren't that big. My query does a single LEFT JOIN; tableA is 41.6MB, and tableB is 11MB.

PHP is giving you the error, and the data returned is in string form rather than the compact binary form that MySQL stores it in, so the data set held by PHP will be larger than the tables themselves.

Have a look at MYSQLI_USE_RESULT. If you're not using it, your entire 200,000 rows get buffered into PHP's memory, as robzilla said. If you use it, you hold more or less one result row at a time. It's not clear whether you're able to use it for your needs, though.
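
Roughly like this ($link and $sql standing in for your own connection and query):

// Unbuffered query: rows are streamed from MySQL one at a time instead of
// being copied into PHP memory up front
$sth = mysqli_query($link, $sql, MYSQLI_USE_RESULT);

while ($row = mysqli_fetch_assoc($sth)) {
    // process one row at a time
}

// Caveats: mysqli_num_rows() won't give you the total until every row has
// been read, and the connection can't run another query until the result
// is fully fetched or freed
mysqli_free_result($sth);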

lammert

2:04 pm on Mar 18, 2020 (gmt 0)

As far as I know, MYSQLI_USE_RESULT will solve the memory problem, but it blocks subsequent queries until you have fetched all of the records in a PHP loop. That loop will also be less efficient at fetching the data than the internal compiled loop in the PHP core, so you end up spending more CPU cycles for the same output compared with the default MYSQLI_STORE_RESULT.

tangor

5:49 pm on Mar 18, 2020 (gmt 0)

ALL is ... er ... ALL and that will create a load.

Are you sure you don't have other query options to develop the results desired?

At some point ALL is actually TOO MUCH.

(break by Date or Item or or or or ...)

Refine it before the request. Your CPU and HD will thank you.

csdude55

7:47 pm on Mar 18, 2020 (gmt 0)

I tried a few variations using MYSQLI_USE_RESULT, but nothing seemed to solve my problem :-( I finally ended up close to where I started... I'm doing one COUNT query, and then inside the infinite scroll I'm doing a query that's limited to 100 results. My infinite scroll iterates in batches of 20, so this works out to two queries on the first load and then one query every 5th page load/scroll.

So no real improvement from the original, but at least the number of queries from the infinite scroll shouldn't be any worse (famous last words...).

brotherhood of LAN

7:28 am on Mar 19, 2020 (gmt 0)

What kind of param are you passing from client to server when infinite scroll is invoked?

The less optimal way is ?page=1, ?page=2, etc., with your MySQL query taking that number and multiplying it by 20. As you get further down the list you're doing something like LIMIT 400,20, so MySQL has to read 400 rows and discard them just to reach the 20 it returns.

If the results are coming out in an order that is already a key/index, it's better to use that and pass it to the client side, e.g. primary key (descending?) order. You send the last id in the list to the client, which passes it back when infinite scroll fires: id < 19980 ORDER BY id DESC LIMIT 20 for the 1st scroll, id < 19960 ORDER BY id DESC LIMIT 20 for the 2nd, etc. This should be very quick.
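
A rough sketch of what that could look like in PHP (the "after" parameter and the column list are just placeholders):

// Keyset pagination: filter on the last-seen id instead of a page offset,
// so MySQL can walk the index directly to the next 20 rows
$lastId = isset($_GET['after']) ? (int) $_GET['after'] : PHP_INT_MAX;

$stmt = mysqli_prepare($link,
    "SELECT id, updated, subject FROM tableA WHERE id < ? ORDER BY id DESC LIMIT 20");
mysqli_stmt_bind_param($stmt, 'i', $lastId);
mysqli_stmt_execute($stmt);
$result = mysqli_stmt_get_result($stmt);

while ($row = mysqli_fetch_assoc($result)) {
    $lastId = $row['id']; // remember the last id sent to the client
    // ... render the row ...
}
// The client asks for the next batch with ?after=<last id shown>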

Also consider if your initial COUNT(*) query is essential on every page load. You could do it once every 10 minutes and write it to file. If you access it often it'll likely be cached in memory most of the time. Just some ideas.
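
E.g. something like this, very roughly (the file path and the 10-minute TTL are arbitrary):

// Serve the COUNT(*) from a small cache file, refreshing it every 10 minutes
$cacheFile = '/tmp/section_count.txt';

if (is_file($cacheFile) && time() - filemtime($cacheFile) < 600) {
    $count = (int) file_get_contents($cacheFile);
} else {
    $res = mysqli_query($link, "SELECT COUNT(*) FROM tableA");
    $count = (int) mysqli_fetch_row($res)[0];
    file_put_contents($cacheFile, (string) $count, LOCK_EX);
}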

It's also worth thinking about how many possible result sets your query can produce, how they fit into MySQL's query cache, whether MySQL will cache the results at all, and whether updates would continually invalidate the cache.

csdude55

5:02 pm on Mar 19, 2020 (gmt 0)

That's a great idea, @brotherhood. I'm currently sending a page number, but there's no reason I can't send an ID or timestamp instead. That would probably make search engines happier, too. I'm not 100% sure that they'll pick up on /link/?page=2, but assuming they do, the content at that URL would always be changing, whereas /link/?timestamp=20200319124314 wouldn't change as much.

Thanks for the tip :-)

Also consider if your initial COUNT(*) query is essential on every page load. You could do it once every 10 minutes and write it to file. If you access it often it'll likely be cached in memory most of the time.

Good thought, but I'm afraid it wouldn't work out so well for me :-( I have 8 checkbox filters that the user can apply, so I think that's around 40,000 possible combinations... and I might add more filters :-O And it would have to be updated every time someone adds a new thread (which is sometimes several times per minute), so I think it would end up being more trouble than it's worth.