Looking for some 3rd party input on a performance issue that has crept up with our website recently. I'm afraid I'd rather not divulge our website address at the current time, but I've included quite a few facts below that I'm hoping will trigger some ideas we haven't thought of yet. Here's the scoop:
- The site's hosted in a popular Texas based Cloud server. (Shared hosting, shared database) (for the purpose of this post I will ******** the name company name of the server.)
- The site is built in PHP (using Smarty templates) & MySQL.
- In early October, Google's "time spent downloading" in Google Webmaster tools has been increasing steadily. Previously, the site was behaving fairly consistently for roughly 10 months. Lately, some of our products in Google's merchant center have been "delisted" because Google timed out getting to them as well.
- No programming changes were made in the weeks/months prior to the slowdown.
- Another symptom we experience is an occasional "lag", whereby the page will half-load, sit there for 15+ seconds and then continue to load. It tends to happen every 15-20 page loads.
- "Lagged" pages always take about 30 seconds to load - regardless of the type of page being looked at. (Typical page loads are 5 seconds for category & product pages, 12 seconds for homepage.)
- Lag always starts at the same spot. Timewise, it's roughly equivalent to where the page would normally be done if watching the page load progress with a monitoring tool. Location-wise, it's at about 1/4 of the page-load. After the lag, the remainder of the page loads very quickly.
- We did quite a lot of back/forth with ********* since nothing had changed prior to the performance degradation so the server(s) seemed to be the likely culprit. They indicated no issues with the web or database servers - including providing stats such as highest usage % and wait i/o during the day which were negligible.
- Because of the way it was originally built, each page load does *a lot* of queries. Hundreds in some cases. They're simple queries, but a ton of them. Because of this, we always start by assuming it's a "too many queries" or "locking" problems.
- We setup a dedicated database server to ensure that we have *all* the power of the database server to ourselves, but performs about the same and has the same lag issues.
We've done the following with regards to database queries & indexing:
- All tables are InnoDB to mitigate potential table locking.
- MySQL slow query log (with 10s timeout) on the new server does not show any slow queries. (But it does experience the 15+ second lags)
- MySQL slow query log on the shared server shows a few slow queries (eg. 12 seconds), but oddly not those that would be expected to be slow. (eg. A query that pulls 1 record from a table with only 1 record, or a query that pulls a record from a table of 5 records based on its primary key). There does not appear to be any lock-time associated with any slow queries.
- The vast majority of queries are well optimized (including indexes that include 'order by' criteria in the indexes to avoid filesorts) and those that are complex and slower occur only on one or two pages that aren't necessarily those that indicate locking behavior.
We're currently looking into non-database-specific, inefficient code, but the site's been running well for ages so it seems strange that a programming issue would pop up out of the blue.
Any thoughts as to where we might've left a stone unturned?