
Databases Forum

Simple mysql select query very slow
ferly_man - 7:22 pm on Sep 13, 2010 (gmt 0)

First time posting on here...

Okay, I was hoping to get some feedback on my specific MySQL query and, ultimately, how to speed it up. The query in question currently takes about 8 seconds on average to execute.

The table in question has only 702,750 records and no indexes at the moment (which I am hoping someone can help me with). The server configuration is as follows:

auto_increment_increment = 1
auto_increment_offset = 1
automatic_sp_privileges = ON
back_log = 50
basedir = /
binlog_cache_size = 32,768
bulk_insert_buffer_size = 8,388,608
character_set_client = utf8
character_set_connection = utf8
character_set_database = utf8
character_set_filesystem = binary
character_set_results = utf8
character_set_server = utf8
character_set_system = utf8
character_sets_dir = /usr/share/mysql/charsets/
collation_connection = utf8_general_ci (Global value: utf8_unicode_ci)
collation_database = utf8_unicode_ci
collation_server = utf8_unicode_ci
completion_type = 0
concurrent_insert = 1
connect_timeout = 20
datadir = /var/lib/mysql/
date_format = %Y-%m-%d
datetime_format = %Y-%m-%d %H:%i:%s
default_week_format = 0
delay_key_write = ON
delayed_insert_limit = 100
delayed_insert_timeout = 40
delayed_queue_size = 1,000
div_precision_increment = 4
keep_files_on_create = OFF
engine_condition_pushdown = OFF
expire_logs_days = 0
flush = OFF
flush_time = 0
ft_boolean_syntax = + -><()~*:""&|
ft_max_word_len = 84
ft_min_word_len = 4
ft_query_expansion_limit = 20
ft_stopword_file = (built-in)
group_concat_max_len = 1,024
have_archive = YES
have_bdb = NO
have_blackhole_engine = YES
have_compress = YES
have_community_features = NO
have_profiling = NO
have_crypt = YES
have_csv = YES
have_dynamic_loading = YES
have_example_engine = YES
have_federated_engine = YES
have_geometry = YES
have_innodb = YES
have_isam = NO
have_merge_engine = YES
have_ndbcluster = NO
have_openssl = NO
have_ssl = NO
have_query_cache = YES
have_raid = NO
have_rtree_keys = YES
have_symlink = YES
hostname = ******************
init_connect =
init_file =
init_slave =
innodb_additional_mem_pool_size = 1,048,576
innodb_autoextend_increment = 8
innodb_buffer_pool_awe_mem_mb = 0
innodb_buffer_pool_size = 8,388,608
innodb_checksums = ON
innodb_commit_concurrency = 0
innodb_concurrency_tickets = 500
innodb_data_file_path = ibdata1:10M:autoextend
innodb_data_home_dir =
innodb_adaptive_hash_index = ON
innodb_doublewrite = ON
innodb_fast_shutdown = 1
innodb_file_io_threads = 4
innodb_file_per_table = OFF
innodb_flush_log_at_trx_commit = 1
innodb_flush_method =
innodb_force_recovery = 0
innodb_lock_wait_timeout = 50
innodb_locks_unsafe_for_binlog = OFF
innodb_log_arch_dir =
innodb_log_archive = OFF
innodb_log_buffer_size = 1,048,576
innodb_log_file_size = 5,242,880
innodb_log_files_in_group = 2
innodb_log_group_home_dir = ./
innodb_max_dirty_pages_pct = 90
innodb_max_purge_lag = 0
innodb_mirrored_log_groups = 1
innodb_open_files = 300
innodb_rollback_on_timeout = OFF
innodb_support_xa = ON
innodb_sync_spin_loops = 20
innodb_table_locks = ON
innodb_thread_concurrency = 8
innodb_thread_sleep_delay = 10,000
innodb_use_legacy_cardinality_algorithm = ON
interactive_timeout = 10
join_buffer_size = 1,048,576
key_buffer_size = 104,857,600
key_cache_age_threshold = 300
key_cache_block_size = 1,024
key_cache_division_limit = 100
language = /usr/share/mysql/english/
large_files_support = ON
large_page_size = 0
large_pages = OFF
lc_time_names = en_US
license = GPL
local_infile = ON
locked_in_memory = OFF
log = OFF
log_bin = OFF
log_bin_trust_function_creators = OFF
log_error =
log_queries_not_using_indexes = OFF
log_slave_updates = OFF
log_slow_queries = OFF
log_warnings = 1
long_query_time = 10
low_priority_updates = OFF
lower_case_file_system = OFF
lower_case_table_names = 0
max_allowed_packet = 419,430,400
max_binlog_cache_size = 18446744073709547520
max_binlog_size = 1,073,741,824
max_connect_errors = 20
max_connections = 100
max_delayed_threads = 20
max_error_count = 64
max_heap_table_size = 33,554,432
max_insert_delayed_threads = 20
max_join_size = 18446744073709551615
max_length_for_sort_data = 1,024
max_prepared_stmt_count = 16,382
max_relay_log_size = 0
max_seeks_for_key = 18446744073709551615
max_sort_length = 1,024
max_sp_recursion_depth = 0
max_tmp_tables = 32
max_user_connections = 0
max_write_lock_count = 18446744073709551615
multi_range_count = 256
myisam_data_pointer_size = 6
myisam_max_sort_file_size = 9223372036853727232
myisam_mmap_size = 18446744073709551615
myisam_recover_options = OFF
myisam_repair_threads = 1
myisam_sort_buffer_size = 16,777,216
myisam_stats_method = nulls_unequal
net_buffer_length = 16,384
net_read_timeout = 30
net_retry_count = 10
net_write_timeout = 60
new = OFF
old_passwords = OFF
open_files_limit = 2,158
optimizer_prune_level = 1
optimizer_search_depth = 62
pid_file = /var/lib/mysql/***************.pid (website name omitted)
plugin_dir =
port = 3,306
preload_buffer_size = 32,768
protocol_version = 10
query_alloc_block_size = 8,192
query_cache_limit = 104,857,600
query_cache_min_res_unit = 4,096
query_cache_size = 104,857,600
query_cache_type = ON
query_cache_wlock_invalidate = OFF
query_prealloc_size = 8,192
range_alloc_block_size = 4,096
read_buffer_size = 4,194,304
read_only = OFF
read_rnd_buffer_size = 4,194,304
relay_log =
relay_log_index =
relay_log_info_file = relay-log.info
relay_log_purge = ON
relay_log_space_limit = 0
rpl_recovery_rank = 0
secure_auth = OFF
secure_file_priv =
server_id = 0
skip_external_locking = ON
skip_networking = OFF
skip_show_database = OFF
slave_compressed_protocol = OFF
slave_load_tmpdir = /tmp/
slave_net_timeout = 3,600
slave_skip_errors = OFF
slave_transaction_retries = 10
slow_launch_time = 2
socket = /var/lib/mysql/mysql.sock
sort_buffer_size = 4,194,304
sql_big_selects = ON
sql_mode =
sql_notes = ON
sql_warnings = OFF
ssl_ca =
ssl_capath =
ssl_cert =
ssl_cipher =
ssl_key =
storage_engine = MyISAM
sync_binlog = 0
sync_frm = ON
system_time_zone = CDT
table_cache = 1,024
table_lock_wait_timeout = 50
table_type = MyISAM
thread_cache_size = 64
thread_stack = 262,144
time_format = %H:%i:%s
time_zone = SYSTEM
timed_mutexes = OFF
tmp_table_size = 67,108,864
tmpdir = /tmp/
transaction_alloc_block_size = 8,192
transaction_prealloc_size = 4,096
tx_isolation = REPEATABLE-READ
updatable_views_with_limit = YES
version = 5.0.91-community
version_comment = MySQL Community Edition (GPL)
version_compile_machine = x86_64
version_compile_os = unknown-linux-gnu
wait_timeout = 28,800


My query is as follows:

select `ListingID`, `ListDate`, `Description`, `Status`, `City`, `State_Abbrev`,
       `region_1`, `region_2`, `Address`, `PostalCode`, `Latitude`, `Longitude`,
       `ListPrice`, `PropertyType`, `Bedrooms`, `Bathrooms`, `main_img`
from `listings`
where `Status` = 'For Sale'
order by `ListDate` desc


Result: Showing rows 0 - 29 (623,177 total, Query took 7.4942 sec)

Table `listings` structure:

Field | Type | Null | Key | Default | Extra
ListingID | bigint(15) | NO | PRI | NULL | auto_increment
RegionalMLSNumber | varchar(25) | NO | | NULL |
ListDate | char(20) | NO | | NULL |
LastUpdateDate | char(20) | NO | | NULL |
Status | varchar(25) | NO | MUL | NULL |
Title | varchar(255) | NO | MUL | NULL |
Description | varchar(1000) | NO | | NULL |
DetailViewUrl | varchar(255) | NO | | NULL |
VirtualTourUrl | varchar(255) | NO | | NULL |
City | varchar(100) | NO | | NULL |
State | varchar(100) | NO | | NULL |
State_Abbrev | varchar(5) | NO | | NULL |
region_0 | varchar(50) | NO | MUL | NULL |
region_1 | varchar(50) | NO | | NULL |
region_2 | varchar(50) | NO | | NULL |
region_3 | varchar(50) | NO | | NULL |
region_4 | varchar(50) | NO | | NULL |
region_5 | varchar(50) | NO | | NULL |
region_6 | varchar(50) | NO | | NULL |
region_7 | varchar(50) | NO | | NULL |
region_8 | varchar(50) | NO | | NULL |
region_9 | varchar(50) | NO | | NULL |
region_10 | varchar(50) | NO | | NULL |
region_11 | varchar(50) | NO | | NULL |
region_12 | varchar(50) | NO | | NULL |
region_13 | varchar(50) | NO | | NULL |
region_14 | varchar(50) | NO | | NULL |
region_0_slug | varchar(50) | NO | | NULL |
region_1_slug | varchar(50) | NO | | NULL |
region_2_slug | varchar(50) | NO | | NULL |
region_3_slug | varchar(50) | NO | | NULL |
region_4_slug | varchar(50) | NO | | NULL |
region_5_slug | varchar(50) | NO | | NULL |
region_6_slug | varchar(50) | NO | | NULL |
region_7_slug | varchar(50) | NO | | NULL |
region_8_slug | varchar(50) | NO | | NULL |
region_9_slug | varchar(50) | NO | | NULL |
region_10_slug | varchar(50) | NO | | NULL |
region_11_slug | varchar(50) | NO | | NULL |
region_12_slug | varchar(50) | NO | | NULL |
region_13_slug | varchar(50) | NO | | NULL |
region_14_slug | varchar(50) | NO | | NULL |
Address | varchar(75) | NO | | NULL |
PostalCode | varchar(50) | NO | | NULL |
Latitude | varchar(50) | NO | | NULL |
Longitude | varchar(50) | NO | | NULL |
ListPrice | int(10) | NO | | NULL |
TaxAmount | varchar(15) | NO | | NULL |
PropertyType | varchar(75) | NO | | NULL |
Style | varchar(75) | NO | | NULL |
LivingArea | varchar(30) | NO | | NULL |
AssociationFee | varchar(15) | NO | | NULL |
DevelopmentLevel | varchar(50) | NO | | NULL |
YearBuilt | smallint(4) | NO | | NULL |
Bedrooms | tinyint(2) | NO | | NULL |
Bathrooms | tinyint(2) | NO | | NULL |
main_img | varchar(255) | NO | | NULL |
main_img_thumb | varchar(255) | NO | | NULL |
captions | text | NO | | NULL |
images | text | NO | | NULL |
thumbs | text | NO | | NULL |
highlights | text | NO | | NULL |
Email | varchar(50) | NO | | NULL |
WebBugUrl | varchar(255) | NO | | NULL |
WebBugScript | text | NO | | NULL |
listing_ok | char(10) | NO | | NULL |
AgencyImg | varchar(255) | NO | | NULL |


I have tried changing some things up, and just when it seems to help, I get the odd query that takes 15 seconds or so. I know there are sites with millions of records whose queries run in a fraction of the time mine do.

Any help would be greatly appreciated.

bhukkel - 7:54 pm on Sep 13, 2010 (gmt 0)

I would change the status field to an integer and do some coding like 1=for sale, 2=sold...

Create an index on status and listdate.

With the EXPLAIN statement you can see which indexes are used to execute the query; see [dev.mysql.com ]
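For example, something along these lines (the index name here is just for illustration):

ALTER TABLE `listings` ADD INDEX `idx_status_listdate` (`Status`, `ListDate`);

-- then check the plan; with the composite index MySQL can filter on Status
-- and read rows already ordered by ListDate, avoiding the filesort
EXPLAIN SELECT `ListingID`, `ListDate`, `Status`
FROM `listings`
WHERE `Status` = 'For Sale'
ORDER BY `ListDate` DESC;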

ferly_man - 7:59 pm on Sep 13, 2010 (gmt 0)

Thanks for the reply. I am implementing this now.

rocknbil - 3:41 pm on Sep 14, 2010 (gmt 0)

Correct, numeric lookups will **always** be faster, but when dealing with large record sets you need to normalize your database tables.

The problem, or at least a large part of it, is your tables aren't normalized. Look at this section:

region_0 | varchar(50) | NO | MUL | NULL
region_1 | varchar(50) | NO | | NULL
region_2 | varchar(50) | NO | | NULL
region_3 | varchar(50) | NO | | NULL
region_4 | varchar(50) | NO | | NULL
region_5 | varchar(50) | NO | | NULL
region_6 | varchar(50) | NO | | NULL
region_7 | varchar(50) | NO | | NULL
region_8 | varchar(50) | NO | | NULL
region_9 | varchar(50) | NO | | NULL
region_10 | varchar(50) | NO | | NULL
region_11 | varchar(50) | NO | | NULL
region_12 | varchar(50) | NO | | NULL
region_13 | varchar(50) | NO | | NULL
region_14 | varchar(50) | NO | | NULL
region_0_slug | varchar(50) | NO | | NULL
region_1_slug | varchar(50) | NO | | NULL
region_2_slug | varchar(50) | NO | | NULL
region_3_slug | varchar(50) | NO | | NULL
region_4_slug | varchar(50) | NO | | NULL
region_5_slug | varchar(50) | NO | | NULL
region_6_slug | varchar(50) | NO | | NULL
region_7_slug | varchar(50) | NO | | NULL
region_8_slug | varchar(50) | NO | | NULL
region_9_slug | varchar(50) | NO | | NULL
region_10_slug | varchar(50) | NO | | NULL
region_11_slug | varchar(50) | NO | | NULL
region_12_slug | varchar(50) | NO | | NULL
region_13_slug | varchar(50) | NO | | NULL
region_14_slug | varchar(50) | NO | | NULL

Wow. :-) First, does every record have all these values filled in, or are a lot of them empty? Will many records have the exact same value in the same fields? What I'm getting at is that this table can be made much narrower by splitting this off into two other tables and joining on them only when needed.

Listings:

Field Type Null Key Default Extra
ListingID bigint(15) NO PRI NULL auto_increment
(other fields...)
State_Abbrev varchar(5) NO NULL
(Region fields removed ...)
(other fields)

listingRegions:
id int(11) |ListingID bigint(15) | regionID int(11)

So now you have a long list of numeric values to query. Numeric queries will always be far faster than text queries.

Next you have the actual regions table.

Regions:

id int(11) | regionName varchar(50) | region_slug varchar(50)

(or maybe regionName and region_slug are the same thing, in which case only one is needed; better yet.)

A typical select might be

SELECT Listings.*, Regions.regionName, Regions.region_slug
FROM Listings, listingRegions, Regions
WHERE Listings.ListingID = listingRegions.ListingID
AND listingRegions.regionID = Regions.id

The advantages?

- You will only have as many region records in listingRegions as you need for each listing; no overhead.
- The number of regions for each listing is unlimited, but can be as few as one. You are no longer limited to 14 regions.
- Your Regions table can grow as you need it; it can have five regions or five million.
- When you build your select lists in forms for regions, you no longer have to edit select lists; you just pull the values dynamically from the Regions table (see the one-liner below).
- Caveat: Currently, when someone is adding a listing, do they enter a textual value or select from a list? If they are entering text values, this plays hell on your searches: UperWest Side, Upper West Side, and upper westside will all be different values and will never match in a user-entered search. The previous approach will eliminate that, and it will be faster. Instead of

.... where region='Upper West Side'

you have

.... where region=1234

Again eliminating a textual query. You don't even need to use the join in a user search - just query the value, then query the region table when they click for details.
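And building that select list from the Regions table (the point above) is then trivial; roughly, using the sketch above:

SELECT id, regionName FROM Regions ORDER BY regionName;

-- each row becomes one <option value="id">regionName</option>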

Anywhere you can eliminate a text value for a numeric value, you should. Another example:

State_Abbrev varchar(5) NO NULL

If these are U.S. states, you only need two characters - another optimization, make your fields only as wide as you need them. Better yet, build a states table

id int(11) |abbrev char(2) |full_name varchar(20)

then in your listings table,

State int(3) NO NULL

And you have another join - PLUS - you have the choice of selecting the abbreviated or full name depending on the context. Same is true of countries, if you use them. This makes your queries more complex, but as your database grows you will be glad you did.
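A lookup against that states table might look something like this (column names as sketched above):

SELECT Listings.ListingID, states.abbrev, states.full_name
FROM Listings
JOIN states ON states.id = Listings.State
WHERE states.abbrev = 'CA';

-- show abbrev where space is tight, full_name in page headings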

Try a search in Google for normalization site:webmasterworld.com for many more helpful threads around here.

ferly_man - 4:11 pm on Sep 14, 2010 (gmt 0)

Wow, I really appreciate you taking the time to lend a hand!

The `region_0`, etc., fields do not all have values. Typically, up to about `region_5` or so will contain a value on average. To clarify the difference between `region_x` and `region_x_slug`:

I do my queries against `region_x_slug`, as it is a stripped version of `region_x`, which is the "pretty/display" version, e.g. `region_x` might equal 'Beverly Hills', where `region_x_slug` would then equal 'beverly-hills'.

I like the idea of casting regions to integers for the sake of querying. However, I am all-in on SEO in terms of displaying the region name in the URL, i.e. /california/beverly-hills/ as opposed to /?region=1234, for obvious reasons; otherwise, a numeric value system would have been at the forefront of my mind when I began.

As usual though, things start out really small and escalate very, very quickly, and I'll be the first to admit that I was not fully prepared for the growth.

Again, my biggest struggle when I began with this table was the inconsistency of the data. For starters, I currently receive 99% of my entries from an external feed/source. Not all listings have many regions, so I figured I'd just try to cover the maximum number of regions that might ever be available... big mistake, I see.

Question:

"Currently when someone is adding a listing, do they enter a textual value or select from a list?"

Answer:

They are only able to select from a drop-down, pre-populated list generated from the db. My thoughts were exactly yours, in that allowing users to enter their own information would only lead to thousands of discrepancies.

I am going to go through everything you've said again and see if I can implement your recommendations without blowing things up. The database will only grow from here on out, so your recommendations are pretty much a must-do.

Thanks again for your time and help!

I'll follow-up shortly with my progress.

rocknbil - 4:03 pm on Sep 16, 2010 (gmt 0)

They are only able to select from a drop-down, pre-populated list generated from the db.


Good, now you'll have

<option value="1234">Beverly Hills</option>

instead of

<option value="Beverly Hills">Beverly Hills</option>

What if their region isn't listed? You let them build the list for you by offering an "other region" field which you can *temporarily* store in a table somewhere, associated by the listing ID. When it comes up, if these are moderated (which I'm guessing they should be), you can add the new region to your Regions table and update their record.
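A rough sketch of that temporary holding table (the names here are just illustrative):

CREATE TABLE pendingRegions (
    id         INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ListingID  BIGINT(15) NOT NULL,  -- the listing it was submitted with
    regionText VARCHAR(50) NOT NULL  -- the raw "other region" text the user entered
);

Once a moderator approves an entry, add it to Regions, point the listing's listingRegions row at the new id, and delete it from here.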

ferly_man - 4:28 pm on Sep 16, 2010 (gmt 0)

^precisely.

There is a good, scratch that, great chance that I do not have *every* city/region/location on the map, so I do offer users the opportunity to submit a location for review. I then manually verify that location and input it into the db if it's legit.

I made the mistake several years ago of allowing users to enter their own city by way of a text field. I ended up with 20 different variations of each city. It was a nightmare.

In the meantime (while working locally on implementing a new structure), I put a few indexes in place and the system seems to be working much better.

My only issue with the proposed structure is: how does the system know, based on the location in the URL (kept there for SEO reasons), where to draw listings from? For example...

url: /united-states/california/los-angeles-county/los-angeles/

That is an example URL. Currently, I query the db based on the last location in the URL (los-angeles). I use a PHP function that breaks the URL apart to give me an array of all the locations in it, with los-angeles, in this case, being the last entry in the array. It is a very simple way of querying the db; (... WHERE `region_x_slug` = 'los-angeles' ...) is an example WHERE clause.

How would I now, without changing the URL structure whatsoever, work with a new db structure consisting of numeric values only (as I would no longer be querying based on text values)?

I hope all that makes sense ;)

Thanks for all the replies!

LifeinAsia - 4:32 pm on Sep 16, 2010 (gmt 0)

Search your table of cities based on the last part of the URL to get the CityID, then search your listings from there.
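In SQL terms, assuming the Regions/listingRegions sketch from earlier in the thread, roughly:

-- step 1: resolve the slug at the end of the URL to its numeric id
SELECT id FROM Regions WHERE region_slug = 'los-angeles';

-- step 2: fetch listings by that id (say step 1 returned 1234)
SELECT Listings.*
FROM Listings
JOIN listingRegions ON listingRegions.ListingID = Listings.ListingID
WHERE listingRegions.regionID = 1234
ORDER BY Listings.ListDate DESC;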

ferly_man - 4:39 pm on Sep 16, 2010 (gmt 0)

I will have a go at it. Thank you.

Back to the original post, does anybody see any issues with the setup of the mysql configuration?

The server specs:

CPU = 5.09 GHz
RAM = 3801 MB

in case those are relevant.

Thanks again, everybody.
