Welcome to WebmasterWorld Guest from

Forum Moderators: open

Message Too Old, No Replies

Selecting data from one table that's not in another



2:14 pm on Feb 7, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member

Hi there Guru's,

I posted this same topic last year (almost to the date!) First try! [webmasterworld.com]

I've got an issue that's driving me mad, this code:-

Select serialnumber from
where id_number not in
(SELECT serialnumber FROM newbuild

Works fine on my mock table that's got about 20 rows in it (trialing the code first), then when I try it on my test server which compares the data of one table (serial, approx 6000 rows), to the other (newbuild, approx 10000 rows).

This takes in excess of 1 min. 50 secs to complete, and constantly returns ALL rows from the serial table.

I am stumped; the two rows being evaluated are both primary keys in their repective tables and are both ints.

I've checked the collation of each table and made sure that they are identical in all structure.

Why would it work on 20 rows with a 4 row check where 1 row isn't in the 20 row table!

Is there a more efficient way of checking for data present in on table that isn't in another?

Failing this; I will re design the database to be more fluent and less clunky! I've inherited this system from someone who wasn't that bothered about longevity of task.



8:46 am on Feb 8, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

Have you tried EXPLAIN?

Does using a JOIN as suggested in the previous thread make a difference?


8:38 am on Feb 9, 2014 (gmt 0)

Never use not in, it's terrible. Use an outer join and test for null instead.


2:12 pm on Feb 19, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member

Hi there Guy's,

After MUCH deliberation and the help of going through each row manualy myself I have figured out a potential reason for why this is so slow. Yes the code works well, does what it is meant to (ameliorating my last "returns all rows" comment, it returns exactly what it should, just my interpretation of data was fuddled!), but it seems that there are ALOT more records NOT IN than are in. I was expecting there to be IRO 50-100 records difference between the two tables, but it seems that the ratio is more like 5:1 which is why (I think) there is a huge delay whilst this list is compiled.

If this logic isn't correct, then I suspect that I need to change the query and seeing as "JOIN" is being suggested, I need to try this again..

@DaveWave: I couldn't get JOIN to work, so I gave up for other methods...

@Graemw_p: I'm not sure about explain, could you give me some context on that please? I'm knee deep in a C project at the moment and this has taken a back seat for a few days, but I can't help but still ponder it!

Again, thanks for the advice.



10:10 am on Feb 23, 2014 (gmt 0)

Try this:
Select s.serialnumber from
serial s
Left outer Join newbuild n on n.serialnumber =s.id_number
where id_number is null

Not sure why you have serialnumber in table Serial and you are testing for id_number in table newbuilds, bit confusuing to me but I've just taken your query and converted it to an outer join.


10:17 am on Feb 23, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

In general EXPLAIN will show you exactly what a query does. For example it will show if a subquery is executing just once of repeatedly, whether it is being optimised to a join, etc.


3:26 pm on Feb 23, 2014 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

Make sure your tables are indexed properly- there's no reason it should take so long with such small tables. I do similar joins on tables with millions of rows and get results in less than a few seconds (and most of the time is actually from displaying the results, not the actually pulling of data).

Are you on shared hosting? Things could be slow from some other processing hog(s).


10:47 am on Feb 24, 2014 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

EXPLAIN will also tell you if the indices are being used as expected.

Also, it may help at this point to share (the relevants bits of) your database schema.


10:27 pm on Mar 13, 2014 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member

Hi there Guys,

Thanks for all of your advice on this, I'll report back soon as it will make a huge difference to a project that just keeps getting larger and larger!

I'll check out explain too, looks like it might help.


Featured Threads

Hot Threads This Week

Hot Threads This Month