Welcome to WebmasterWorld Guest from 54.162.184.214

Forum Moderators: open

Message Too Old, No Replies

Oracle/SQL Brainteaser

find records that aren't there?

     
9:36 pm on Jan 24, 2006 (gmt 0)

New User

10+ Year Member

joined:Nov 19, 2005
posts:26
votes: 0


I have a list (not in a db table anywhere, just in excel) of 50 items that COULD be in a database table, but that might not be. The table contains millions of records. I need to find out which records on my list of 100 are missing from this table, so I can ask a third-party to supply the missing info. For the list of 100, I have a unique identifier (call this column product_id if you like) to compare against.

How do I write a SQL query to get at this info?

I have asked a couple people that know SQL pretty well, and no one has come up with an answer for me that doesn't involve creating a lookup table to join off of, or writing a script. I'm looking for a straight SQL (maybe PL/SQL, depending) answer to this, and I don't want to compare by hand, eliminating them one at a time.

Just so you don't have to create a list of fake product_id's, here's my query that retrieves the fake product_ids:

SELECT product_id FROM product_info WHERE product_id IN('972039', '972050', '972068', '972084', '972153', '972280', '972365', '972449', '972557', '972697', '972634', '972666', '972691', '972694', '972723', '972729', '972734', '972759', '972765', '972767', '972789', '972791', '972794', '972802', '972821', '972822', '972844', '972849', '972902', '972903', '972906', '972907', '972925', '972949', '972952', '973004', '973023', '973026', '973063', '973075', '973076', '973079', '973081', '203346', '984207', '985596', '986493', '994683');

9:52 pm on Jan 24, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 13, 2002
posts:2173
votes: 0


Am I missing something here? you just need to use WHERE NOT IN :

SELECT product_id FROM product_info WHERE product_id NOT IN (your list here)

10:03 pm on Jan 24, 2006 (gmt 0)

New User

10+ Year Member

joined:Nov 19, 2005
posts:26
votes: 0


Yeah - you are missing something. But don't worry - that's the answer everyone (I've asked quite a few people) comes up with at first. :)

Your query would return a list of millions of records. I only want the handful of records that are on my list, but NOT in the table.

10:25 pm on Jan 24, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 18, 2003
posts:1929
votes: 0


I am not sure if Oracle has this, but MySQL has an EXISTS and NOT EXISTS [dev.mysql.com] subquery.
10:30 pm on Jan 24, 2006 (gmt 0)

New User

10+ Year Member

joined:Nov 19, 2005
posts:26
votes: 0


Oracle has it, but I can't figure out how to make it do what I'm trying to do.
10:42 pm on Jan 24, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 15, 2003
posts:2610
votes: 0


If you can do this in a programming lauguage that you can execute sql then try something like this:
******************************************

your_list = [the array of ids to lookup]
not_in_table_list = []
counter = 0
max_records = len(your_list)

while counter < max_records:
..result = executeSQL(select * from table where id = your_list[counter])
..if not result:
....not_in_table_list.append(your_list[counter])
..counter += 1

return not_in_table_list

*****************************************

This will return an array of ids that were not in the table. This is obvious sudo code but it will work if translated properly.

I know you said you don't want to use a script just straight SQL but unless you populate a table with the list of ids and then join of that then it won't work. Why? Because you siad yourself you want it to return things that are not in the table. Well you are going to have trouble return something that isn't there. It can only select records from a table, how can it select records that aren't in the table?!?!?

You are asking the sql to remove things from your parameter list(the array you are passing in) and you want it to return a modified version of the array being passed in. I am positive that is not a capability of SQL and if it is please show me when you get it working because I can think of a few 100 places that that would optimize some of my code.

You are asking this basically:

<params=the_array>
select
..the_array
from
..NOTHING
where
..the_array not in (select id from table where in (the_array))

As you can see SQL is not going to be able to pop the things that weren't returned from the array and return said array.

Hope this helps.

[edited by: Demaestro at 10:59 pm (utc) on Jan. 24, 2006]

10:52 pm on Jan 24, 2006 (gmt 0)

New User

10+ Year Member

joined:Nov 19, 2005
posts:26
votes: 0


Thanks for the reply.

I could write a script, and get the result I need. Just like I could create another table to compare values with a join. But this is as much about learning SQL as it is about "getting the answer". I could do this lots of different ways, but to me the most straight-forward way of finding out this information involves only SQL.

Isn't the purpose of SQL to get info out of a database? That's all I'm doing here.

11:01 pm on Jan 24, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:June 13, 2002
posts:2173
votes: 0


But the data you need to select from isnt in a database :)

An SQL solution is to create a temp table, insert the list, do a left join & select where not null then drop the table. That involves only SQL.

[edited by: aspdaddy at 11:06 pm (utc) on Jan. 24, 2006]

11:05 pm on Jan 24, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 15, 2003
posts:2610
votes: 0


That was what I was saying, I guess I am a little mnore long winded.
11:08 pm on Jan 24, 2006 (gmt 0)

New User

10+ Year Member

joined:Nov 19, 2005
posts:26
votes: 0


Thanks guys, maybe what I am asking is impossible with SQL. I assumed I just wasn't familiar enough with the language, but maybe that's not the case. I'm still hoping that some SQL-master will shout the solution from atop a server somewhere. But maybe that's not gonna happen.
11:19 pm on Jan 24, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 1, 2004
posts:607
votes: 0


I haven't got an Oracle installation handy, but in PostgreSQL 8.0 and later you can do this:


SELECT *
FROM generate_series(972000,990000) AS check_id
LEFT JOIN product
ON product_id=check_id
WHERE check_id IN ('972039', '972050', '972068', '972084' ...etc)
AND product_id IS NULL

If Oracle has a function similar to generate_series() which will create a series of numbers as a dynamic pseudo-table, you can do what you want in one statement, although I don't know what the performance will be like.

11:21 pm on Jan 24, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 15, 2003
posts:2610
votes: 0


Jackson_Hole

Ok I have it. You are right SQL can do this but not sure if your DB will support this but I did a test and it works in mine.

****************************************
<param>array</param>

select
..temp_table.id as the_missing_ids
from
..(select array[0] as id union all select arrayunion all select array[2] union all select array[and so on..]) temp_table
where
..temp_table.id not in (select id from table)
****************************************

Now keep in mind that the temp_table must be generated dynamically before the SQL is executed so that all of the elements in the passed it list are unioned as id in the temp table.

[1][edited by: Demaestro at 11:26 pm (utc) on Jan. 24, 2006]

11:23 pm on Jan 24, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 1, 2004
posts:607
votes: 0


Oracle's ROWNUM might be of use here [as a generate_series equivalent], though I can't think of a way to apply it without having an installation to play on...
12:01 am on Jan 25, 2006 (gmt 0)

Senior Member from CA 

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month

joined:Nov 25, 2003
posts:1130
votes: 280


Possibly something like:


SELECT DISTINCT product_id
FROM product_info
WHERE NOT EXISTS
(SELECT * FROM product_info WHERE product_id IN('[your-list]'));
12:05 am on Jan 25, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 12, 2004
posts:393
votes: 0


Here is a pure SQL solution. It's not pretty but it works (I only used six ids for brevity). It makes use of a derived table which perfectly acceptible SQL.


SELECT missing_id
FROM
(SELECT '972039' AS missing_id
UNION
SELECT '972050'
UNION
SELECT '972068'
UNION
SELECT '972084'
UNION
SELECT '972153'
UNION
SELECT '972280') AS missing
LEFT OUTER JOIN product_info
ON product_info.product_id = missing.missing_id
WHERE product_info.product_id IS NULL
12:25 am on Jan 25, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Oct 1, 2004
posts:607
votes: 0


Iamlost: your query will return either no or all rows from the table product, because the subquery produces a simple boolean result. Translated into English it reads "give me every product id but only if there are no ids in the table products which are also in the list".
1:29 am on Jan 25, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 15, 2003
posts:2610
votes: 0


syber has the same code as I do, the trick is dynamically creating the list within the SQL statement before it executes.

You have to realize what I and Zcat and everyone are saying. You cannot return results from a table if they are not being stored in a table.

Imagine that you have an integer field and it has 1 thru 100 for values in it. If you ask it to return the integers that are not in that table then it will have an infinate amount of results because that's how many numbers there are.

You can't ask it to return what isn't there. So in the code that syber and I have you create a temp table in the from clause and alias it and then return results from that temp table. It is the only way.

1:38 am on Jan 25, 2006 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 12, 2004
posts:393
votes: 0


Demaestro

My solution does not create a temporary table. It creates a psuedo table in memory. This is little different than using the IN clause, just substitute the word UNION for the commas :)

4:13 am on Jan 25, 2006 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 15, 2003
posts:2610
votes: 0


Suedo table in memory, temporary table same poop. The only difference in our code is I have unino all and you have union and you outputed the value of the variables in your example. Same poop.
6:52 am on Feb 7, 2006 (gmt 0)

New User

10+ Year Member

joined:Nov 19, 2005
posts:26
votes: 0


Sorry for the delay - I tried all of your solutions, but couldn't get any working. I'm on Oracle 8i, accessing via iSQL+ (web-based interface), and don't have create/update/drop privs, thus the need to do this without creating a table. I can (I think?) run PL/SQL scripts with variables and the like, but I've no idea how that would play out.

(And: I'm with you Demaestro - same poop! But I couldn't actually figure out how to make your script work. It looks like you've written pseudo code? Sorry - I'm decent with basic SQL, but beyond that...)

3:00 pm on Mar 3, 2006 (gmt 0)

New User

10+ Year Member

joined:Feb 26, 2006
posts:39
votes: 0


Hi,
This is very much possible,
but since you dont have create/drop privileges you need to go the long way about this.

in 9i+ you would create a function using the pipelined - pipe row to return a "fake" table listing from a comma delimited list in a large string.
with table create privs you would simply create a table with 1 column and reference it instead of the multi unions from dual.

This would give you the correct answer but create the huge union can be a headache.

select *
from ( select '972039' as c1
from dual
union
select '972050'
from dual
union
select '972068'
from dual
union
select '972084'
from dual
union
select '972153'
from dual
union
select '972280'
from dual
union
select '972365'
from dual
union
select '972449'
from dual
union
select '972557'
from dual
/* only 9 items shown for clarity */ ) t1
where not exists ( select 1
from products
where product_id = t1.c1 )

10:32 pm on Mar 3, 2006 (gmt 0)

Administrator

WebmasterWorld Administrator jatar_k is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:July 24, 2001
posts:15756
votes: 0


to be perfectly honest (yes I understand this is a purist discussion) I would script it

you could
read the ids from your file
loop through each checking to see if there is a record
discard those where you find one
compile the unfound ones
then you could even have it do your request for additional info

fully automated solution

just my 2 cents ;)

1:46 am on Mar 20, 2006 (gmt 0)

New User

10+ Year Member

joined:Nov 19, 2005
posts:26
votes: 0


Rufal, Jatar_k, you both win the grand prize. Rufal, your query does exactly what I need. Jatak_k, your suggestion to script this lead me to think that I should just write a simple ColdFusion app to handle this automatically...

Rufal, we're in the midst of a transition to 10g, is there a better way to do this once we make the switch? you referenced another option on 9i, I assume 10g will also give me that option and maybe more?

Thanks, everyone, for your help!

2:04 am on Apr 1, 2006 (gmt 0)

New User

10+ Year Member

joined:Feb 26, 2006
posts:39
votes: 0


Hi,
Yes, from 9i+ there is a new way to return data from a procedure/function. It basically allows you to do a SQL select from a function. (the pipelined/row feature)

so you can do a
select * from func('1','2','3','4',....);

and it will return x rows, depending on the number of values.

But I guess that your option would be to use external tables, especially if you will not have create privileges in the new version.

External table feature, works as a normal table within the database except the source comes from a delimted file residing in the OS. (tab, comma, semicolon, etc.. you predefine the format, works like a charm)

So you can update a flat file and afterwards do some processing within the database using the data in the flat file.

I have used this method to solve very complex data interoperability between systems that do not have direct data transfer abilites.

(edited wrong syntax in sql example)

 

Join The Conversation

Moderators and Top Contributors

Hot Threads This Week

Featured Threads

Free SEO Tools

Hire Expert Members