Forum Moderators: open
If it were as simple and consistent as it should be, then of course it would be no problem. Things can never be consistent though, right? I've got a table for each of the above-mentioned elements (besides concentrations which are always different), and they all come together in a lookup table called "programs":
id(int) ¦ program_levels_id(int) ¦ degrees_id(int) ¦ colleges_id(int) ¦ departments_id(int) ¦ majors_id(int) ¦ concentration(varchar)
The major headache is that for display purposes, some programs need to be combined by degree because the coursework is so similar. As an example, the graduate ME in Engineering Management is similar enough to the graduate MS in Engineering Management for them to list it as a MS/ME in Engineering Management. There are quite a few cases where this happens.
Each program is technically different, so that's one of a few reasons that they each need their own row in the programs table. I have decided to create another table that will serve to combine degrees if necessary, and store the data about each program (or combination of programs) such as a description. Here is what the program_data table looks like right now:
id(int) ¦ programs_id(int) ¦ programs_id2(int) ¦ description, etc.
So here's the (rather messy) plan for generating a list of programs:
First it will have to tap into the program_data table. It reads the programs_id and can then use that to extract the level, degree (we'll come back to that), college, department, major and concentration for the particular program.
Now, if programs_id2 is set, that indicates that this row needs to to tie in a second program and get the degree. E.g. - When it comes across the programs_id for MS in Engineering Management it is going find that programs_id2 is set. It looks up the value in programs_id2 in the programs table and finds the ME in Engineering Management. All it really needs to know is the "ME" part, and the final result should return as "MS/ME in Engineering Management).
Somehow I need to tap into the degree abbreviation from the degrees table *through* the programs table, *through* the program_data table.
Here is a query that ignores the programs_id2 column, so it returns all of the data except that:
SELECT program_data.id,
program_levels.name AS program_level,
degrees.abbr AS degree,
colleges.name AS college,
departments.name AS department,
majors.name AS major,
programs.concentration
FROM programs LEFT OUTER JOIN departments ON programs.departments_id = departments.id
LEFT OUTER JOIN degrees ON programs.degrees_id = degrees.id,
program_data,
program_levels,
colleges,
majors
WHERE programs.program_levels_id = program_levels.id AND
programs.colleges_id = colleges.id AND
programs.majors_id = majors.id AND
program_data.programs_id = programs.id
It seems to me that the only way to do this would be to use a subquery where degree.abbr is right now. I tried that, but could not get the subquery down to the necessary single row!
1. Create a view called vwQualTitles thats joins all the tables to contain all the possible fields needed. Use inner joins unless the fields allow null, then use outer right joins with the main table on the left.
2.Make one composite field called "Title" thats suits the 80% rule
2. Include any fields in the view that will help with the decision making about exact title.
3. Use code or presentation logic (not SQL) to solve the non-database problem of displaying the actual title.
If this doesnt fix it, the database probably needs redesigning, or do your subquery thing.
HTH
You could initially populate the ProgramGroup description column by running a query to pull it from the program description column.
This approach has another benefit in that it can combine more than two programs without further modifications.
I've actually re-written the question and stripped out all of the unnecessary stuff, so I might as well post it. As far as I can tell, this forum doesn't have a way to preserve spacing, so I used periods. Hopefully it will make more sense:
Let's say you have three tables - colors, items, and item_display:
colors:
+----+---------+
¦ id ¦ name . .¦
+----+---------+
¦ 1 .¦ red . . ¦
¦ 2 .¦ green . ¦
¦ 3 .¦ blue . .¦
+----+---------+
items:
+----+-----------+-------+
¦ id ¦ colors_id ¦ name .¦
+----+-----------+-------+
¦ 1 .¦ 2 . . . . ¦ grass ¦
¦ 2 .¦ 1 . . . . ¦ apple ¦
¦ 3 .¦ 2 . . . . ¦ apple ¦
¦ 4 .¦ 3 . . . . ¦ sky . ¦
¦ 5 .¦ . . . . . ¦ glass ¦
+----+-----------+-------+
item_display:
+----+---------+----------+-----------------------------------+
¦ id ¦ item_id ¦ item_id2 ¦ description . . . . . . . . . . . ¦
+----+---------+----------+-----------------------------------+
¦ 1 .¦ 1 . . . ¦ . . . . .¦ The grass is green . . . . . . . .¦
¦ 2 .¦ 2 . . . ¦ 3 . . . .¦ Both apples, but different colors ¦
¦ 3 .¦ 4 . . . ¦ . . . . .¦ Sky is blue during the day . . . .¦
¦ 4 .¦ 5 . . . ¦ . . . . .¦ We'll say the glass is colorless! ¦
+----+---------+----------+-----------------------------------+
.SQL file here: <url removed>
Here is the desired output from the final query:
+----+--------+--------+-------+-----------------------------------+
¦ id ¦ color1 ¦ color2 ¦ name .¦ description . . . . . . . . . . . ¦
+----+--------+--------+-------+-----------------------------------+
¦ 1 .¦ green .¦ . . . .¦ grass ¦ The grass is green . . . . . . . .¦
¦ 2 .¦ red . .¦ green .¦ apple ¦ Both apples, but different colors ¦
¦ 3 .¦ blue . ¦ . . . .¦ sky . ¦ Sky is blue during the day . . . .¦
¦ 4 .¦ . . . .¦ . . . .¦ glass ¦ We'll say the glass is colorless! ¦
+----+--------+--------+-------+-----------------------------------+
The fun stuff:
Since some items (e.g. - glass) will not have any color, there will need to be an outer join between colors and items. The following query takes care of that, but does not extract the second color for records that have a second color (e.g. - apples):
SELECT item_display.id,
. .. . colors.name AS color1,
. .. . items.name,
. .. . item_display.description
FROM item_display,
. .. items LEFT OUTER JOIN colors ON items.colors_id = colors.id
WHERE item_display.item_id = items.id;
+----+--------+-------+-----------------------------------+
¦ id ¦ color1 ¦ name .¦ description . . . . . . . . . . . ¦
+----+--------+-------+-----------------------------------+
¦ 1 .¦ green .¦ grass ¦ The grass is green . . . . . . . .¦
¦ 2 .¦ red . .¦ apple ¦ Both apples, but different colors ¦
¦ 3 .¦ blue . ¦ sky . ¦ Sky is blue during the day . . . .¦
¦ 4 .¦ . . . .¦ glass ¦ We'll say the glass is colorless! ¦
+----+--------+-------+-----------------------------------+
I think that extracting the second color will require a subquery. Here is an example of what *looks* right to me, but does not work:
SELECT item_display.id,
. .. . colors.name AS color1,
. . . (SELECT colors.name
. .. . FROM colors,
. . . . . . items,
. . . . . . item_display
. .. . WHERE items.colors_id = colors.id AND
. .. . . . . item_display.item_id = items.id) AS color2,
. .. . items.name,
. .. . item_display.description
FROM item_display,
. .. items LEFT OUTER JOIN colors ON items.colors_id = colors.id
WHERE item_display.item_id = items.id;
The reason it doesn't work is because the "Subquery returns more than 1 row" (to quote the error). Sooo, I think the only way to get it to work would be to somehow "tell" the subquery what the current item.id is for the parent query.
Hopefully this whole thing makes sense, but more examples and information can be provided. I think the example tables are set up the way the need to be, but I am certainly open to structural suggestions. The item_display table is purposefully linking items together rather than directly linking colors. For the sake of this example, a "red apple" is totally different from a "green apple," but they are similar enough to warrant a link when displaying all of the items along with descriptions.
[edited by: encyclo at 1:11 am (utc) on Oct. 14, 2007]
I wondering if that approach is similar to using the program_data table? As far as having columns to store program ids, and a column for the description? In the stripped-down example, would it be similar to the item_display table?
--
Also, as a side note to the stripped-down example.. "colors" represent degrees, "items" represents programs, and "item_diplay" represents program_data. It's too bad about the spacing issues!
You also should probably add to the WHERE clause of the subquery a statement that prevents the same color from being shown twice (color1 & color2).
SELECT item_display.id,
. .. . colors.name AS color1,
. . . (SELECT MAX(colors.name)
. .. . FROM colors,
. . . . . . items,
. . . . . . item_display
. .. . WHERE items.colors_id = colors.id AND
. .. . . . . item_display.item_id = items.id) AS color2,
. .. . items.name,
. .. . item_display.description
FROM item_display,
. .. items LEFT OUTER JOIN colors ON items.colors_id = colors.id
WHERE item_display.item_id = items.id;
+----+--------+--------+-------+-----------------------------------+
¦ id ¦ color1 ¦ color2 ¦ name .¦ description . . . . . . . . . . . ¦
+----+--------+--------+-------+-----------------------------------+
¦ 1 .¦ green .¦ red . .¦ grass ¦ The grass is green . . . . . . . .¦
¦ 2 .¦ red . .¦ red . .¦ apple ¦ Both apples, but different colors ¦
¦ 3 .¦ blue . ¦ red . .¦ sky . ¦ Sky is blue during the day . . . .¦
¦ 4 .¦ . . . .¦ red . .¦ glass ¦ We'll say the glass is colorless! ¦
+----+--------+--------+-------+-----------------------------------+
So I think that it would always display the same color (red for MAX, and blue for MIN)..
Here is what the desired resultset would look like:
+----+--------+--------+-------+-----------------------------------+
¦ id ¦ color1 ¦ color2 ¦ name .¦ description . . . . . . . . . . . ¦
+----+--------+--------+-------+-----------------------------------+
¦ 1 .¦ green .¦ . . . .¦ grass ¦ The grass is green . . . . . . . .¦
¦ 2 .¦ red . .¦ green .¦ apple ¦ Both apples, but different colors ¦
¦ 3 .¦ blue . ¦ . . . .¦ sky . ¦ Sky is blue during the day . . . .¦
¦ 4 .¦ . . . .¦ . . . .¦ glass ¦ We'll say the glass is colorless! ¦
+----+--------+--------+-------+-----------------------------------+
It's kinda like a brain teaser! I know it was fun for me until I got to my wit's end :-]
Thanks john_k... I saw your reply right after I submitted my last one..
I wondering if that approach is similar to using the program_data table? As far as having columns to store program ids, and a column for the description? In the stripped-down example, would it be similar to the item_display table?
With the ProgramGroup idea, there is no cap on the number that can be grouped together.
The program_data table is denormalizing your data and you will find that some queries will become difficult because the program id values are in the same row. For instance, if you want to query the program_data table to see how often a specific program_id is used, you will need to do a count on each of the two foreign key columns and then add them. That is a trivial example, but if you ever have to build in logic based on whether or not a program_id is in use in that table, it will become an extra headache and probably degrade the query performance.
The grouping concept is a standard way to handle parent-child type relationships. There are a lot of examples on the internet.
SELECT item_display.id,
. .. . colors.name AS color1,
. . . (SELECT MAX(c2.name)
. .. . FROM colors c2,
. . . . . . items,
. . . . . . item_display
. .. . WHERE items.colors_id = c2.id AND
. .. . . . . item_display.item_id = items.id AND
. . . . . . .c2.name <> colors.name) AS color2,
. .. . items.name,
. .. . item_display.description
FROM item_display,
. .. items LEFT OUTER JOIN colors ON items.colors_id = colors.id
WHERE item_display.item_id = items.id;
and I'm trying to find what the 80% rule isSorry! All I meant was that usually with stuff like this, programmers introduce unnecessary compleixty to cope with < 20% of the data. 80% usually works with simple queries, but programmers have this arrogance you know of the perfect solution :) Its not a very productive attitude in real life.
I'd just build it to work for the majority of qual titles and then handle the few minority with workarounds in the application code, its just efficient system development.
Good luck anyway!
Added.I just re-read and seems that actually you may have a field missing in the model: "MS/ME in Engineering Management." is a another field in the table called "Generic Title". Just select distinct by that field and the problem is solved.
Using all these ideas.. Here are some solutions that seem to work.
For the colors one :
select i_d.id
, c1.name as color1
, c2.name as color2
, i1.name
, i_d.description
from item_display as i_d
left outer
join items as i1
on i1.id = i_d.item_id
left outer
join colors as c1
on c1.id = i1.colors_id
left outer
join items as i2
on i2.id = i_d.item_id2
left outer
join colors as c2
on c2.id = i2.colors_id
For the academics one:
SELECT pd.id,
pl.name AS program_level,
CONCAT_WS('/', d.abbr, d2.abbr) AS degree,
c.name AS college,
dept.name AS department,
m.name AS major,
p.concentration
FROM program_data as pd
LEFT OUTER JOIN programs AS p ON p.id = pd.programs_id AND pd.type = 3
LEFT OUTER JOIN degrees AS d ON d.id = p.degrees_id
LEFT OUTER JOIN programs AS p2 ON p2.id = pd.programs_id2
LEFT OUTER JOIN degrees AS d2 ON d2.id = p2.degrees_id
LEFT OUTER JOIN departments dept ON p.departments_id = dept.id
INNER JOIN program_levels AS pl ON p.program_levels_id = pl.id
INNER JOIN colleges AS c ON p.colleges_id = c.id
INNER JOIN majors AS m ON p.majors_id = m.id
I think this may actually be a pretty neat reference thread...
[edited by: encyclo at 1:10 am (utc) on Oct. 14, 2007]