Welcome to WebmasterWorld Guest from 54.166.46.226

Forum Moderators: open

Message Too Old, No Replies

Exclude rows from query if they match a subquery

   
12:45 am on Mar 9, 2008 (gmt 0)

5+ Year Member



Hello,

What I would like to do is to run a query such as:

SELECT * FROM articles WHERE published>DATE_SUB(CURDATE(),INTERVAL 24 HOUR) LIMIT 10

While at the same time not including any results that match:

SELECT * FROM articles WHERE published>=DATE_SUB(CURDATE(),INTERVAL 1 DAY) AND published<DATE_SUB(CURDATE(),INTERVAL 0 DAY) LIMIT 5

Both queries work by themselves, but I need a way to put them together.

The ending result should have 10 rows, not 10 from the first query minus 5 from the second query if there were any matching rows.

Any ideas how I can go about putting these together? Thanks.

5:21 pm on Mar 10, 2008 (gmt 0)

5+ Year Member



Is this done in PostgreSQL?

You can try this:


SELECT *
FROM articles
WHERE EXISTS (
SELECT *
FROM articles
WHERE published>=DATE_SUB(CURDATE(),INTERVAL 1 DAY)
AND published<DATE_SUB(CURDATE(),INTERVAL 0 DAY)
LIMIT 5 )
AND published > DATE_SUB(CURDATE(),INTERVAL 24 HOUR)
LIMIT 10;

But generally it would be easier to do something like:


SELECT *
FROM articles
WHERE articles.PK IN (
SELECT DISTINCT PK
FROM articles
WHERE published>=DATE_SUB(CURDATE(),INTERVAL 1 DAY)
AND published<DATE_SUB(CURDATE(),INTERVAL 0 DAY)
LIMIT 5 )
AND published > DATE_SUB(CURDATE(),INTERVAL 24 HOUR)
LIMIT 10;



Hope that helps,
vol7ron




5:39 am on Mar 16, 2008 (gmt 0)

5+ Year Member



Thank you for your reply.

However, I'm running into a problem. The first set doesn't seem to be excluding any results, and visually it just seems to be a WHERE x AND x AND x type of statement.

The second set of code doesn't appear to work as is in my MySQL environment. It does if I take out everthing from "articles.PK" to ") AND". Are these settings allowed in MySQL?

What I have is a list of new articles on my site on one side, and a list of special articles on the other side. Even though these two lists might collide, I want to make sure I exclude everything from one query that might be in the second query.

Its important that all duplications are removed since I'm using things like id attributes form content in the MySQL rows, among other reasons. If the ids are used more than once, other issues arise.

2:50 am on Mar 22, 2008 (gmt 0)

5+ Year Member



I'm still stuck on this part of the code. If anybody can help, I'd greatly appreciate it.

Again, I'm trying to exclude results from one query everything matching a second query.

1:26 am on Mar 23, 2008 (gmt 0)

5+ Year Member



I suspect you're using MySQL. In that case an analysis shows that the two queries are performed on the same table `articles` .

If you have another table of special articles different that another table of "normal" articles you should use two different names, this would reduce confusion.

On another point of view, a close observation shows that the two queries are similar. The first means "Show me articles that have been inserted since the last 24 hours included". The second query is almost the same as the first one except that if you use DAY as INTERVAL the time part in the result is missing. The second query also means that you don't want articles inserted during the current day to be included while this is allowed in your first query. The second query is likely to return less rows(at least if you omit the LIMIT clause) than the first one.

MY SUGGESTION: Combine the two queries in one like this:
"SELECT * FROM articles WHERE published >= DATE_SUB(CURDATE(), INTERVAL 1 DAY) LIMIT 10 ORDER BY published DESC";

This would return 10 articles inserted during the last 24 hours beginning by the most recent one while in your first query you'd have gotten the articles inserted during the last 24 hours beginning by the oldest one and I don't think this is what you want since a best practice would be to show the most recent news.

If this helps, please let me know.

4:34 pm on Mar 30, 2008 (gmt 0)

5+ Year Member



Thanks for the reply,

Let me clarify to avoid confusion.

Yes, you are correct. I am in a MySQL environment.

Imagine I have a table with a list of everyone that attends an school.

I need to make a list of the 40 newest students at the school so they can attend a special student orientation.

However, at the same time, I have a special assembly for all the top ten students from the school by gpa.

I can easily make a list of the 40 newest students, and I can easily make a second list of the top ten students.

My problem is, sometimes a student will be one of the 40 newest students to the school, as well as a top ten student. The student can't go to both the orientation and the special assembly. It is more important for this student to attend the orientation than the assembly.

So what I need to do is to make a list that excludes results from one query (40 newest students) everything matching a second query (top ten students), so I can make a list of the top ten students without including any of the newest students.

4:17 am on Apr 2, 2008 (gmt 0)

5+ Year Member



I changed you select * to an imaginary article_id, ar_id column. What this does is makes sure that any ar_id that is matched in the subquery, is excluded.

SELECT ar_id FROM articles WHERE published>DATE_SUB(CURDATE(),INTERVAL 24 HOUR) and ar_id not in (SELECT ar_id FROM articles WHERE published>=DATE_SUB(CURDATE(),INTERVAL 1 DAY) AND published<DATE_SUB(CURDATE(),INTERVAL 0 DAY) LIMIT 5) LIMIT 10

Be careful, sub queries in MySql can be painfully slow, so make sure you got your indexes correct.

12:00 am on Apr 5, 2008 (gmt 0)

5+ Year Member



Thank you for the reply,

This is very interesting. Would this code require me to create an ar_id or article_id column? Else, how would it know what these values are?

4:44 pm on Apr 5, 2008 (gmt 0)

5+ Year Member



With all the help given, I'm getting close, but I'm having trouble with a putting a limit on my subquery.

The following works, but yields no results because there are none.

SELECT * FROM articles WHERE published>DATE_SUB(CURDATE(),INTERVAL 24 HOUR) AND hash NOT IN (SELECT hash FROM articles WHERE published>=DATE_SUB(CURDATE(),INTERVAL 1 DAY) AND published<DATE_SUB(CURDATE(),INTERVAL 0 DAY)) LIMIT 10

This would yeild results, but by adding the limit, it won't run:

SELECT * FROM articles WHERE published>DATE_SUB(CURDATE(),INTERVAL 24 HOUR) AND hash NOT IN (SELECT hash FROM articles WHERE published>=DATE_SUB(CURDATE(),INTERVAL 1 DAY) AND published<DATE_SUB(CURDATE(),INTERVAL 0 DAY) LIMIT 5) LIMIT 10

Why am I not allowed to put a limit in a subquery?