Let’s suppose I am modeling a business domain with entities A, B and C. These entities have the following properties:
- An entity A can have an entity B and C
- An entity A can have only entity B
- An entity A can exist without B and C
- An entity B has not null property Active
I am implementing the domain with the following SQL. I omit foreign key constraints for brevity.
Now let’s suppose my task is to perform validity check according to special rules. I am given an Id of an entity
- If the entity exists and
- If it is valid
The existence will be checked by simply looking if corresponding row is present in the result set, and for validity check I will write simple CASE
statement. These are my rules for my example data:
- A.1 exists and has active B.10 and has C.100 => exists, correct
- A.2 exists and has inactive B.20 and has C.200 => exists, incorrect
- A.3 exists and has active B.30 and has C.300 => exists, correct
- A.4 exists and has active B.40 and DOES NOT HAVE C => exists, incorrect
- A.5 exists and DOES NOT HAVE NEITHER B NOR C => exists, incorrect
- A.6 does not exist, incorrect
I write the following query to do the task:
My rules include checking if B.Active is true, so I just put this into WHERE
. The result is:
AId Correct ---- -------- 1 1 3 1 4 0
The problem is, I have been given the exact set of Ids of A to check: 1, 2, 3, 4, 5, 6
. But my result does not include 2, 5, 6
. My application logic fails here, because it considers those A records as missing. For 6
this is fine, because it is absent in table A, but 2
and 5
must be present in the results for my validity check. The fix is extremely easy:
Now the result is:
AId Correct ---- -------- 1 1 2 0 3 1 4 0 5 0
It is very easy to understand, that WHERE
is applied to filter all the results, no matter what my intention for JOIN
was. When a record is LEFT JOIN
ed, the condition is not met, because values from B are null. But I still need to have A record in my results. Thus, what I have to do is to include my condition in JOIN
.
It is also very easy to fall into this trap of thoughtlessly writing all intended conditions in the WHERE
clause.