Where to put condition in SQL?

Let’s suppose I am modeling a business domain with entities A, B and C. These entities have the following properties:

  • An entity A can have an entity B and C
  • An entity A can have only entity B
  • An entity A can exist without B and C
  • An entity B has not null property Active

I am implementing the domain with the following SQL. I omit foreign key constraints for brevity.

Now let’s suppose my task is to perform validity check according to special rules. I am given an Id of an entity A as an input and I have to check:

  1. If the entity exists and
  2. If it is valid

The existence will be checked by simply looking if corresponding row is present in the result set, and for validity check I will write simple CASE statement. These are my rules for my example data:

  • A.1 exists and has active B.10 and has C.100 => exists, correct
  • A.2 exists and has inactive B.20 and has C.200 => exists, incorrect
  • A.3 exists and has active B.30 and has C.300 => exists, correct
  • A.4 exists and has active B.40 and DOES NOT HAVE C => exists, incorrect
  • A.5 exists and DOES NOT HAVE NEITHER B NOR C => exists, incorrect
  • A.6 does not exist, incorrect

I write the following query to do the task:

My rules include checking if B.Active is true, so I just put this into WHERE. The result is:

AId  Correct 
---- --------
1    1       
3    1       
4    0       

The problem is, I have been given the exact set of Ids of A to check: 1, 2, 3, 4, 5, 6. But my result does not include 2, 5, 6. My application logic fails here, because it considers those A records as missing. For 6 this is fine, because it is absent in table A, but 2 and 5 must be present in the results for my validity check. The fix is extremely easy:

Now the result is:

AId  Correct 
---- --------
1    1       
2    0       
3    1       
4    0       
5    0       

It is very easy to understand, that WHERE is applied to filter all the results, no matter what my intention for JOIN was. When a record is LEFT JOINed, the condition is not met, because values from B are null. But I still need to have A record in my results. Thus, what I have to do is to include my condition in JOIN.

It is also very easy to fall into this trap of thoughtlessly writing all intended conditions in the WHERE clause.

One comment

  1. Dan says:

    imho
    1. The list of requirements, something like ‘not null’, becomes a list of characteristics at the moment when there is a possibility to commit such data into a database.
    You wrote the ‘properties’ word but it looks like ‘it should be in this way’ that I consider as a requirement (or validation on commit).
    Client application cannot consider these properties as a part of database integrity checks/configs and should work with them as with ‘data can be configured in this way as well as in other ways’.
    This is why you move some code into the ‘select’ statements: it’s the client’s responsibility to decide what is wrong and what is correct.
    I personally don’t like such approach. However, I will follow your way if it is a change in end-user requirements.
    2. The ‘AS Correct’ statement looks common for each client (desktop/web/mobile/internal/cmd or any other) and I would prefer to move this calculation at the database side (triggers or calculated fields).

Leave a Reply

Your email address will not be published. Required fields are marked *

Protection against spam * Time limit is exhausted. Please reload CAPTCHA.