In his article on SQL Smells, Phil Factor is not in favour of using Check Constraint to limit the values in columns.
In my previous post I explained why Phil Factor recommends referential integrity. I am going to explain this apparent contradiction.
What are CHECK CONSTRAINTs and what do they do for us?
Like REFERENCES, the CHECK CONSTRAINT also says what are values can be stored in a column. The logical expression must be true for the value to be allowed.
The logical expression can be limits or even a list of permitted values (as in the illustration).
What is the problem with Check Constraints?
This sounds like a marvellous idea! The benefit is clear. Constraints will exclude invalid data from the database, even when it is loaded using a utility (eg BULK INSERT). We can define constraints which will protect the database from bad data. So what is the problem?
Why does Phil Factor regard this as an SQL Smell? The answer is that using Check Constraints in this way embeds constant values in the structure of the SQL database!
The database structure is “locked down” in commercial systems. Only authorised people are allowed to make changes to the structure. Phil Factor wants to avoid these changes.
When should we use Check Constraints?
This criticism does not mean that we should never use Check Constraints. The ideal constraint will not change in the lifetime of the database.
- For attributes which represent “classifications” and “types” we should note how many different values we are expecting, and how frequently the allowed values change. Short lists which change very rarely may be acceptable.
- On the other hand, consider re-designing a CHECK CONSTRAINT as a FOREIGN KEY by adding an additional table to contain the valid values. This has the benefit of making adding a new value a simple data change!
- Do not use Check Constraints to enforce arbitrary limits.
Check Constraint and Requirements
We can identify candidates for Check Constraint when we construct the Conceptual Model. We should note:
- The number of options
- The expected frequency of change.
- That information will enable us to make an informed decision about how to validate the values of that column.
Unfortunately, some of the examples using Check Constraints perform the checks against arbitrary values. These examples will work technically but copying them may cause the problems Phil Factor wants us to avoid.
Summary
Check Constraints provide a way of validating data values. They are appropriate for checking against values which do not change.
For lists, lookup-tables with Foreign Key Constraints may be better.
Do not use Check Constraints against arbitrary values or values which change frequently.
Where next?
The next article covers “Indexes”. I will explain how an Analyst can influence some design decisions.