NULLs Data Model and SQL Smells

Nulls - An SQL Smell?
Nulls – An SQL Smell?

NULLs are a perennial problem. Nobody likes them. They confuse developers and users and many analysts do not really understand them.

The concept of NULL allows us to say that there are things that we do not know.

In his article on SQL Smells, Phil Factor associates several smells with NULLs. In this post I’ll explain how to avoid using NULLs and how to use them properly when they are necessary.

Why do we need NULLs at all? What is the benefit and what is the cost?

NULL has a simple meaning with wide-ranging and surprising consequences. NULL means the value is unknown. And this in turn means that the result of any calculation or concatenation which uses this value must also be unknown!

NULL is a feature of SQL. The benefit of allowing NULL or “three valued logic” (TRUE/FALSE/UNKNOWN) is that it allows a database to record that there are things we do not know. The cost of having them is that any calculation or concatenation which uses this value must also be unknown! This confuses many people.

Reasons for needing NULL

There are many reasons why we might not have data to put into a column. Thinking about why we are considering defining a column as NULLable will encourage us to consider alternatives.

The structural NULL – Permanent sub-types

Customer with Sub-types which may cause NULLs
Customer with Sub-types which may cause NULLs

Sometimes we want to combine two entities into a single “super-type” table. There are attributes of Person will never be used for Business (and vice-versa). These missing values will need NULLs.

The structural NULL – Lifecycle subtypes

Order with Lifecycle sub-types which may cause NULLs
Order with Lifecycle sub-types which may cause NULLs

Something similar can happen if we combine all the steps of a entities lifecycle into a single table. The attributes of the later stages will always be empty (NULL) until that stage is reached. These later steps often contain dates or times.

In both cases involving sub-types it may be possible to splitting the sub-types into separate tables. Consider whether it is worth the effort and make sure you avoid the pitfalls of sub-types in SQL.

Data that will never be there

Attributes and Values for an "Address"
Attributes and Values for an “Address”

Entities like “Address” are frequently modelled with attributes like “AddressLine_”. In many cases there will never be values for the later lines. They will not be mandatory in the user interface, but do they need to be NULL? Consider whether allowing them to default to “spaces” or an empty string, would be better and whether it would have any bad effects.

Things which should never allow NULL

Always decide whether you expect an attribute to have a value. Don’t leave it to chance.

There are some things which should hardly ever allow NULL. This includes all keys and identifiers. Avoid allowing short titles or descriptions to be NULL (For long descriptions allowing NULLs is understandable).

Unavoidable NULLs

Attributes and Values for a "Person" Entity - Sometimes NULL is hard to avoid
Attributes and Values for a “Person” Entity – Sometimes NULL is hard to avoid

There are some attributes where allowing NULL is hard to avoid. Life insurance and pensions companies may need a “date of death” for their customers! Having a column with allows NULL is often the easiest way of handling this.

You should resist the temptation to use “magic dates” or inappropriate data-types in order to avoid allowing NULL. The consequences are far worse than the problem.

Summary

NULLs are a problem, nobody likes them but they are necessary. Many problems with NULLs can be avoided by two rules:

  • Remember that NULL means “unknown value” and this has consequences.
  • Ask “_why_ don’t we have this data?”

In many cases NULLs can be avoided by data modelling – that means the analyst has to do work in the Conceptual or Logical Model.

Where next?

The next article is about another smell: having the same name for different things!

Another video – Splitting an Access database

When I’m making something I sometimes learn new things. They say “you should never stop learning”. I agree with “them”, whoever they may be.

While I was working on the SOPAG project, I investigated “splitting” the Access database into:

  • Logic and presentation, and
  • Data (database definitions and data values)

Components. I knew this could be done, but I had not spent much effort on it before.

The splitting itself was a straightforward enough exercise. Most of the work is done by Access itself. However, you might want to confirm that all the decisions it has made are sensible!

I decided to document the results for my own benefit, and then decided to convert a scrappy Powerpoint presentation into something a little more presentable to upload to YouTube.

Here it is: Splitting an Access database

I had fun doing the work to find out how it worked, and fun making the video. I hope you get something from watching it.

Collaboration software and badgers on the internet

A little earlier this week I attended a Webinar on PBWorks ProjectHub product.

Here is a link to a recording of the webinar:

I have used PB Works’ wiki product for years (anyone who remembers me from LCCH may remember me setting up a reference site which was based on PBWiki).

Right now, I’m using the “Freemium” version of ProjectHub to manage a small project I’m running. The project team is split across two countries (Ireland and Wales), so the opportunities of meeting face to face are limited, but thankfully we’re all in the same timezone and all speak more-or-less the same language!

It’s all going reasonably well, and I started to ask myself “why?”

I’ve used various bits of collaboration software on various different platforms for years now. Sometimes it works well for the project, sometimes it works less well.

With ProjectHub, I like the way that I can switch between a top-level view, to a short term  “what is the next focus” view, to an individual task view quickly.

One of the things which helps my team, is that we’ve known one another for quite a while and we’ve agreed:

  • the way that we are going to use the tool,
  • what our roles are, and
  • what our individual responsibilities are.

In order to keep things running smoothly, we have a role which I call “the badger”. The badger’s job is to spot when people have forgotten to complete updates (usually because they’ve been doing something more important and more interesting) and remind them. The good thing about something like ProjectHub is that the actual effort of adding a couple of line comment on a task, or ticking the “complete box” is easier than arguing, so things remain reasonably up to date. The badger doesn’t have to be the PM (right now it is me though), in fact, it’s better if it is someone else, because even the PM needs to be badgered sometimes!

Almost my first video on Youtube

Sometimes things don’t go quite as I intend. A little while ago, someone approached me with a potential project. Unfortunately I was too busy at the time to take it on.

The idea had tickled my fancy. It bubbled away in the back of my mind and as I had odd moments I created bits of it, as what I would describe as a proof of concept. It was a useful exercise in that it has reminded me of a few things, taught me some things about what Microsoft Access is good at and some things it is not so good at. Inevitably, there are some things I would do differently if I did it again. That’s all right, because after all it was only a proof of concept, and there was no real input in the form of “requirements” anyway.

Having produced the thing, then I wanted to show it to an acquaintance. I messed about with a few things and after a couple of iterations, produced this:
SOPAG – A simple Access Application

Having produced the video, and decided to write a “business related blog” it seemed appropriate to share it here.

I wouldn’t claim either SOPAG, or the video are marvelous, but I’ve learned a lot from both of them. In fact, I have set up a little project to take them both a little further.

But that is for the next instalment!