Requirements Smells make SQL Smells!

 

Your SQL smells because your Requirements smell!
Your SQL smells because your Requirements smell!

Recently I read an article by Phil Factor on the subject of “SQL Smells”. Phil (apparently not his real name), identifies a number of “smells” which he thinks indicate that a database design or SQL code needs to be reviewed. He classifies some of these as “Problems with Database Design”. I would go further and say some of them are problems with database requirements! In other words, your SQL smells because your Requirements smell!

“Requirements Smells cause SQL smells!”

I no longer claim to be a “Developer” and I have never claimed to be a DBA (Database Administrator), though I have found myself in the position of being an “accidental DBA”. The thought that Requirements could smell bad concerned me.

This realisation made me think about problems with Requirements in general and problems with databases in particular. It is better to avoid a problem rather than cure it, so I’m writing a series of blog posts on how to recognise problems in Requirements and prevent them from becoming “SQL Smells”.

Database design and SQL smells

Any computer system contains a “model” of the world it works with. This model forms the foundations of the system. If the system does not contain a concept, then it cannot work with it!

A simplified database design process
A simplified database design process

When people start to create a system they have to decide what concepts their system needs. This is the “Conceptual Model”. This model is transformed through a “Logical Model” until it finally becomes the “Physical Model”, which is the design for the database. The Conceptual and Logical models are not just first-cut versions of the Physical Model, different design decisions and compromises are made at each stage.
This is nothing to do with “Waterfall”, “Agile” or anything to do with any specific development process. In fact, this approach is pretty universal, whether formally or not. Some people combine the different stages, but there are risks to doing that.

A simple way of looking at the Conceptual Model is to say that it is concerned with finding out:

  • What the business and system need: at the conceptual stage these are known as “Entities”
  • What we need to know about those things: these are the “Attributes” of the Entities
  • We also need to document “Business Rules”: some of these will be represented as “Relationships”.

During the design and development process:

  • Entities will tend to become table definitions
  • Attributes will become the columns within those tables
  • Business Rules may become so-called “constraints”.
Different Requirements become affect different aspects of the database
Different Requirements become affect different aspects of the database

A poor Conceptual Model or bad design decisions can lead to systems which are difficult to build, maintain and use, and which do not perform well either. Once again,

“Requirements Smells will cause SQL Smells”

The idea of “smells” can help us address potential problems earlier and more cheaply.

Where are these “Requirements smells”?

I’m going to group my bad smells in a slightly different way to Phil Factor. I primarily work as a Business Analyst, so I am going to concentrate on “smells” to look for at the Conceptual and Logical Stages of specifying the Requirements for a database, starting with the smell that Phil describes as “The God Object”!

Geek at speed! A ride in a Radical SR8 racing car.

I recently spent a few days in England on business. On my way home, I visited friends and business contacts I don’t see often enough. I thought I would share the achievements of a friend who doesn’t get the recognition he deserves.

This is a Radical SR8. It’s a track-day car or a road-going racing car, and is made in Peterborough. You you won’t see many on the road because they have been successful racing. (Top speed 178 mph)

Side view of Radical SR08
Side view of Radical SR08

My friend Steve Prentice designed the engine. The engine is two Suzuki Hayabusa cylinder blocks sharing a common crankshaft. The SR8 is a 2.6 litre V8 sports motorcycle engine in a lightweight chassis with an aerodynamic body! Steve describes the engine as his “best work so far”. The SR8 has held the road-legal lap record at the Nurburgring since 2009. (lap time 6m 48s which, if my calculations are correct, is and average of 112.9 mph!)

I normally avoid showing number-plates and other identifiable things on the internet but I’m making an exception in this case. This particular car is the 2009 specification and was used by Radical as a demonstrator. Steve bought it from Radical a few years ago. He has made a few modifications to the car to make it a little more practical on the road. He uses the car for advertising and for “testing things”.

View of the engine bay of Radical SR8
View of the engine bay of Radical SR8
Front view of SR08RAD Radical SR8 road-going track car.
Front view of SR08RAD Radical SR8 road-going track car.

A visit to Steve’s office, lair of the SR8

Steve lives with his wife near Banbury. His office is rented space in a barn in the middle of a field a short distance from his home. When I arrived at the office Steve was completing an overhaul of the SR8. That’s Steve with the bald head and the overalls!

Steve says the performance of the SR8 owes as much to the chassis and the brakes as to the engine. As you can see the chassis is a space-frame. The brakes are just something else!

I think Steve took pity on me or was fed up with me hanging around. He employed me to fetch and carry tools and to hold spanners. I even provided unskilled help filling it up with oil. With an SR8 even filling it with oil and water is not straightforward. Both lubrication and cooling systems have several potential airlocks. The filling process involves: filling, starting the engine, bleeding, stopping, checking the level and then repeating…

A sedate spin in the countryside

My reward (or punishment?) for being an unskilled labourer cum spectator, was a 40 minute test spin in Steve’s SR8. I needed instructions on how to get into the passenger seat. I also needed help with the 5 point harness.

The SR8 has no windscreen. Before we set off, the pair of us dressed up in black balaclava ski masks and perspex safety specs. Steve said that “we looked like a pair of bank robbers making a get-away”!

The performance of the engine, handling and brakes is everything you would expect. Steve was mostly driving on single carriageway roads. I couldn’t see the instruments, so I have to assume that the maximum speed we reached was 59.5 mph. The acceleration is literally breath-taking. The engine delivers not so much a “kick up the arse” as an “aggressive shove in the small of the back”, and the brakes are equal to it. The handling is excellent too. We hadn’t fitted the floor-pan under the engine bay so that any minor oil leaks would become obvious. As a result Steve said that he could feel a certain lack of down-force. I definitely didn’t notice anything!

I told Steve to pose like a proud father, so here he is. He goes to track days and race meetings, so if you see him with the car, introduce yourself. If you tell him you know me, he may give you a more detailed explanation of how the car is put together.

Steve Prentice, designer of the engine in the SR08
Steve Prentice, designer of the engine in the SR8

Will losing constraints set you free?

I’ve been busy with a project, I’ve finally got round to writing this a week later than I intended…
In a recent conversation, someone pointed out that people sometimes remove “constraints” from a database in order to improve performance. This made me ask myself:
Is this a good thing, or a bad thing?
I have to admit that this is a technical change that I have considered in the past. Never-the-less, I have mixed feelings about it.
After some thought, my opinion is:
  • For many situations a constraint is redundant. The fundamental structure of many applications means they are unlikely to create orphan rows.
  • The cost of the constraint is in the extra processing it causes during update operations. This cost is incurred every time a value in the constrained column is updated.
  • The benefit of a constraint is that it absolutely protects the constrained column from rogue values. This may be particularly relevant if the system has components (such as load utilities or interfaces with other systems) which by-pass the normal business transactions.
  • Other benefits of constraints are that they unequivocally state the “intention” of a relationship between tables and they allow diagramming tools which navigate the relationships to “do their thing”. Constraints provide good documentation, which is securely integrated with the database itself.
In short:
  • The costs of constraints are small, but constant and in the immediate term.
  • The benefits of constraints are avoiding a potentially large cost, but all in the future.
It’s the old “insurance” argument. Make the decision honestly based on a proper assessment of the real risk and your attitude to taking risks. Be lucky!

More Detailed Argument

For those who don’t just want to take my word for it. Here is a more detailed argument.
Let’s take the “business data model” of a pretty normal “selling” application.
When we perform the activities “Take Order” (maybe that should be “Take ORDER”), or “Update Order”
  • we create or update the ORDER and ORDER_LINE entities, and
  • in addition we refer to PRODUCT (to get availability and Price) and presumably to the CUSTOMER entity which isn’t shown on the diagram.
When I translate this into a Logical data model, I impose an additional rule “Every ORDER must contain at least 1 ORDER_LINE”. The original business model doesn’t impose this restriction.
Remember some people do allow ORDERs with no ORDER_LINES. They usually do it as part of a “reservation” or “priority process” which we are not going to try and have here.
When the transaction which creates the ORDER and ORDER_LINE makes it’s updates, then it will have read CUSTOMER and ORDER, so it is unlikely to produce orphan records, with or without constraints.
On the other hand, by having the constraints we can document the relationships in the database (so that a diagramming tool can produce the ERD diagram (really I suppose that should be “Table Relationship Diagram”)).
I am left wondering whether it would be possible or desirable to enforce my  “Every ORDER must contain at least 1 ORDER_LINE” rule. I’ll think about that further. (Note to self: Can this be represented as a constraint which does not impose unnecessary and unintended restrictions on creating an ORDER?)
If we don’t have constraints and we have something other than our transaction which is allowed to create ORDERs and/or ORDER_LINEs (As I said, typically this would be an interface with another system or some kind of bulk load), we have no way of knowing how reliably it does it’s checking, and we might be allowing things we really do not want into our system. Constraints would reject faulty records and the errors they created (or “threw”) could be trapped by the interface.