News - Duhallow Grey Geek

Problems with “God Objects” or very wide tables

Wide SQL Tables

One of the “SQL smells” Phil Factor identifies in his article is the presence of “God Objects” in your Database or design. I agree with him, except that I would call them “very wide tables”. If you find them, then you may have a problem with the Conceptual Model you are using, or possibly you should be considering a different tool. In other words, you have a problem with your requirements. You have a “Requirements Smell”.

How many columns make a “God Object” or wide table?

How many columns can you have in an SQL table?

Let’s start with the obvious question: How many columns make a “God Object” or wide table? The maximum number of columns you are allowed to have in a table varies with database manager. For example:

SQL Server may have up to 1024
MySQL can have up to 4096

The actual limits can depend on technical details such as the storage engine and the maximum row size. Either way, one hundred is still a big number.

Database management software will handle wide tables up to their limits. As with most things, when you approach the limit you will start to encounter difficulties, but that is missing the point. Even 100 columns may indicate a problem.
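If you want a quick feel for whether a database contains wide tables, a small script can count the columns for you. The following is only a sketch: it assumes an SQLite database (other database managers expose column metadata differently), the threshold of 100 is simply the figure mentioned above, and the table names `customer` and `tax_form` are made up for the demonstration.

```python
import sqlite3

# Assumption: 100 columns as the warning level, taken from the text above.
WIDE_TABLE_THRESHOLD = 100


def column_counts(conn):
    """Return {table_name: number_of_columns} for every user table."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    counts = {}
    for table in tables:
        # Quote the table name as an identifier before using it in SQL.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info returns one row per column.
        counts[table] = len(conn.execute(f"PRAGMA table_info({quoted})").fetchall())
    return counts


def wide_tables(conn, threshold=WIDE_TABLE_THRESHOLD):
    """Return the names of tables with at least `threshold` columns."""
    return sorted(t for t, n in column_counts(conn).items() if n >= threshold)


# Demo: one narrow table and one deliberately wide one (hypothetical names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
wide_cols = ", ".join(f"q{i} TEXT" for i in range(1, 121))
conn.execute(f"CREATE TABLE tax_form (id INTEGER PRIMARY KEY, {wide_cols})")

print(column_counts(conn))
print(wide_tables(conn))
```

A count like this only tells you where to look. Whether a flagged table really is a problem is still a question about what a row is supposed to represent.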

Why are “God Objects” or wide tables a problem?

Wide tables or "God objects" are a problem which originates in the Conceptual Model

The reasons why “God Objects” or wide tables cause an SQL Smell are technical, practical and what you might term business, or even philosophical. I’m a Business Analyst, so I’m going to start from the “Conceptual” end, with the Requirements for the database, and then look at the problems which these tables may cause in Development and then when the system is in operation. Also remember that if we eliminate problems at the conceptual end, then we’re not going to encounter them further on. Wide tables are most certainly a problem which starts at the “Conceptual Model” stage.

“Conceptual Model” or philosophical problems

Each row in a relational table is supposed to represent something. The “something” may be a concrete object in the real world, or it may be something abstract like a contract or a transaction. Would you be able to explain to the users of your system, or your business owners what a single row represents? If not, you are likely to encounter problems.

Thinking about the columns in this wide table, each column contains a value. How are you going to present or update those values? 1000 fields would make for a very busy screen. Even some sort of graphical representation is likely to be complex. Do your users really need to see all this data together? While there isn’t a rule which says that the whole of an entity has to be presented on a single screen, or as a single report, it has to represent something. Finally, every column in a row provides one value for one thing at one time. Is that really so in your wide table?

Problems during development

“God objects” or wide tables encourage handling one big lump of data. That in turn is going to encourage the creation of complicated code. Maybe life would be easier for everyone if the data and the process descriptions were much more focused.

If you are in an Analyst role, then think about how you are going to explain what should (and should not) be happening with all these columns.

Remember, SQL tables have no concept of “grouping” of the columns. The columns have an order, but it is not something you should be relying on. If you can form columns into groups, then you should probably consider “normalizing” them into other tables.
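One informal way of spotting groups is to look at the column names. The sketch below assumes a naming convention in which related columns share a prefix before the first underscore (for example `home_street` and `home_city`). That convention, and every column name shown, is an assumption for illustration. SQL itself knows nothing about these groups.

```python
from collections import defaultdict


def group_by_prefix(columns, min_size=2):
    """Group column names by the text before the first underscore.

    Only prefixes shared by at least `min_size` columns are returned,
    because a single column is not much of a group.
    """
    groups = defaultdict(list)
    for col in columns:
        prefix, sep, _rest = col.partition("_")
        if sep:  # ignore names with no underscore at all
            groups[prefix].append(col)
    return {p: cols for p, cols in groups.items() if len(cols) >= min_size}


# Hypothetical column list from a wide "person" table.
columns = [
    "id", "surname",
    "home_street", "home_city", "home_postcode",
    "work_street", "work_city",
    "spouse_name",
]

candidate_groups = group_by_prefix(columns)
for prefix, cols in candidate_groups.items():
    print(prefix, cols)
```

Each group found this way is only a candidate. Someone still has to decide whether “home” and “work” are really two instances of the same thing, such as an address.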

Problems in operation

“God objects” or wide tables can cause problems when the system is being used. The volume of data each row contains may cause performance problems when rows are read from the table, when rows are updated and when new rows are created.

Why do we get “God objects”?

Wide tables often start from trying to convert large and complex paper forms or spreadsheets straight into table designs. It seems like a good idea at first, but it can get bogged down in unexpected complexity.

Think about your least favourite paper form, especially if it runs to several pages – maybe it’s a tax return or something similar. Obviously the physical form represents something. If you were specifying a system to work with it, then you would be tempted to have a single table where each row represented a single form, there was a column for every question and each cell contained one person’s answer to a question. It would be just like an enormous spreadsheet. Some early commercial computer systems were like that. They worked but they were inflexible.

One clue that something is going wrong (apart from the number of columns) is the number of columns which need to allow “NULL” values. How many times does “Not Applicable” appear when you are filling in the paper form?
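If the table already exists, you can measure how often each column is empty. This is a minimal sketch assuming an SQLite database. The `form` table and its columns are invented for the example, and what counts as “too many NULLs” is a judgement the script does not make for you.

```python
import sqlite3


def null_fractions(conn, table):
    """Return {column_name: fraction of rows where that column is NULL}."""
    quoted = '"' + table.replace('"', '""') + '"'
    # Column name is the second field of each PRAGMA table_info row.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]
    total = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    fractions = {}
    for col in columns:
        qcol = '"' + col.replace('"', '""') + '"'
        # COUNT(column) ignores NULLs, so total - COUNT(column) is the NULL count.
        non_null = conn.execute(f"SELECT COUNT({qcol}) FROM {quoted}").fetchone()[0]
        fractions[col] = (total - non_null) / total if total else 0.0
    return fractions


# Hypothetical form where some questions are often "Not Applicable".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE form ("
    "id INTEGER PRIMARY KEY, name TEXT, spouse_name TEXT, farm_income REAL)"
)
conn.executemany(
    "INSERT INTO form (name, spouse_name, farm_income) VALUES (?, ?, ?)",
    [("Ann", "Bob", None), ("Cat", None, None),
     ("Dan", None, None), ("Eve", "Fred", 1200.0)],
)

report = null_fractions(conn, "form")
for column, fraction in report.items():
    print(f"{column}: {fraction:.0%} NULL")
```

Columns which are NULL most of the time are often the ones which describe something optional, and so are good candidates for a table of their own.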

How do we solve the problem of the wide table?

The answer is to think about what all these columns mean and then start applying Data Modelling or normalization techniques to break the data into more manageable and useable chunks. If you can form groups of columns then those groups may be candidate entities and therefore candidate tables.
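As an illustration of a group of columns becoming a table, here is a small sketch using SQLite from Python. The `person_wide` table, its home and work address columns, and the target `person` and `address` tables are all hypothetical. A real design would depend on what the business says an address is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The wide starting point: two groups of address columns in one row.
CREATE TABLE person_wide (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    home_street TEXT, home_city TEXT,
    work_street TEXT, work_city TEXT
);
INSERT INTO person_wide VALUES
    (1, 'Ann', '1 High St', 'Banbury', '9 Mill Rd', 'Oxford'),
    (2, 'Bob', '2 Low St', 'Cork', NULL, NULL);

-- The candidate entities: a person, and an address belonging to a person.
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE address (
    person_id INTEGER NOT NULL REFERENCES person(id),
    kind TEXT NOT NULL CHECK (kind IN ('home', 'work')),
    street TEXT NOT NULL,
    city TEXT NOT NULL,
    PRIMARY KEY (person_id, kind)
);

-- Move the data: one address row per group which is actually filled in.
INSERT INTO person SELECT id, name FROM person_wide;
INSERT INTO address
    SELECT id, 'home', home_street, home_city FROM person_wide
    WHERE home_street IS NOT NULL
    UNION ALL
    SELECT id, 'work', work_street, work_city FROM person_wide
    WHERE work_street IS NOT NULL;
""")

addresses = conn.execute(
    "SELECT person_id, kind, city FROM address ORDER BY person_id, kind"
).fetchall()
print(addresses)
```

Notice that Bob’s missing work address is now simply a row which does not exist, rather than a pair of NULL columns.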

If you need to use the order of similar columns then maybe you should be considering a different table design like the “Entity Attribute Value” (EAV) Pattern. But beware, because that can give rise to a bad smell too!
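For readers who have not met the Entity Attribute Value pattern, here is a minimal sketch of the idea using SQLite from Python. The `answer` table, the `q1`, `q2`, `q3` column names and the sample values are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE answer (
    form_id  INTEGER NOT NULL,  -- the entity: which form this belongs to
    question INTEGER NOT NULL,  -- the attribute: the question's position
    value    TEXT NOT NULL,     -- the value: everything is stored as text
    PRIMARY KEY (form_id, question)
);
""")

# One row of a hypothetical wide table, with a column per question.
wide_row = {"form_id": 1, "q1": "yes", "q2": None, "q3": "42"}

# Turn each answered question into its own (entity, attribute, value) row.
rows = [
    (wide_row["form_id"], int(col[1:]), val)
    for col, val in wide_row.items()
    if col.startswith("q") and val is not None
]
conn.executemany("INSERT INTO answer VALUES (?, ?, ?)", rows)

# The order of the questions is now data which we can sort on.
ordered = conn.execute(
    "SELECT question, value FROM answer WHERE form_id = 1 ORDER BY question"
).fetchall()
print(ordered)
```

The sketch also hints at where the smell comes from: every value has become text, so the database can no longer tell a number from a “yes”.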

Excuses for “God Objects” and wide tables

Nothing in Information Technology is ever clear-cut. There are usually grey areas. One person may regard a table as too wide and another may regard it as OK. There is always room for some discussion. There are times when using a table that is a little wider than we would normally like is acceptable. Here are some of the reasons (or maybe that should be excuses) that you may hear for wide tables.

It gets all the work done in one place, so that other programs can use the data. I don’t really buy this one. I suspect that someone is guessing what these other programs need. If the guess is wrong then someone is going to have to re-design the big, wide table. I continue to maintain that having discrete data and performing discrete actions is better.
Here is a specific case I found where someone wanted to retrieve data from 2000 sensors. This is a case where using something other than a relational database might be better in the first instance. Depending on the details it might also be a case where using the Entity Attribute Value (EAV) model is appropriate as well.
We are being given the data in the wide form from another system. This excuse I will accept, because it is really being imposed as an external requirement. But! If you need to do this, then you will need to do the work of working out what all those many columns mean, and you may have to break the wide row down into constituent parts.

Where next?

That’s addressed the “God Object” or “Wide table” smell. I’ve already mentioned the “Entity Attribute Value” (EAV) model a couple of times. I’m going to address why that may give rise to a bad smell in the next post.

Requirements Smells make SQL Smells!

Your SQL smells because your Requirements smell!

Recently I read an article by Phil Factor on the subject of “SQL Smells”. Phil (apparently not his real name), identifies a number of “smells” which he thinks indicate that a database design or SQL code needs to be reviewed. He classifies some of these as “Problems with Database Design”. I would go further and say some of them are problems with database requirements! In other words, your SQL smells because your Requirements smell!

“Requirements Smells cause SQL smells!”

I no longer claim to be a “Developer” and I have never claimed to be a DBA (Database Administrator), though I have found myself in the position of being an “accidental DBA”. The thought that Requirements could smell bad concerned me.

This realisation made me think about problems with Requirements in general and problems with databases in particular. It is better to avoid a problem rather than cure it, so I’m writing a series of blog posts on how to recognise problems in Requirements and prevent them from becoming “SQL Smells”.

Database design and SQL smells

Any computer system contains a “model” of the world it works with. This model forms the foundations of the system. If the system does not contain a concept, then it cannot work with it!

When people start to create a system they have to decide what concepts their system needs. This is the “Conceptual Model”. This model is transformed through a “Logical Model” until it finally becomes the “Physical Model”, which is the design for the database. The Conceptual and Logical models are not just first-cut versions of the Physical Model: different design decisions and compromises are made at each stage.
This is nothing to do with “Waterfall”, “Agile” or anything to do with any specific development process. In fact, this approach is pretty universal, whether formally or not. Some people combine the different stages, but there are risks to doing that.

A simple way of looking at the Conceptual Model is to say that it is concerned with finding out:

What the business and system need: at the conceptual stage these are known as “Entities”
What we need to know about those things: these are the “Attributes” of the Entities
We also need to document “Business Rules”: some of these will be represented as “Relationships”.

During the design and development process:

Entities will tend to become table definitions
Attributes will become the columns within those tables
Business Rules may become so-called “constraints”.
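To make that mapping concrete, here is a small sketch using SQLite from Python. The `customer` and `customer_order` entities and the two business rules (“an order must belong to a known customer” and “an order quantity must be more than zero”) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless asked to.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Entity "Customer" becomes a table; its Attributes become columns.
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Entity "Order" becomes a table; the Business Rules become constraints.
CREATE TABLE customer_order (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),  -- Relationship
    quantity    INTEGER NOT NULL CHECK (quantity > 0)      -- Business Rule
);
""")

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO customer_order (customer_id, quantity) VALUES (1, 3)")


def violates(sql):
    """Return True if the database rejects the statement as breaking a rule."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


no_such_customer = violates(
    "INSERT INTO customer_order (customer_id, quantity) VALUES (99, 1)"
)
zero_quantity = violates(
    "INSERT INTO customer_order (customer_id, quantity) VALUES (1, 0)"
)
print(no_such_customer, zero_quantity)
```

Both bad inserts are refused by the database itself, which is the point: a rule captured in the Requirements can end up being enforced in the design.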

Different Requirements affect different aspects of the database

A poor Conceptual Model or bad design decisions can lead to systems which are difficult to build, maintain and use, and which do not perform well either. Once again,

“Requirements Smells will cause SQL Smells”

The idea of “smells” can help us address potential problems earlier and more cheaply.

Where are these “Requirements smells”?

I’m going to group my bad smells in a slightly different way to Phil Factor. I primarily work as a Business Analyst, so I am going to concentrate on “smells” to look for at the Conceptual and Logical Stages of specifying the Requirements for a database, starting with the smell that Phil describes as “The God Object”!

Geek at speed! A ride in a Radical SR8 racing car.

I recently spent a few days in England on business. On my way home, I visited friends and business contacts I don’t see often enough. I thought I would share the achievements of a friend who doesn’t get the recognition he deserves.

This is a Radical SR8. It’s a track-day car or a road-going racing car, and is made in Peterborough. You won’t see many on the road because they have been successful racing. (Top speed 178 mph)

My friend Steve Prentice designed the engine. The engine is two Suzuki Hayabusa cylinder blocks sharing a common crankshaft. The SR8 is a 2.6 litre V8 sports motorcycle engine in a lightweight chassis with an aerodynamic body! Steve describes the engine as his “best work so far”. The SR8 has held the road-legal lap record at the Nurburgring since 2009. (lap time 6m 48s which, if my calculations are correct, is an average of 112.9 mph!)

I normally avoid showing number-plates and other identifiable things on the internet but I’m making an exception in this case. This particular car is the 2009 specification and was used by Radical as a demonstrator. Steve bought it from Radical a few years ago. He has made a few modifications to the car to make it a little more practical on the road. He uses the car for advertising and for “testing things”.

Front view of SR08RAD Radical SR8 road-going track car.

A visit to Steve’s office, lair of the SR8

Steve lives with his wife near Banbury. His office is rented space in a barn in the middle of a field a short distance from his home. When I arrived at the office Steve was completing an overhaul of the SR8. That’s Steve with the bald head and the overalls!

Steve says the performance of the SR8 owes as much to the chassis and the brakes as to the engine. As you can see the chassis is a space-frame. The brakes are just something else!

I think Steve took pity on me or was fed up with me hanging around. He employed me to fetch and carry tools and to hold spanners. I even provided unskilled help filling it up with oil. With an SR8 even filling it with oil and water is not straightforward. Both lubrication and cooling systems have several potential airlocks. The filling process involves: filling, starting the engine, bleeding, stopping, checking the level and then repeating…

A sedate spin in the countryside

My reward (or punishment?) for being an unskilled labourer cum spectator was a 40-minute test spin in Steve’s SR8. I needed instructions on how to get into the passenger seat. I also needed help with the 5-point harness.

The SR8 has no windscreen. Before we set off, the pair of us dressed up in black balaclava ski masks and perspex safety specs. Steve said that “we looked like a pair of bank robbers making a get-away”!

The performance of the engine, handling and brakes is everything you would expect. Steve was mostly driving on single carriageway roads. I couldn’t see the instruments, so I have to assume that the maximum speed we reached was 59.5 mph. The acceleration is literally breath-taking. The engine delivers not so much a “kick up the arse” as an “aggressive shove in the small of the back”, and the brakes are equal to it. The handling is excellent too. We hadn’t fitted the floor-pan under the engine bay so that any minor oil leaks would become obvious. As a result Steve said that he could feel a certain lack of down-force. I definitely didn’t notice anything!

I told Steve to pose like a proud father, so here he is. He goes to track days and race meetings, so if you see him with the car, introduce yourself. If you tell him you know me, he may give you a more detailed explanation of how the car is put together.

Steve Prentice, designer of the engine in the SR8

Process Models: Who is doing what?

Every Business Analyst and Business Consultant should know something about Process Models and Process Modelling. Creating a simple model of what is happening outside the IT System can be a very useful place to start. You may even benefit when there is no IT System at all!

Ways of using Process Models

A Process Model is simply a representation of what the process is doing in the real world. This representation is usually graphical. There are different notations, and a large number of tools to help you draw the pictures.

Trivial Process Model - Documented using a Process Modelling tool

Process Models can be used in a number of ways, many of which overlap.

A model can be used as a framework to assess the process against some criteria.
Models can be used to explore the effect of some change to the process.
Models can also be used to show how the physical world and IT interact.
Using Process Models appropriately can help ensure that any changes are beneficial to the business.

Process Models: “As-Is” and “To-Be”

Process Models can be used to explore changes to a process. The “As-Is” model shows how the process works now, and the “To-Be” model shows how the process will work after the proposed changes. Comparing the two models allows us to demonstrate how the changes will be beneficial to the business.

The changes need not be changes to IT systems. The benefits which can be demonstrated may be the elimination of roles, or reduction in time or the number of steps.
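A comparison like this can be done on the back of an envelope, but the idea is simple enough to sketch in code. Everything below is hypothetical: the `Step` structure, the step names, the roles and the timings are invented to show the kind of figures an “As-Is” and “To-Be” comparison might produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    role: str      # who performs the step
    minutes: int   # how long the step takes


def summarise(steps):
    """Count the steps, the distinct roles involved and the total time."""
    return {
        "steps": len(steps),
        "roles": len({s.role for s in steps}),
        "minutes": sum(s.minutes for s in steps),
    }


def savings(as_is, to_be):
    """Return how much each figure drops between the two models."""
    before, after = summarise(as_is), summarise(to_be)
    return {key: before[key] - after[key] for key in before}


# Invented example process.
as_is = [
    Step("Receive paper form", "Clerk", 5),
    Step("Key form into system", "Clerk", 10),
    Step("Check entries", "Supervisor", 5),
    Step("Approve", "Manager", 2),
]
to_be = [
    Step("Receive online form", "Clerk", 1),
    Step("Approve", "Manager", 2),
]

print(summarise(as_is))
print(summarise(to_be))
print(savings(as_is, to_be))
```

The numbers are only as good as the models behind them, so this sort of summary supports a discussion with the business rather than replacing it.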

Process Models: How does IT mesh with the business?

Gears - Process Models show how things interact

Swim-lane process models demonstrate how different roles collaborate, or how people use several different tools or IT systems to perform their work. Imagine the different roles or tools as “gears” and you will understand what I mean. Using swim-lanes helps you to visualise and communicate how the different lanes interact.
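One thing a swim-lane model makes visible is the hand-off, where work passes from one lane to another. As a rough sketch, if a process is written down as a simple sequence of (lane, activity) pairs, the hand-offs can be listed mechanically. The lanes and activities below are invented for illustration.

```python
def handoffs(steps):
    """Return (from_lane, to_lane) for each point where the work changes lane.

    `steps` is a sequence of (lane, activity) pairs in the order performed.
    """
    return [
        (current[0], following[0])
        for current, following in zip(steps, steps[1:])
        if current[0] != following[0]
    ]


# Hypothetical order-handling process.
process = [
    ("Customer", "Place order"),
    ("Sales", "Enter order"),
    ("Sales", "Check credit"),
    ("Warehouse", "Pick goods"),
    ("Customer", "Receive goods"),
]

for source, target in handoffs(process):
    print(f"{source} -> {target}")
```

Each hand-off is a place where the “gears” have to mesh, and so a place worth looking at when something goes wrong.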

Traps you can avoid using Process Models

Where a business process involves activities in the physical world (and not just doing things at a screen) then a process model can help put the IT systems into context. Doing this may prevent you spending time on details which are not important.

Ways Process Models can trap you!

Many process modelling tools allow you to break down individual steps into smaller pieces. Resist the temptation to break things down too early, or everywhere. If you keep your models at a high level you will reduce the amount of work you have to do, and you will not reduce the value of the models.

Conclusion

Process Models can be used to put the IT system into a wider business context. They can even be used to analyse and rationalise processes where the IT systems do not play a significant role. Process Models are commonly used to demonstrate the claimed benefits of a new way of doing things, the so-called “As-Is” and “To-Be” models.

The keys to success with Process Models are to present the simplest model which is appropriate for your needs and to control the amount of detail.

Context Diagrams: Putting things in Context

Simple Context Diagram

Every Business Analyst should know something about Context Diagrams. I often draw an informal Context Diagram as one of my first activities when I start a new project. Context Diagrams are good for focussing the mind and reminding you what you don’t know and need to find out.

What is a Context Diagram?

You have almost certainly seen Context Diagrams, even if you haven’t recognised them by name. A Context Diagram is a shape (usually a rectangle or circle) which represents “the system” which is the focus of our interest. This shape is surrounded by other shapes which represent things like:

Users of the system (or actors)
Other systems

The satellite shapes are joined to “the system” by lines. Sometimes the arrows on the lines have real significance, sometimes they are there for decoration.
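The description above can also be captured as a very small data structure, which is sometimes handy when the diagram lives in a document or a script rather than a drawing tool. This is only a sketch: the `ContextDiagram` class, its field names and the example system are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContextDiagram:
    system: str                                       # the central shape
    actors: set = field(default_factory=set)          # users of the system
    external_systems: set = field(default_factory=set)
    flows: list = field(default_factory=list)         # (source, target, label)

    def add_flow(self, source, target, label):
        """Add a line; every line must touch the central system."""
        if self.system not in (source, target):
            raise ValueError("a flow must start or end at the system")
        satellite = target if source == self.system else source
        if satellite not in self.actors | self.external_systems:
            raise ValueError(f"unknown satellite: {satellite}")
        self.flows.append((source, target, label))

    def interfaces(self):
        """Return the external systems which exchange something with us."""
        touched = {name for src, dst, _ in self.flows for name in (src, dst)}
        return sorted(touched & self.external_systems)


# Hypothetical example.
diagram = ContextDiagram(
    system="Order System",
    actors={"Customer", "Sales Clerk"},
    external_systems={"Accounts", "Warehouse"},
)
diagram.add_flow("Customer", "Order System", "order")
diagram.add_flow("Order System", "Accounts", "invoice request")

print(diagram.interfaces())
```

Here the direction of each flow is recorded deliberately, so the arrows mean something rather than being decoration.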

Context Diagrams were commonly used as the top-most level in decomposition methods (such as SSADM). They are still with us in the form of the Use Case diagram in UML.

What will a Context Diagram tell us?

A Context Diagram will tell us about who uses the system we are looking at. It also tells us about the other systems it interacts with. The diagram actually tells us very little about our system!

When are Context Diagrams useful?

Context Diagrams are useful towards the start of the project. They are good for communication and especially good for summarising who and what interacts with the system.

Although they don’t define our system at all well, they do make it clear what is outside. As a consequence, they are good for communicating “scope”. I even use them to help define scope during project initiation.

A really good use of Context Diagrams is to emphasise interfaces with other systems.

Limitations of Context Diagrams

There is something seductive about a well-drawn Context diagram. It seems to say a great deal, but actually it doesn’t say a lot.

It is wrong to try and make a Context Diagram do too much. Imagine a diagram which tries to show connections with 100 different objects. It would turn into a mess which nobody could read. As a result, the number of satellites is often edited. That makes the diagram easier to read but removes important information.

As a consequence, Context Diagrams are best used for illustration and communication, rather than definition.

Conclusion

Context Diagrams are a great way of providing overview and “putting things in context”! They are easy to produce and people understand them intuitively. They are good for communicating ideas to a non-technical audience.

To get the best from Context Diagrams you have to recognise their limitations. They are good at describing what is outside “the system” but they say very little about the system itself. They are not very good for detailed definition, and if they contain too much detail they actually become less useful!