Problems with “God Objects” or very wide tables

Wide Tables, Wide SQL Tables
Wide SQL Tables

One of the “SQL smells” Phil Factor identifies in his article is the presence of “God Objects” in your Database or design. I agree with him, except that I would call them “very wide tables”. If you find them, then you may have a problem with the Conceptual Model you are using, or possibly t you should be considering using a different tool. In other words, you have a problem with your requirements. You have a “Requirements Smell”.

How many columns make a “God Object” or wide table?

How many columns can you have in an SQL table?
How many columns can you have in an SQL table?

Let’s start with the obvious question: How many columns make a “God Object” or wide table? The maximum number of columns you are allowed to have in a table varies with database manager. For example:

What the actual numbers are can depend on a lot of technical things. One hundred is still a big number.

Database management software will handle wide tables up to their limits. As with most things, when you approach the limit you will start to encounter difficulties, but that is missing the point. Even 100 columns may indicate a problem.

Why are “God Objects” or wide tables a problem?

Wide tables or "God objects" are a problem which originates in the Conceptual Model

The reasons with “God Objects” or wide tables cause an SQL Smell are technical, practical and what you might term business, or even philosophical problems. I’m a Business Analyst, so I’m going to start from the “Conceptual” end, with the Requirements for the database, and then look at the problems which these tables may cause in Development and then when the system is in operation. Also remember, that if we eliminate problems at the conceptual end, then we’re not going to encounter them further on. Wide tables are most certainly a problem with starts at the “Conceptual Model” stage.

”Conceptual Model” or philosophical problems

Each row in a relational table is supposed to represent something. The “something” may be a concrete object in the real world, or it may be something abstract like a contract or a transaction. Would you be able to explain to the users of your system, or your business owners what a single row represents? If not, you are likely to encounter problems.

Thinking about the columns in this wide table, each column is contains a value. How are you going to present or update those values? 1000 fields would make for a very busy screen. Even some sort of graphical representation is likely to be complex. Do your users really need to see all this data together? While there isn’t a rule which says that the whole of an entity has to be presented on a single screen, or as a single report, it has to represent something. Finally, every column in a row provides one value for one thing at one time. Is that really so in your wide table?

Problems during development

“God objects” or wide tables encourage handling one big lump of data. That in turn is going to encourage the creation of complicated code. Maybe life would be easier for everyone if the data and the process descriptions were much more focused.

If you are in an Analyst role, then think about how you are going to explain what should (and should not) be happening with all these columns.

Remember, SQL tables have no concept of “grouping” of the columns. The columns have an order, but it is not something you should be relying on. If you can form columns into groups, then you should probably consider “normalizing” them into other tables.

Problems in operation

“God objects” or wide tables can cause problems when the system is being used. The volume of data each row contains may cause performance problems when rows are read from the table, when rows are updated and when new rows are created.

Why do we get “God objects”?

Wide tables often start from trying to convert large and complex paper forms or spreadsheets straight into table designs. It seems like a good idea at first, but it can get bogged down in unexpected complexity.

Think about your least favourite paper form, especially if it runs to several pages – maybe it’s a tax return or something similar. Obviously the physical form represents something. If you were specifying a system to work with it, then you would be tempted to have a single table where each row represented a single form, there was a column for every question and each cell contained one person’s answer to a question. It would be just like an enormous spreadsheet. Some early commercial computer systems were like that. They worked but they were inflexible.

One clue that something is going wrong (apart from the number of columns) is the number of columns which need to allow “NULL” values. How many times does “Not Applicable” appear when you are filling in the paper form?

How do we solve the problem of the wide table?

The answer is to think about what all these columns mean and then start applying Data Modelling or normalization techniques to break the data into more manageable and useable chunks. If you can from groups of columns then those groups may be candidate entities and therefore candidate tables.

If you need to use the order of similar columns then maybe you should be considering a different table design like the “Entity Attribute Value” (MVP) Pattern. But beware, because that can give rise to a bad smell too!

Excuses for “God Objects” and wide tables

Nothing in Information Technology is ever clear-cut. There are usually grey areas. One person may regard a table as too wide and another may regard it as OK. There is always room for some discussion. There are times when using a table that is a little wider than we would normally like is acceptable. Here are some of the reasons (or maybe that should be excuses) that you may here for wide tables.

  • It gets all the work done in one place, so that other programs can use the data. I don’t really buy this one. I suspect that someone is guessing what these other programs need. If the guess is wrong then someone is going to have to re-design the big, wide table. I continue to maintain that having discrete data and performing discrete actions is better.
  • Here is a specific case I found where someone wanted to retrieve data from 2000 sensors. This is a case where using something other than a relational database might be better in the first instance. Depending on the details it might also be a case where using the Entity Attribute Value (EAV) model is appropriate as well.
  • We are being given the data in the wide form from another system. This excuse I will accept, because it is really being imposed as an external requirement. But! If you need to do this, then you will need to do the work of working out what all those many columns mean, and you may have to break the wide row down into constituent parts.

Where next?

That’s addressed the “God Object” or “Wide table” smell. I’ve already mentioned the “Entity Attribute Value” (EAV) model a couple of times. I’m going to address why that may be give rise to a bad smell in the next post.

Requirements Smells make SQL Smells!

 

Your SQL smells because your Requirements smell!
Your SQL smells because your Requirements smell!

Recently I read an article by Phil Factor on the subject of “SQL Smells”. Phil (apparently not his real name), identifies a number of “smells” which he thinks indicate that a database design or SQL code needs to be reviewed. He classifies some of these as “Problems with Database Design”. I would go further and say some of them are problems with database requirements! In other words, your SQL smells because your Requirements smell!

“Requirements Smells cause SQL smells!”

I no longer claim to be a “Developer” and I have never claimed to be a DBA (Database Administrator), though I have found myself in the position of being an “accidental DBA”. The thought that Requirements could smell bad concerned me.

This realisation made me think about problems with Requirements in general and problems with databases in particular. It is better to avoid a problem rather than cure it, so I’m writing a series of blog posts on how to recognise problems in Requirements and prevent them from becoming “SQL Smells”.

Database design and SQL smells

Any computer system contains a “model” of the world it works with. This model forms the foundations of the system. If the system does not contain a concept, then it cannot work with it!

A simplified database design process
A simplified database design process

When people start to create a system they have to decide what concepts their system needs. This is the “Conceptual Model”. This model is transformed through a “Logical Model” until it finally becomes the “Physical Model”, which is the design for the database. The Conceptual and Logical models are not just first-cut versions of the Physical Model, different design decisions and compromises are made at each stage.
This is nothing to do with “Waterfall”, “Agile” or anything to do with any specific development process. In fact, this approach is pretty universal, whether formally or not. Some people combine the different stages, but there are risks to doing that.

A simple way of looking at the Conceptual Model is to say that it is concerned with finding out:

  • What the business and system need: at the conceptual stage these are known as “Entities”
  • What we need to know about those things: these are the “Attributes” of the Entities
  • We also need to document “Business Rules”: some of these will be represented as “Relationships”.

During the design and development process:

  • Entities will tend to become table definitions
  • Attributes will become the columns within those tables
  • Business Rules may become so-called “constraints”.
Different Requirements become affect different aspects of the database
Different Requirements become affect different aspects of the database

A poor Conceptual Model or bad design decisions can lead to systems which are difficult to build, maintain and use, and which do not perform well either. Once again,

“Requirements Smells will cause SQL Smells”

The idea of “smells” can help us address potential problems earlier and more cheaply.

Where are these “Requirements smells”?

I’m going to group my bad smells in a slightly different way to Phil Factor. I primarily work as a Business Analyst, so I am going to concentrate on “smells” to look for at the Conceptual and Logical Stages of specifying the Requirements for a database, starting with the smell that Phil describes as “The God Object”!

Learn to read SQL Databases

Do you know how to read SQL databases? By “read” I mean really get the most out of the database. If for some reason you were presented with an unfamiliar database containing a lot of tables, would you know what to do? If not, then you are not alone.

Many people: programmers, developers and analysts are taught the basics of SQL, they know how to write a SELECT query and can even use JOIN to write queries involving more than one table, but relatively few are shown how to read SQL and interpret the database structure itself. This is a pity, because the structure of a database is often very well documented. What is more, that documentation is incorporated into the database itself. That means that it is available and it is up to date. The problem is, most people don’t look at the database in that way. They concentrate on the content, rather than the structure. When they want to know which tables to look at, they tend to ask the local “expert”.

Asking the expert is a good thing to do, but it has a number of problems. The first and most serious problem is “how expert is the expert?” and the second is the problem that the expert may be busy.

I was prompted to think about this problem a little while ago when someone asked about “How to ‘dig in’ to a large database” on one of the forums I visit. I gave an answer which several people found helpful and which I documented here in my blog.

Over the intervening period I have refined the method I described and reduced it to something I call “DOGI”.

The DOGI method
The DOGI method
  • D = Diagram
  • O = Organise
  • G = Group
  • I = Inside

I like simple acronyms. They make things memorable and aid learning.

The approach I suggest to read SQL is really quite simple. It starts by getting an overview of the database by using whatever tool you have and then organising the diagram in order to make it easier to understand. This “organised” diagram makes it much easier to recognise groups of related tables which can then be investigated in more depth. This approach makes the investigation process more systematic and easier to plan.

I have created a course which teaches this method. I’ve structured it as what I term “an extended tutorial”. I start with one of Microsoft’s example databases and then use the method to investigate it. I surprised myself with how much information I was able to glean.

This course is suitable for anyone how can write a simple SELECT statement. By simple I mean “SELECT * FROM TableName”. When you have completed the course, you will have seen the DOGI method applied to one of Microsoft’s databases and will be able to repeat this yourself. More significantly, when you have completed this course you be able to use the DOGI method to “read SQL databases” and make yourself “the expert”. You won’t have to ask which tables you need to look at, because you will know. You will know what the tables are doing and you will be able to relate them to what the application can and cannot do.

I teach using a mixture of lectures and demonstrations. I think you will learn best if you repeat what demonstrate for yourself, but that is entirely optional. With each step you see the DOGI method applied in practice and build your knowledge using what you have learned already. On-line courses set you free to work at your own pace and to review and revisit material, even after you have completed the course.

If I’ve got your interest, then I’ve included links so you can purchase the course at a substantial discount. Go on, have a look now! It’s all supported by a 30 day, no-questions-asked, money-back guarantee too.

Read SQL like an Expert - 50% Off
SQL: Read a Database like an Expert – 50% Off

Enrol in Read SQL like and Expert

50% Off! $10

Why bother learning SQL?

Why would anyone want to learn a 35 year old programming language? Especially if the language we are talking about was originally intended to be written in ugly CAPITAL LETTERS and moreover tries to make you think in a funny way?

If the language in question is SQL (Structured Query Language), then I think almost anyone who is involved in IT systems should at least consider investing a little time gaining a basic understanding, because SQL databases are at the heart of many commercial systems.

Understanding even a little about what SQL is and how it works and the terminology it uses will enable you to communicate much more effectively with the other members of the development team. Even a little knowledge will give you an appreciation of what SQL databases are good for, what their limitations are and how they compare with other technologies. It doesn’t really matter, whether you start from a programming background or not, because at its heart SQL is really rather simple.

A little while ago, I wrote a course which teaches these basics. It teaches the principles of Relational Databases and the SQL language. Although it is intended for “Analysts”, what it teaches will be useful to programmers or those who are just interested in how systems work.

This course is suitable for a novice. I start by explaining how Relational Databases relate to your experience in the everyday world. When you have completed the course and exercises you will have created a database containing several related tables, updated the data they contain and written many queries which extract and summarise that data.

I teach using a mixture of lectures, demonstrations and exercises. You will learn by seeing, hearing and doing. With each step you build your knowledge using what you have learned already. An on-line course sets you free to work at your own pace and to review and revisit earlier material, even after you have completed the course.
If I’ve got your interest, then I’ve included links so you can purchase the course at a substantial discount. Go on, have a look now! It’s all supported by an 30 day, no-questions-asked, money-back guarantee too.

SQL and Database for Analysts - 75% Off
SQL and Database for Analysts – 75% Off

EnrolButton

75% Off! $12