Do “Entity Attribute Value” (EAV) Models smell bad?

Does the Entity Attribute Value (EAV) model an SQL smell?
Does the Entity Attribute Value (EAV) model have an SQL smell?

Have you heard of the “Entity Attribute Value” (EAV) Model or pattern? You may have, even if you don’t recognise the name.

The Entity Attribute Value pattern allows someone to add extra attributes to some entities. You know for sure that you have the EAV if you have entities or tables called:

  • …attribute
  • …attribute_value

You can substitute words like: property or feature instead of attribute. That’s right – you have an entity called attribute!

Phil Factor identifies the EAV as a potential “SQL Smell” in this article . I regard the Entity Attribute Value model as a “Requirements Smell” too. There are legitimate uses for EAV, but having an EAV may also indicate a problem. The problem my be with your Conceptual Model, or with the way it has been turned into a Logical Database Design.

My previous article dealt with tables which were very wide (have lots of columns). The EAV has the potential to produce a table (the “attribute_value” table) which is very narrow (typically only 3 or 4 columns) and is very long (with lots of rows).

Good Reasons to use an Entity Attribute Value (EAV) Model

The Entity Attribute Value model is not necessarily wrong. There are legitimate reasons for having an EAV:

  • Conscious modelling of data abstractions: EAVs can be used to take your requirements to a more abstract level. They are commonly found deep inside of “modelling tools”, CASE tools and other software packages which are intended to be configurable.
  • Consciously making the Conceptual Model “Extensible”: Allowing “user-defined” (or administrator defined) attributes for things is another legitimate use.
  • Anticipating very sparse data: EAV is a way of handling lots of NULLable columns. It uses space efficiently, but at the cost of more complex processing.

Bad Reasons to use an Entity Attribute Value (EAV) Model

  • It’s “cool”! I confess. I have done this. When I first discovered the EAV model, I tried to apply it everywhere. This is not a good idea.
  • Laziness is a very bad reason for using the EAV model. The argument goes something like: “we’re not sure what attributes the users need, so we will allow them to define their own”. The problem with this approach is that these “user-defined” attributes are hard to validate and process design becomes significantly harder.

Benefits of using an Entity Attribute Value (EAV) Model

  • Creates opportunities for re-use: Using the EAV model can create opportunities for reusing code. All those user-defined attributes are maintained by the same code. It can work very well with abstract Object-Oriented design.
  • Can make for very elegant and compact code: Well thought-out EAV code can be compact and elegant. This is one reason why you will find EAV models inside many packages.
  • Can make the data very compact: Using an EAV model can reduce the number of tables you need and the space the data takes up.

Disadvantages of using an Entity Attribute Value (EAV) Model

  • Data validation becomes harder: The value on the “Attribute_Value” table tends to be stored in a data-type like “varchar”. This makes validation of the data harder. Of course you can start to add validation yourself, but this is adding complexity.
  • Code becomes abstract and hard to understand: Code written to use an EAV always has to go through extra steps compared to having the column you want directly on the table.
  • Data becomes abstract and hard to understand. One solution to this is to add SQL views or a layer of code which transforms the abstract data in the EAV into something closer to what the business users are expecting.
  • The application may need “seed data” which is almost part of the code. This is what happens in some packages.
  • The application may require a complex “configuration” process. Again, this is what you find in some packages. You have to select which values
  • Performance: EAV requires 2 joins to get a value and the attribute name. This has performance implications.

Where the complexity and performance impact come from

How "user-defined" values are stored in the Entity Attribute Value (EAV) model
How “user-defined” values are stored in the Entity Attribute Value (EAV) model

We can imagine a simple EAV model where the “Entity” contains a single attribute called “Name”. Retrieving the values of “Name” is straightforward. Using the EAV model we can create user-defined attributes called: “Type, Colour, Length and Width”. We can record values for these attributes for any row in the entity table. It is hard to validate the data.

We can retrieve the value of a user-defined attributes using a JOIN. To get the “name” of attribute will require a second join. This can get messy!

SELECT will follow a different path for "user-defined" attributes in the EAV model
SELECT will follow a different path for “user-defined” attributes in the EAV model

Using the values in the EAV tables to identify rows in the Entity table is possible. This may present a challenge to the designers.

Summary – Does the Entity Attribute Value (EAV) Model smell bad?

I agree with Phil Factor. Think hard about whether you should use the EAV model. Using EAV inappropriately can have bad effects.

The Entity Attribute Value model may indicate the requirements are not understood. That is always a bad thing.

Where next?

In the next article I’m going to look at another “SQL Smell”. Phil Factor calls this one “Polymorphic Association”.

 

Requirements Smells make SQL Smells!

 

Your SQL smells because your Requirements smell!
Your SQL smells because your Requirements smell!

Recently I read an article by Phil Factor on the subject of “SQL Smells”. Phil (apparently not his real name), identifies a number of “smells” which he thinks indicate that a database design or SQL code needs to be reviewed. He classifies some of these as “Problems with Database Design”. I would go further and say some of them are problems with database requirements! In other words, your SQL smells because your Requirements smell!

“Requirements Smells cause SQL smells!”

I no longer claim to be a “Developer” and I have never claimed to be a DBA (Database Administrator), though I have found myself in the position of being an “accidental DBA”. The thought that Requirements could smell bad concerned me.

This realisation made me think about problems with Requirements in general and problems with databases in particular. It is better to avoid a problem rather than cure it, so I’m writing a series of blog posts on how to recognise problems in Requirements and prevent them from becoming “SQL Smells”.

Database design and SQL smells

Any computer system contains a “model” of the world it works with. This model forms the foundations of the system. If the system does not contain a concept, then it cannot work with it!

A simplified database design process
A simplified database design process

When people start to create a system they have to decide what concepts their system needs. This is the “Conceptual Model”. This model is transformed through a “Logical Model” until it finally becomes the “Physical Model”, which is the design for the database. The Conceptual and Logical models are not just first-cut versions of the Physical Model, different design decisions and compromises are made at each stage.
This is nothing to do with “Waterfall”, “Agile” or anything to do with any specific development process. In fact, this approach is pretty universal, whether formally or not. Some people combine the different stages, but there are risks to doing that.

A simple way of looking at the Conceptual Model is to say that it is concerned with finding out:

  • What the business and system need: at the conceptual stage these are known as “Entities”
  • What we need to know about those things: these are the “Attributes” of the Entities
  • We also need to document “Business Rules”: some of these will be represented as “Relationships”.

During the design and development process:

  • Entities will tend to become table definitions
  • Attributes will become the columns within those tables
  • Business Rules may become so-called “constraints”.
Different Requirements become affect different aspects of the database
Different Requirements become affect different aspects of the database

A poor Conceptual Model or bad design decisions can lead to systems which are difficult to build, maintain and use, and which do not perform well either. Once again,

“Requirements Smells will cause SQL Smells”

The idea of “smells” can help us address potential problems earlier and more cheaply.

Where are these “Requirements smells”?

I’m going to group my bad smells in a slightly different way to Phil Factor. I primarily work as a Business Analyst, so I am going to concentrate on “smells” to look for at the Conceptual and Logical Stages of specifying the Requirements for a database, starting with the smell that Phil describes as “The God Object”!

Context Diagrams: Putting things in Context

Context Diagrams: Putting things in Context

Simple Context Diagram, application of Context Diagrams
Simple Context Diagram

Every Business Analyst should know something about Context Diagrams. I often draw an informal Context Diagram as the one of my first activities when I start a new project. Context Diagrams are good for focussing the mind and reminding you what you don’t know and need to find out.

What is a Context Diagram?

You have almost certainly seen Context Diagrams, even if you haven’t recognised them by name. A Context Diagram is a shape (usually a rectangle or circle) which represents “the system” which is the focus of our interest. This shape is surrounded by other shapes which represent things like:

  • Users of the system (or actors)
  • Other systems

The satellite shapes are joined to “the system” by lines. Sometimes the arrows on the lines have real significance, sometimes they are there for decoration.

Context Diagrams were commonly used as the top-most level in decomposition methods (such as SSADM). They are still with us in the form of the Use Case diagram in UML.

What will a Context Diagram tell us?

A Context Diagram will tell us about who uses the system we are looking at. It also tells us about the other systems it interacts with. The diagram actually tells us very little about our system!

When are Context Diagrams useful?

Context Diagrams are useful at towards the start of the project. They are good for communication and especially good for summarising who and what interacts with the system.

Although they don’t define our system at all well, they do make it clear what is outside. As a consequence, they are good for communicating “scope”. I even use them to help define scope during project initiation.

A really good use of Context Diagrams is to emphasis interfaces with other systems.

Limitations of Context Diagrams

There is something seductive about a well-drawn Context diagram. It seems to say a great deal, but actually it doesn’t say a lot.

It is wrong to try and make a Context Diagram do too much. Imagine a diagram with tries to show connections with 100 different objects. It would turn into a mess which nobody could read. As a result, the number of satellites is often edited. That makes the diagram easier to read but removes important information.

As a consequence, Context Diagrams are best used for illustration and communication, rather than definition.

Conclusion

Context Diagrams are a great way of providing overview and “putting things in context”! They are easy to produce and people understand them intuitively. They are good for communicating ideas to a non-technical audience.

To get the best from Context Diagrams you have to recognise their limitations. They are good at describing what is outside “the system” but they say very little about the system itself. They are not very good for detailed definition, and if they contain too much detail they actually become less useful!

Business Analysis: Pictures versus Words

Business Analysis: Pictures versus Words?

Image expressing "Pictures versus Words". An image of a pipe and the text "This is a pipe"
Pictures versus Words: This is a pipe

Business Analysts use models which are presented as pictures. Business Analysts also need to write text. This may give rise to a debate:

“Pictures versus Words – which is better?”

(In the picture I’m making a playful reference to Rene Magritte’s famous painting “The Treachery of Images”). Unfortunately, the honest answer to this question has to be “it depends”. It depends on what:

  • You trying to achieve with your model or text?
  • The business side wants and understands?
  • IT side needs and understands?
  • You comfortable with?

Inevitably the answer is going to finish up as a compromise.

The limitations of words

Words used as a bridge between Business and Information Technology
Words alone may form a weak bridge

Words are marvellous! You are reading this blog post! Informally, most projects start off with a short statement “we want to do something”. We add detail to that statement and it becomes a list of “requirements”. That is good, but as we start to describe our “pipe”, sometimes we find that “a picture (or model) really is worth a thousand words”.

In the “Pictures versus Words” debate, the strength of words is found when they are brief and used to define the details. They are less good for explaining how things relate to one-another.

Words alone may make for a weak bridge between business and IT. And beware! Lawyers have been arguing about the precise meanings of words since time immemorial!

The power of models (or pictures)

Orthographic projection sketch of a chair
Orthographic projection sketch of a chair

“Orthographic Projection” is one standard way of arranging different views of the same object. You know very quickly that the diagram describes a type of chair.

Picture models like engineering drawings are strong when they are used to provide overview and show how things relate to one another. By using different views of the same thing we can check each one against the others and detect inconsistencies at an early stage.

Conclusion

In summary, pictures and models are useful and so is text. Use both as appropriate. Add detail as you need it, but limit the amount of detail to what you need. Choose the models you use with care so that you can use them to validate the requirements. My advice is to apply Ward Cunningham’s comment “The simplest thing that could possibly work” to documentation and requirements.

Are you wearing a Business Analysis Hat?

Are you wearing a Business Analysis Hat?

Business Analysis Hat
Business Analysis Hat

Are you wearing a “Business Analysis Hat”? You may still find yourself performing Business Analyst activities even when it is not your job title.

For a great deal of my career my job title has been “Something… Analyst”.  I have found myself performing analyst activities even when “Analyst” has not been mentioned in the job description. Lots of people tell me they have had similar experiences.

Rigid demarcation or flexible roles?

Some organisations have rigid demarcation between roles. In these organisations just reading someone’s job title tells you what they do: Business Analyst, Project Manager, Developer etc.

In smaller, less structured organisations, everybody “mucks in” to do the work.

Even in the more rigid situation, Analysis is an activity overlaps with others. Analysis clarifies what needs to be done. Business Analysis is not so much a question of who is doing it, as what they are doing and what tools they are using. Anyone can find themselves wearing a Business Analysis Hat!

What kind of Business Analysis?

One of the classic metaphors used for Business Analysis is a bridge connecting “Business” and “Information Technology”.  Business Analysts use metaphor a lot. We spend a lot of time saying:

  • “This is a bit like that”, or
  • “This represents that”

When we do this we are trying to “facilitate” communication between the two sides of the bridge and helping each side understand what effect they are having on the other side. “Facilitate” is one of the buzz-words you may hear a lot!

Business Analysis Bridge
Business Analysis Bridge

This facilitation role has several aspects, which are not mutually exclusive. Some of the activities I have seen are:

  • Managing “Requirements” – Put what “The Business” asks for into a formal structure. These “Requirements” are used by “Information Technology”. “The Business” understands the solution in terms of these requirements.
  • Encapsulating IT – Summarise the complexity of “Information Technology” so “The Business” can make informed decisions.
  • Working with Numbers – Use IT tools to understand aspects of “The Business” and present those insights to the business.

What kind of tools will you use?

Analysts use “soft skills” and “documentation skills” but they also make extensive use of “models”. You will hear mention of:

  • Process Models
  • Data Models
  • Use Case Models
  • and several more.

Understanding what the different types of model can give you and choosing the appropriate model for you situation can make communication more effective. I plan to explain some of the different models and notations in later blog posts.

Conclusion

Business Analysis facilitates communication between The Business and Information Technology. Analysis clarifies what needs to be done. Almost anyone can find themselves “wearing the Business Analysis hat”!

Communication is not only in one direction. Analysis uses many kinds of model and using the model which is appropriate for the situation can make communication more effective.