Phil Factor says “using LIKE in a WHERE clause with an initial wildcard character” is an SQL Smell, but why? Why does the position of the wildcard make a difference?
This really is a problem, and it gets more serious with larger tables.
I’m going to explain what the problem is from an analyst’s point of view.
What LIKE does for us
LIKE identifies rows which match the criteria. It can be used with character columns and combined with other conditions.
The first time I encountered this statement I was amazed that it worked at all! This simply demonstrates how SQL protects us from what is really happening underneath the surface.
Appearances can be deceptive. LIKE only appears to match the search argument against the contents of the table column. It has to use an index in order to perform properly. The entries in the index are sorted. SQL uses the sort sequence with “trailing” wildcards, to get to the first match quickly, and then ignore any entries past the last match.
The problem with leading wildcards
SQL cannot use an index to matching entries when there is a leading wildcard. As a consequence it will probably scan the whole table. This will work but it may take longer than you want.
The problem gets worse for tables with more rows because with a leading wildcard LIKE has to test every row.
There will be times when you need to use a leading wildcard it in a query. Go ahead and use it, just expect poor performance. If this query which will be used repeatedly, or in a program, then discuss what you are trying to do with the people who look after your database.
Summary
Phil Factor describes leading wildcards with LIKE as an SQL Smell. This may be unavoidable in some ad-hoc queries, but you should treat it with caution if you intend to reuse the query or use it with large tables.
Where next?
The next, and final, blog post is on “Floating Point Numbers”. Do you use floating point numbers in your database? Phil Factor thinks this is an SQL Smell too. Find out why in the next article.