2012-12-31

„Ternary boolean logic“ in SQL – making NULL useful

Typically, SQL's NULL is used as „unknown value“ or „Not Applicable“.

For example, you have a table of users who may provide their e-mail, but if they don't, you store NULL to the e-mail column. That's „uknown“.

Other usage example is a column with last user's forum post. But before his first post, you store NULL there to mark there's no post yet.

That's the basic NULL usage. But it can provide you much more – mainly for complex SELECTs used for searching or for reports.

NULL features and performance

On most enterprise-ready RDBMS's, NULL can be indexed.

On some, it also has some special features like MySQL's PRIMARY KEY AUTO_INCREMENT or TIMESTAMP's auto-updating feature.

Storage space for a NULLable column is usually one bit, but these bits are grouped in a byte-aligned field per row, which means, 1 to 8 null columns will occupy 1 byte per row, 9–16 will take 2 bytes, etc.

SELECTing the rows (not) having a NULL value is usually very quick, because this information is part of the index; more, the database system does not have to sort them internally in a tree index, like it has with concrete values (e.g. numeric ID's).

Basic NULL constructs in SQL

  • SELECT a IS NULL
  • SELECT a IS NOT NULL
  • SELECT ISNULL(a)
  • SELECT IFNULL(a, 'default') returns 'default' if a is NULL.
  • SELECT COALESCE(a, b, c) returns the first non-NULL value.

NULL semantics

Null has these characteristics:

Most operations with null result in a NULL.

  • '' + NULLNULL
  • 0 + NULLNULL
  • NOT NULLNULL

Comparison

Comparison always returns NULL, even for NULL = NULL. You have to use IS NULL.

  • '' = NULLNULL
  • 0 = NULLNULL
  • NULL = NULLNULL

There's also a null-safe comparison operator, <=>, which behaves like this:

  • 1 <=> 11
  • NULL <=> NULL1
  • 1 <=> NULL0

There are some NULL-related functions, which provide you a convenient way when you need to work with multiple nullable values:

  • COALESCE()
  • IFNULL()
  • NULLIF()

Boolean logic

Boolean logic is specific. Here, NULL is considered as meaning „Unknown value“.

NULL AND 1NULL NULL AND 00 NULL OR 11 NULL OR 0NULL NULL XOR 1NULL NULL XOR 0 ⇒ NULL

This makes sence, because with AND, you can tell the expression will be false if one of operands is false. However, if one operand is true, it's the other one which determines the result; so if it's unknown, the result is unknown as well.
Similarly it works with OR. For XOR, it's always NULL, as both operands are always needed.

Using NULL features to your advantage

Without learning the basics above, the NULL behavior might seem strange to you and can lead to an unexpected behavior. But if you keep that in mind, you can make your complex SELECTs less complex by using some of the following principles.

To be written later.


0