|
|
IBPhoenix Documentation |
|
Firebird Null GuideBy Paul Vinkenoog 15 April 2005NULL behaviour and pitfalls in Firebird SQLTable of Contents
IntroductionTime and again, support questions pop up on the Firebird mailing lists related to “strange things” happening with NULLs in Firebird SQL. The concept seems difficult to grasp - perhaps partly because of the name, which suggests a “nothing” that won't do any harm if you add it to a number or stick it to the back of a string. In reality, performing such operations will render the entire expression NULL. This article explores the behaviour of NULL in Firebird SQL, points out common pitfalls and shows you how to safely deal with expressions that contain NULL or may resolve to NULL. If all you want is a quick reference to refresh your memory, hop straight to the summary (which is really brief indeed).
Note What is NULL?In SQL, NULL is not a value. It is a state indicating that an item's value is unknown or nonexistent. It is not zero or blank or an “empty string” and it does not behave like any of these values. Few things in SQL lead to more confusion than NULL, and yet its workings shouldn't be hard to understand as long as you stick to the following simple definition: NULL means unknown.
Let me repeat that: NULL in expressionsAs many of us have found out to our chagrin, NULL is contagious: use it in a numerical, string or date/time expression, and the result will always be NULL. Use it in a boolean expression, and the outcome depends on the type of operation and the other values involved. Please note, by the way, that in Firebird versions prior to 2.0 it is mostly illegal to use the constant NULL directly in operations or comparisons. Wherever you see NULL in the expressions below, read it as “a field, variable or other expression that resolves to NULL”. Expressions returning NULL The expressions in this list always return NULL:
If you have difficulty understanding why, remember that NULL means “unknown”. Also have a look at the following table where per-case explanations are provided. In the table we don't write NULL in the expressions (as said, this is often illegal); instead, we use two entities A and B that are both NULL. A and B may be fields, variables, or entire subexpressions in their own right - as long as they're NULL, they'll all behave the same in the enclosing expressions. Table 1. Operations on null entities A and B
NULL in boolean expressions We've already seen that not(NULL) yields NULL. For the and and or operators, things are a bit more complicated:
Firebird SQL doesn't have a boolean data type; nor are true and false existing constants. In the left column of the explanatory table below, (true) and (false) represent subexpressions returning true/false. Table 2. Boolean operations on null entity A
All these results are in accordance with boolean logic. The fact that, in order to compute “X or true” and “X and false”, you simply don't need to know X's value, is also the basis of a feature we know from various programming languages: short-circuit boolean evaluation. More logic (or not) The short-circuit results obtained above may lead you to the following ideas:
How is this implemented in Firebird SQL? Well, I'm sorry I have to inform you that despite this compelling logic - and the analogy with the boolean results discussed above - the following expressions all resolve to NULL:
So much for consistency. NULL in aggregate functions In aggregate functions like COUNT, SUM, AVG, MAX, and MIN, NULL is handled differently: to calculate the outcome, only non-NULL fields are taken into consideration. That is, if you have this table:
MyTable
...the statement select sum(Amount) from MyTable returns 54, which is 37 + 5 + 12. Had all five fields been summed, the result would have been NULL. For AVG, the non-NULL fields are summed and the sum divided by the number of non-NULL fields. There is one exception to this rule: COUNT(*) returns the count of all rows, even rows whose fields are all NULL. But COUNT(FieldName) behaves like the other aggregate functions in that it only counts rows where the specified field is not NULL. Another thing worth knowing is that COUNT(*) and COUNT(FieldName) never return NULL: if there are no rows in the set, both functions return 0. Also, COUNT(FieldName) returns 0 if all FieldName fields in the set are NULL. The other aggregate functions return NULL in such cases. Be warned that even SUM returns NULL if used on an empty set, which is contrary to common logic. NULL handling in UDFsUDFs (User Defined Functions) are functions that are not internal to the engine, but defined in separate modules. Firebird ships with two UDF libraries: ib_udf (inherited from InterBase) and fbudf. You can add additional libraries, e.g. by buying or downloading them, or by writing them yourself. UDFs can't be used out of the box; they have to be “declared” to the database first. This is also true for the UDFs that come with Firebird itself. NULL <-> non-NULL conversions you didn't ask for Teaching you how to declare, use, and write UDFs is outside the scope of this guide. However, we must warn you that UDFs can occasionally perform unexpected NULL conversions. This will sometimes result in NULL input being converted to a regular value, and other times in the nullification of valid input like '' (an empty string). The main cause of this problem is that with “old style” UDF calling, it is not possible to pass NULL as input to the function. When a UDF like LTRIM (left trim) is called with a NULL argument, the argument is passed to the function as an empty string. From inside the function there is no way of telling if this argument represents a real empty string or a NULL. So what does the function implementor do? He has to make a choice: either take the argument at face value, or assume it was originally a NULL and treat it accordingly. Depending on the result type, returning NULL may be possible even if receiving NULL isn't. Thus, the following unexpected things can happen:
Both conversions are usually unwanted, but the second probably more so than the first (better validate something NULL than wreck something valid). To get back to our LTRIM example: up to and including Firebird 1.0.3, this function returned NULL if you fed it an emptystring; as from version 1.5, it never returns NULL. In these recent versions, NULL strings are “trimmed” to emptystrings - which is wrong, but is considered the lesser of two evils: in the old situation, valid (empty) strings were mercilessly reduced to NULLs. Being prepared for undesired conversions The unsolicited conversions described above normally only happen with legacy UDFs, but there are a lot of them around (most notably in ib_udf). Also, nothing will stop a careless implementor from doing the same in a new-style function. So the bottom line is: if you use a UDF and you don't know how it behaves with respect to NULL:
The declarations for the shipped UDF libraries can be found in the Firebird subdirectory bin/examples (v. 1.0) or bin/UDF (v. 1.5 and up). Look at the files with extension .sql More on UDFs To learn more about UDFs, consult the InterBase 6.0 Developer's Guide (free at http://www.ibphoenix.com/downloads/60DevGuide.zip), Using Firebird and the Firebird Reference Guide (both on CD), or the Firebird Book. CD and book can be purchased here. NULL in if statementsIf the test expression of an if statement resolves to NULL, the then clause is skipped and the else clause (if present) executed. But beware! The expression may behave like false in this case, but it doesn't have the value false. It's still NULL, and weird things may happen if you forget that. The following examples explore some of the fiendish workings of NULL in if statements:
If a and b are both NULL, MyVariable will yet be “Not equal” after executing this code. The reason is that the expression “a = b” yields NULL if at least one of them is NULL. With the test expression NULL, the then block is skipped, and the else block executed.
Here, MyVariable will be “Equal” if a is NULL and b isn't, or vice versa. The explanation is analogous to that of the previous example.
This one looks like it should give the same results as the previous example, doesn't it? After all, we've inverted the test expression and swapped the then and else clauses. And indeed, as long as neither variable is NULL, both code fragments are equivalent. But as soon as a or b becomes NULL, so does the entire test expression, the else clause is executed, and the result is “Not equal”.
Note Testing if something is NULLIn light of the havoc that NULL can wreak, you will often want to know whether something is NULL before you use it in an expression. To some, the obvious test would seem to be
and indeed there are database management systems that support this syntax to determine nullness. But the SQL standard doesn't allow it, and neither does Firebird. In versions prior to 2.0 this entire syntax is even illegal. From 2.0 onward it is permitted, but the comparison always returns NULL, regardless of A's state and value. It is therefore worthless as a nullness test - what we need is a clear true or false result. The correct way to test for NULL is:
These tests always return true or false - no messing around. Examples:
You could say that whereas “=” (when used as an equality operator) can only compare values, “is” tests for a state. Setting a field or variable to NULLFields and variables can be set to NULL using the same syntax as for regular values:
- “Wait a minute... and you said that MyField = NULL was illegal!” That's right... for the comparison operator “=” (at least in pre-2.0 versions of Firebird). But here we are talking about “=” as an assignment operator. Unfortunately, both operators have the same symbol in SQL. In assignments, whether done with “=” or with an insert list, you can treat NULL just like any value - no special syntax needed (or indeed possible). Dealing with NULLsThis section contains some practical tips and examples that may be of use to you in your everyday dealings with NULLs. Testing for NULL - if it matters Quite often, you don't need to take special measures for fields or variables that may be NULL. For instance, if you do this:
you probably don't want to see the customers whose town is unspecified. Likewise:
doesn't include people of unknown age, which is also defendable. But:
seems less right: if you don't know a person's age, you shouldn't explicitly deny her the right to vote. Worse, this:
won't have the same effect as the previous. If some of the NULL ages are in reality under 18, you're now letting minors vote! The right approach here is to test for NULL explicitly:
Note Finding out if fields are the same Sometimes you want to find out if two fields or variables are the same and you want to consider them equal if they are both NULL. The correct test for this is:
or, if you want to make the precedence of the operations explicit:
A word of warning though: if exactly one of A and B is NULL, the test expression becomes NULL, not false! This is OK in an if statement, and we can even add an else clause which will be executed if A and B are not equal (including when one is NULL and the other isn't):
But you shouldn't get the bright idea of inverting the expression and using it as an inequality test (like I once did):
The above code will work correctly if A and B are both NULL or both non-NULL. But it will fail to execute the then clause if exactly one of them is NULL. If you only want something to be done if A and B are different, either use one of the correct expressions shown above and put a dummy statement in the then clause, or use this longer test expression:
Finding out if a field has changed In triggers, it's often useful to know if a certain field has changed (including: gone from NULL to non-NULL or vice versa) or stayed the same. This is nothing but a special case of testing the (in)equality of two fields. Just use New.Fieldname and Old.Fieldname for A and B:
Substituting NULL with a value The COALESCE function Firebird 1.5 has a function that can convert NULL to most anything else. This enables you to do an on-the-fly conversion and use the result in your further processing, without the need for “if (MyExpression is null) then” constructs. The function is called COALESCE and you call it like this:
COALESCE returns the first non-NULL expression in the argument list. If all the expressions are NULL, it returns NULL. This is how you would use COALESCE to construct a person's full name from the first, middle and last names, assuming that some middle name fields may be NULL:
Or, to create an as-informal-as-possible name from a table that also includes nicknames, and assuming that both nickname and first name may be NULL:
COALESCE will only help you out in situations where NULL can be treated in the same way as some allowed value for the datatype. If NULL needs special handling, like in the “right to vote” example used previously, your only option is to go for “if (MyExpression is null) then” after all. Firebird 1.0: the *NVL functions Firebird 1.0 doesn't have COALESCE. However, you can use four UDFs that provide a good part of its functionality. These UDFs reside in the fbudf lib and they are:
The *NVL functions take exactly two arguments. Like COALESCE, they return the first argument if it's not NULL; otherwise, they return the second. Please note that the Firebird 1.0 fbudf lib - and therefore, the *NVL functions - is only available for Windows. SummaryNULL in a nutshell:
Remember, this is how NULL works in Firebird SQL. There may be (subtle) differences with other RDBMSes. License noticeThe contents of this Documentation are subject to the Public Documentation License Version 1.0 (the “License”); you may only use this Documentation if you comply with the terms of this License. Copies of the License are available at http://www.firebirdsql.org/pdfmanual/pdl.pdf (PDF) and http://www.firebirdsql.org/manual/pdl.html (HTML). The Original Documentation is titled Firebird Null Guide. The Initial Writer of the Original Documentation is: Paul Vinkenoog. Copyright (C) 2005. All Rights Reserved. Initial Writer contact: paulvink at users dot sourceforge dot net. |