by Bjørge Sæther
About the sample code
Note
From Firebird 2.0 declare the RDB$DB_KEY variable in a stored procedure as CHARACTER SET OCTETS.
The examples below are the results of months of efforts trying to perform a large financial analysis on a large database (> 4Gb) within reasonable timings. The real breakthrough came with the introduction of the RDB$DB_KEY. In addition to enormous speed increase, it minimized the need for indexing, thus making both development and implementation both safer and easier.
It should be emphasized, though, that you should make comparisons from time to time. After "having seen the light", you may easily overlook simpler (and faster) possible statements.
The samples could of course be rewritten in a more tutorial fashion. This takes quite some time, and I also wanted to show a few real-world examples (as opposed to the "CUSTOMER - like" examples).
And remember: These procedures are being run on tables having up to 6 million rows. The "success stories" of some of them is like reducing execution time from 1 hr 30 min down to 7 min when introducing the db_key.
Another tip when working with large databases: Field/Record size.
Although VARCHAR fields don't occupy more space in the database than the actual string lengths require, there is one often overlooked consequence of having some "head room" in VARCHAR field sizes:
When selecting rows from a table, equally sized record buffers are allocated, so the size is determined from max possible VARCHAR lengths. This is true when selecting from a client, but also when the server is storing a copy of the record (multi-generation handling) while updating & deleting !
So, watch out ! The following update statement will result in the server allocating 100Mb of RAM per some 80-90,000 records.
Table:
CREATE TABLE TEST (
ID INTEGER,
STATUS INTEGER,
SOMESTRING VARCHAR 1024);
UPDATE TEST SET STATUS = 1;
...so, trim your fields !
General
The RDB$DB_KEY may be used is an unique key representing no cost in maintenance or in use. In stored procedures and as a parameter in an update- or delete-sql it is the most efficient record identification you may find in InterBase.
The db_key is not troubled with index balancing,
The db_key is available also when having no unique index. Very useful when it is desirable to drop a unique key while performing updates or deletes.
The db_key does not depend on available server RAM to be efficient, so it's more and more useful as the ration [size of dataset] / [database RAM size] grows. It's possible to perform fast updates/deletes with very little RAM allocated on the server using the db_key.
The db_key is slightly faster than locating rows through an unique index.
No PLANS necessary whatsoever.
Some tips
Use DB_KEY with a FOR SELECT loop when a large part of the records are to be updated, and you would normally use a where-clause.
In such cases, indices are not very efficient.
Example: Different possible sources for getting a certain value, no joins. Approx. 3/4 of the records will be updated:
CREATE PROCEDURE COMPUTE_DELIVER_BONUS
AS
DECLARE VARIABLE v_KEY CHAR(8) CHARACTER SET OCTETS;
DECLARE VARIABLE v_V0 VARCHAR(10);
DECLARE VARIABLE v_V2 VARCHAR(10);
DECLARE VARIABLE v_V6 VARCHAR(10);
DECLARE VARIABLE v_VK1 VARCHAR(10);
DECLARE VARIABLE v_OT VARCHAR(2);
DECLARE VARIABLE v_BONUS FLOAT;
BEGIN
FOR SELECT
RDB$DB_KEY, V0, V6, VK1, V2, ORDER_TYPE
FROM
INVOICE
WHERE
DIREKTE <> 9
INTO :v_KEY, :v_V0, :v_V6, :v_VK1, :v_V2, :v_OT
DO
BEGIN
SELECT MAX(VK2_NUM1) FROM VK2_IMP
WHERE
VK2 = :v_V6 AND
VK2_TEKST1 = :v_VK1 AND
(VK2_NAVN ="A" OR VK2_NAVN = :v_OT) AND
:v_V2 >= VK2_TEKST2 AND
:v_V2 <= VK2_TEKST3 AND
:v_V0 >= VK2_TEKST4 AND
:v_V0 <= VK2_TEKST5
INTO :v_BONUS;
IF (:v_BONUS = 0 OR :v_BONUS IS NULL) THEN BEGIN
SELECT MAX(VK2_NUM1) FROM VK2_IMP
WHERE
VK2 = :v_V6 AND
VK2_TEKST1 = :v_VK1 AND
(VK2_NAVN ="A" OR VK2_NAVN = :v_OT) AND
:v_V2 >= VK2_TEKST2 AND
:v_V2 <= VK2_TEKST3 AND
VK2_TEKST5 = "0000000"
INTO :v_BONUS;
END;
IF (:v_BONUS IS NULL) THEN
v_BONUS = 0;
UPDATE INVOICE SET COST5 = BASIC_COST * :v_BONUS /1000000
WHERE RDB$DB_KEY = :v_KEY;
END
END
Avoid creating an index needed only for conditional updates/deletes, replace with FOR SELECT loop and DB_KEY
Indexing has a cost: It's slowing down delete and update. And, if you do a lot of changes in indexed columns, indices are getting unbalanced, with a penalty in performance.
When you need to do a conditional update / delete on a table it may be a solution to use the db_key, where no indices are needed. The larger the ratio updates / rowcount, the larger is the gain of using db_key.
Setting a value in table based on values in joined tables - use an optimized FOR SELECT loop with DB_KEY and values from joined tables selected into variables
One way to do fast updates when you want to set a value in a table based on field values of a related table, is to create one FOR SELECT statement with related tables joined, and perform updates based on the db_key value of the driving table.
Example: Updating value in smaller table with value retrieved from master table:
CREATE PROCEDURE UPD_RESKONT_IMP_FROM_Fimp
AS
DECLARE VARIABLE v_DBKey CHAR(8) CHARACTER SET OCTETS;
DECLARE VARIABLE v_Faktno VARCHAR(10);
DECLARE VARIABLE v_FaktDato DATE;
DECLARE VARIABLE v_K0 VARCHAR(10);
DECLARE VARIABLE v_FAKT_PERIODE INTEGER;
DECLARE VARIABLE v_IMPORT_PERIODE INTEGER;
DECLARE VARIABLE v_RESKONT_PERIODE INTEGER;
BEGIN
SELECT
CAST(DATA AS INTEGER)
FROM
PROD_PARAMS
WHERE
ID_KEY = "IMPORT_PER"
INTO
:v_IMPORT_PERIODE;
FOR
SELECT
R.RDB$DB_KEY,
F.FAKTNO,
F.FAKT_DATO,
F.K0,
F.FAKT_PERIODE
FROM
RESKONT_IMP R
LEFT JOIN
INVOICE_IMP F
ON
(R.FAKTNO = F.FAKTNO)
WHERE
F.FAKT_DATO IS NOT NULL
OR R.IMPORT_PERIODE = :v_IMPORT_PERIODE
INTO
:v_DBKey,
:v_Faktno,
:v_FaktDato,
:v_K0,
:v_FAKT_PERIODE
DO BEGIN
IF (:v_FaktDato IS NULL) THEN
v_RESKONT_PERIODE = NULL;
ELSE
v_RESKONT_PERIODE = :v_IMPORT_PERIODE;
UPDATE
RESKONT_IMP
SET
FAKT_DATO = :v_FaktDato,
FAKT_PERIODE = :v_FAKT_PERIODE,
IMPORT_PERIODE = :v_RESKONT_PERIODE,
K0 = :v_K0
WHERE
RDB$DB_KEY = :v_DBKey;
END /*FOR SELECT*/
END
Example: Joining master table with 2 smaller tables:
CREATE PROCEDURE ADD_CENTRALLAGER
AS
DECLARE VARIABLE v_KEY CHAR(8) CHARACTER SET OCTETS;
DECLARE VARIABLE v_IsCL VARCHAR(2); /*K4 er Centrallager*/
DECLARE VARIABLE v_HasCL VARCHAR(2); /*K4 har Centrallager*/
DECLARE VARIABLE v_V7 VARCHAR(2);
BEGIN
FOR
SELECT
F.RDB$DB_KEY,
K_Has.K2_TEKST1,
V_Is.V7
FROM
INVOICE_IMP F
LEFT JOIN
K2_IMP K_Has
ON (F.K4 = K_Has.K2)
LEFT JOIN
V7_IMP V_Is
ON (F.K4 = V_Is.V7)
INTO
:v_Key,
:v_HasCL,
:v_IsCL
DO BEGIN
IF (:v_IsCL IS NULL) THEN
:v_V7 = :v_HasCL;
ELSE
:v_V7 = :v_IsCL;
UPDATE
INVOICE_IMP
SET
V7 = :v_IsCL
WHERE
RDB$DB_KEY = :v_KEY;
END
END
Example: Joining master table with smaller table joined twice:
CREATE PROCEDURE ADD_REGIONS /*Region*/
AS
DECLARE VARIABLE v_KEY CHAR(8) CHARACTER SET OCTETS;
DECLARE VARIABLE v_K3_S VARCHAR(2);
DECLARE VARIABLE v_K3_L VARCHAR(2);
BEGIN
FOR
SELECT
F.RDB$DB_KEY,
KSelg.K3,
KLev.K3
FROM
INVOICE_IMP F
LEFT JOIN
K2_IMP KSelg
ON
(F.K2 = KSelg.K2)
LEFT JOIN
K2_IMP KLev
ON
(F.K4 = KLev.K2)
INTO
:v_KEY,
:v_K3_S,
:v_K3_L
DO
UPDATE
INVOICE_IMP
SET
K3 = :v_K3_L,
K9 = :v_K3_S
WHERE
RDB$DB_KEY = :v_KEY;
END
Example: Master table joined with two smaller tables:
CREATE PROCEDURE ADJUST_MHO_FIMP
AS
DECLARE VARIABLE v_KEY CHAR(8) CHARACTER SET OCTETS;
DECLARE VARIABLE pV7 VARCHAR(20); /*=Centrallager*/
DECLARE VARIABLE pV5_T1 VARCHAR(20); /*=MHO Clager*/
DECLARE VARIABLE pV5_T2 VARCHAR(20); /*=MHO Inte Cl*/
DECLARE VARIABLE pMHO VARCHAR(5); /*=MHO justerad*/
BEGIN
/* Sett justert MHO for selgende filial som ikke er sentrallager*/
FOR
SELECT
F.RDB$DB_KEY,
V.V5_TEKST1,
V.V5_TEKST2,
CL.V7
FROM
INVOICE_IMP F
LEFT JOIN
V5_IMP V
ON
(F.V5 = V.V5)
LEFT JOIN
V7_IMP CL
ON
(F.K2_LIKO = CL.V7)
INTO
:v_KEY,
:pV5_T1,
:pV5_T2,
:pV7
DO BEGIN
IF (:pV7 IS NULL OR :pV7 = "") THEN
pMHO = :pV5_T2;
ELSE
pMHO = :pV5_T1;
UPDATE
INVOICE_IMP
SET
MHO_SELG = :pMHO
WHERE
RDB$DB_KEY = :v_KEY;
END
END