by Bjørge Sæther

About the sample code

Note

From Firebird 2.0 declare the RDB$DB_KEY variable in a stored procedure as CHARACTER SET OCTETS.

The examples below are the results of months of efforts trying to perform a large financial analysis on a large database (> 4Gb) within reasonable timings. The real breakthrough came with the introduction of the RDB$DB_KEY. In addition to enormous speed increase, it minimized the need for indexing, thus making both development and implementation both safer and easier.

It should be emphasized, though, that you should make comparisons from time to time. After "having seen the light", you may easily overlook simpler (and faster) possible statements.

The samples could of course be rewritten in a more tutorial fashion. This takes quite some time, and I also wanted to show a few real-world examples (as opposed to the "CUSTOMER - like" examples).

And remember: These procedures are being run on tables having up to 6 million rows. The "success stories" of some of them is like reducing execution time from 1 hr 30 min down to 7 min when introducing the db_key.

Another tip when working with large databases: Field/Record size.

Although VARCHAR fields don't occupy more space in the database than the actual string lengths require, there is one often overlooked consequence of having some "head room" in VARCHAR field sizes:

When selecting rows from a table, equally sized record buffers are allocated, so the size is determined from max possible VARCHAR lengths. This is true when selecting from a client, but also when the server is storing a copy of the record (multi-generation handling) while updating & deleting !

So, watch out ! The following update statement will result in the server allocating 100Mb of RAM per some 80-90,000 records.

Table:

CREATE TABLE TEST (
   ID INTEGER,
   STATUS INTEGER,
   SOMESTRING VARCHAR 1024);

UPDATE TEST SET STATUS = 1;

...so, trim your fields !

General

The RDB$DB_KEY may be used is an unique key representing no cost in maintenance or in use. In stored procedures and as a parameter in an update- or delete-sql it is the most efficient record identification you may find in InterBase.

The db_key is not troubled with index balancing,

The db_key is available also when having no unique index. Very useful when it is desirable to drop a unique key while performing updates or deletes.

The db_key does not depend on available server RAM to be efficient, so it's more and more useful as the ration [size of dataset] / [database RAM size] grows. It's possible to perform fast updates/deletes with very little RAM allocated on the server using the db_key.

The db_key is slightly faster than locating rows through an unique index.

No PLANS necessary whatsoever.

Some tips

Use DB_KEY with a FOR SELECT loop when a large part of the records are to be updated, and you would normally use a where-clause.

In such cases, indices are not very efficient.

Example: Different possible sources for getting a certain value, no joins. Approx. 3/4 of the records will be updated:

CREATE PROCEDURE COMPUTE_DELIVER_BONUS
AS
  DECLARE VARIABLE v_KEY CHAR(8) CHARACTER SET OCTETS;
  DECLARE VARIABLE v_V0 VARCHAR(10);
  DECLARE VARIABLE v_V2 VARCHAR(10);
  DECLARE VARIABLE v_V6 VARCHAR(10);
  DECLARE VARIABLE v_VK1 VARCHAR(10);
  DECLARE VARIABLE v_OT VARCHAR(2);
  DECLARE VARIABLE v_BONUS FLOAT;
BEGIN
  FOR SELECT
    RDB$DB_KEY, V0, V6, VK1, V2, ORDER_TYPE
  FROM
    INVOICE
  WHERE
    DIREKTE <> 9
  INTO :v_KEY, :v_V0, :v_V6, :v_VK1, :v_V2, :v_OT
DO
  BEGIN
    SELECT MAX(VK2_NUM1) FROM VK2_IMP
       WHERE
         VK2 = :v_V6 AND
         VK2_TEKST1 = :v_VK1 AND
         (VK2_NAVN ="A" OR VK2_NAVN = :v_OT) AND
         :v_V2 >= VK2_TEKST2 AND
         :v_V2 <= VK2_TEKST3 AND
         :v_V0 >= VK2_TEKST4 AND
         :v_V0 <= VK2_TEKST5
       INTO :v_BONUS;
    IF (:v_BONUS = 0 OR :v_BONUS IS NULL) THEN BEGIN
      SELECT MAX(VK2_NUM1) FROM VK2_IMP
         WHERE
           VK2 = :v_V6 AND
           VK2_TEKST1 = :v_VK1  AND
           (VK2_NAVN ="A" OR VK2_NAVN = :v_OT) AND
           :v_V2 >= VK2_TEKST2 AND
           :v_V2 <= VK2_TEKST3 AND
           VK2_TEKST5 = "0000000"
         INTO :v_BONUS;
    END;
    IF (:v_BONUS IS NULL) THEN
      v_BONUS = 0;
    UPDATE INVOICE SET COST5 = BASIC_COST * :v_BONUS /1000000
     WHERE RDB$DB_KEY = :v_KEY;
  END
END

Avoid creating an index needed only for conditional updates/deletes, replace with FOR SELECT loop and DB_KEY

Indexing has a cost: It's slowing down delete and update. And, if you do a lot of changes in indexed columns, indices are getting unbalanced, with a penalty in performance.

When you need to do a conditional update / delete on a table it may be a solution to use the db_key, where no indices are needed. The larger the ratio updates / rowcount, the larger is the gain of using db_key.

Setting a value in table based on values in joined tables - use an optimized FOR SELECT loop with DB_KEY and values from joined tables selected into variables

One way to do fast updates when you want to set a value in a table based on field values of a related table, is to create one FOR SELECT statement with related tables joined, and perform updates based on the db_key value of the driving table.

Example: Updating value in smaller table with value retrieved from master table:

CREATE PROCEDURE UPD_RESKONT_IMP_FROM_Fimp
AS
  DECLARE VARIABLE v_DBKey CHAR(8) CHARACTER SET OCTETS;
  DECLARE VARIABLE v_Faktno VARCHAR(10);
  DECLARE VARIABLE v_FaktDato DATE;
  DECLARE VARIABLE v_K0 VARCHAR(10);
  DECLARE VARIABLE v_FAKT_PERIODE INTEGER;
  DECLARE VARIABLE v_IMPORT_PERIODE INTEGER;
  DECLARE VARIABLE v_RESKONT_PERIODE INTEGER;
BEGIN
  SELECT
    CAST(DATA AS INTEGER)
  FROM
    PROD_PARAMS
  WHERE
    ID_KEY = "IMPORT_PER"
  INTO
    :v_IMPORT_PERIODE;
  FOR
    SELECT
      R.RDB$DB_KEY,
      F.FAKTNO,
      F.FAKT_DATO,
      F.K0,
      F.FAKT_PERIODE
    FROM
      RESKONT_IMP R
    LEFT JOIN
      INVOICE_IMP F
    ON
      (R.FAKTNO = F.FAKTNO)
    WHERE
      F.FAKT_DATO IS NOT NULL
      OR R.IMPORT_PERIODE = :v_IMPORT_PERIODE
    INTO
      :v_DBKey,
      :v_Faktno,
      :v_FaktDato,
      :v_K0,
      :v_FAKT_PERIODE
   DO BEGIN
    IF (:v_FaktDato IS NULL) THEN
      v_RESKONT_PERIODE = NULL;
    ELSE
      v_RESKONT_PERIODE = :v_IMPORT_PERIODE;
    UPDATE
      RESKONT_IMP
    SET
      FAKT_DATO      = :v_FaktDato,
      FAKT_PERIODE   = :v_FAKT_PERIODE,
      IMPORT_PERIODE = :v_RESKONT_PERIODE,
      K0             = :v_K0
    WHERE
      RDB$DB_KEY = :v_DBKey;
  END /*FOR SELECT*/
END

Example: Joining master table with 2 smaller tables:

CREATE PROCEDURE ADD_CENTRALLAGER
AS
DECLARE VARIABLE v_KEY CHAR(8) CHARACTER SET OCTETS;
DECLARE VARIABLE v_IsCL VARCHAR(2); /*K4 er Centrallager*/
DECLARE VARIABLE v_HasCL VARCHAR(2); /*K4 har Centrallager*/
DECLARE VARIABLE v_V7 VARCHAR(2);
BEGIN
  FOR
    SELECT
      F.RDB$DB_KEY,
      K_Has.K2_TEKST1,
      V_Is.V7
    FROM
      INVOICE_IMP F
      LEFT JOIN
      K2_IMP K_Has
    ON (F.K4 = K_Has.K2)
      LEFT JOIN
      V7_IMP V_Is
    ON (F.K4 = V_Is.V7)
      INTO
      :v_Key,
      :v_HasCL,
      :v_IsCL
 DO BEGIN
    IF (:v_IsCL IS NULL) THEN
      :v_V7 = :v_HasCL;
    ELSE
      :v_V7 = :v_IsCL;
       UPDATE
        INVOICE_IMP
      SET
        V7 = :v_IsCL
      WHERE
        RDB$DB_KEY = :v_KEY;
  END
END

Example: Joining master table with smaller table joined twice:

CREATE PROCEDURE ADD_REGIONS   /*Region*/
AS
DECLARE VARIABLE v_KEY CHAR(8) CHARACTER SET OCTETS;
DECLARE VARIABLE v_K3_S VARCHAR(2);
DECLARE VARIABLE v_K3_L VARCHAR(2);
BEGIN
  FOR
    SELECT
      F.RDB$DB_KEY,
      KSelg.K3,
      KLev.K3
    FROM
      INVOICE_IMP F
    LEFT JOIN
      K2_IMP KSelg
    ON
      (F.K2 = KSelg.K2)
    LEFT JOIN
      K2_IMP KLev
    ON
      (F.K4 = KLev.K2)
    INTO
      :v_KEY,
      :v_K3_S,
      :v_K3_L
  DO
    UPDATE
      INVOICE_IMP
    SET
      K3 = :v_K3_L,
      K9 = :v_K3_S
    WHERE
      RDB$DB_KEY = :v_KEY;
END

Example: Master table joined with two smaller tables:

CREATE PROCEDURE ADJUST_MHO_FIMP
AS
  DECLARE VARIABLE v_KEY CHAR(8) CHARACTER SET OCTETS;
  DECLARE VARIABLE pV7 VARCHAR(20); /*=Centrallager*/
  DECLARE VARIABLE pV5_T1 VARCHAR(20); /*=MHO Clager*/
  DECLARE VARIABLE pV5_T2 VARCHAR(20); /*=MHO Inte Cl*/
  DECLARE VARIABLE pMHO VARCHAR(5); /*=MHO justerad*/
BEGIN
/* Sett justert MHO for selgende filial som ikke er sentrallager*/
  FOR
    SELECT
      F.RDB$DB_KEY,
      V.V5_TEKST1,
      V.V5_TEKST2,
      CL.V7
    FROM
      INVOICE_IMP F
    LEFT JOIN
      V5_IMP V
    ON
      (F.V5 = V.V5)
    LEFT JOIN
      V7_IMP CL
    ON
      (F.K2_LIKO = CL.V7)
    INTO
      :v_KEY,
      :pV5_T1,
      :pV5_T2,
      :pV7
  DO BEGIN
    IF (:pV7 IS NULL OR :pV7 = "") THEN
      pMHO = :pV5_T2;
    ELSE
      pMHO = :pV5_T1;
    UPDATE
      INVOICE_IMP
    SET
      MHO_SELG = :pMHO
    WHERE
      RDB$DB_KEY = :v_KEY;
  END
END

Like this post? Share on: TwitterFacebookEmail


Related Articles


Author

Bjørge Sæther

Reading Time

~5 min read

Published

Category

Articles

Tags