Brian wrote:

I have volunteered to write a statistics-type program in Lazarus/FreePascal under Linux and I need an embedded database, that means it has to be Firebird. I've used a lot of database software before, but never Firebird (nor Interbase, for that matter). I do NOT want to force users to install a full-blown database server in order to use the software.

My problem is that I must avoid duplicated records in the database, the unique key is a complicated structure containing four 16-bit words plus a 108-bit set of flags. The combination of the whole lot must be unique. I don't need to retrieve this data other than to check for duplicate records, so I can massage it in FreePascal so that it can go into the database in any form that's desirable.

If I were using PostgreSQL, I'd store the whole lot as a single 172-bit bitstring, but I can't find any mention of an equivalent data type in the Firebird documentation that I've been able to find.

So, how would you store this data for greatest speed/efficiency in checking for duplicates, please? I'm looking at a few million records in the database, and there's a record size of around 350-400 bytes.

Carlos H. Cantu answers:

Maybe storing it into a CHAR(22) using OCTECTS charset?

Helen's book describes octets as:

OCTETS | BINARY
Bytes which will not be interpreted as characters. Useful for storing binary data, GUID strings, hexidecimal numbers and for correcting string data that is causing transliteration errors. Aliased as BINARY from the “2” series onward.

Sean Leyne adds:

While this will work (OCTECTS) and is the smallest storage footprint/representation, I would recommend to sacrifice disk space and to store the key as a string using a base 36 encoding (0-1 + A-Z) -- it makes debugging via tools much easier. Difference space between OCTET and base 36 encoding would only be an additional 7 characters.

Ann W. Harrison answers:

If there's any chance that you'll ever need data from the four 16-bit words, I'd create a five part primary key with four small integers followed by a 14 byte string octets. Firebird index keys are compressed and rearranged so they compare bytewise regardless of the original format - single field or multiple, combinations of strings and numbers, etc. If it were me, I'd let Firebird create the key from data.

Like this post? Share on: TwitterFacebookEmail


Related Articles


Author

Firebird Community

Published

Category

Gems from Firebird Support list

Tags