Daniel L. Miller wrote:

Having just read the thread on CHAR vs SMALLINT for storing a single byte

Does the order/sequence of columns in a table make a difference in storage? I.e., is a table of CHAR(1), CHAR(1), CHAR(1), CHAR(1), SMALLINT, VARCHAR(50) stored differently than CHAR(1), SMALLINT, CHAR(1), VARCHAR(50), CHAR(1), CHAR(1)?

Ann W. Harrison answers:

Yes it is, but it's not worth worrying about. There are two considerations: the number of pad bytes introduced to maintain field alignment in the record buffer and the effect of run-length compression.

The record buffer is constructed so fields are aligned on their natural boundaries. If you've got a CHAR(1) field followed by an int, Firebird with put in three pad bytes so the int starts on a four-byte boundary. It's more efficient to store fields based on their alignment - highest alignment first. If you don't like the way that looks when you do a SELECT *, use the position attribute on the fields to define the output field order.

Run-length compression is data dependent - runs of identical bytes are compressed to a length byte and a data byte, so the amount of storage used will vary with the values stored. Note that the compression is byte-wise. Two-byte characters do not compress well. The biggest gain from compression is the elimination of trailing spaces in long strings and spaces or zeros in null fields. Putting nullable fields together helps compression especially if you separate character fields from dates and number that are zero filled. Varchar fields are both numbers and characters, so they're less compressible when they're null than CHAR. But they're also much easier to work with.

Disks are cheap.

Like this post? Share on: TwitterFacebookEmail


Related Articles


Author

Firebird Community

Published

Category

Gems from Firebird Support list

Tags