Greg Kay wrote:

We are using Classic server 1.52 on a Linux 8 Gb dual CPU opteron 242 with Windows clients. There are about 110 users with 450 connections

Database header page information:
      Oldest transaction      22951
      Oldest active           319143
      Oldest snapshot         317998
      Next transaction        572981

My question is why is the "Oldest snapshot" considerably less than the "Oldest active"? My understanding is that the "Oldest snapshot" should be close to the "Oldest Active".

Ann W.Harrison answers:

Sigh. The serious problem is the difference between Next transaction and Oldest active. But to answer your question ...

Each transaction has a lock on its own identity. Each lock has a name, which is the transaction id of the transaction that owns the lock, and a data space in which it keeps the identity of the oldest transaction that was running when it started.

When a transaction starts, it queries the the transaction locks in the lock table, looking for two pieces of information:

  1. the identity of the oldest transaction currently running and
  2. the identity of the oldest transaction that was running when any active transaction started.

The identity of the oldest running transaction is the minimum value of the lock name (for the transaction lock series, but "lock series" is an advance construct and won't be discussed until next semester.)

The identity of the oldest transaction running when any active transaction started is the data from the lock with the lowest name - or the minimum value of all data values in the transaction lock series.

The transaction puts the first value (oldest currently running) in its transaction lock data area. It uses the second value to determine what old versions can be garbage collected. Anything older than that value can be, nothing equal to or greater can, unless it's know to have rolled back.

OK so far? That was the easy part. It was designed for classic, where each connection has its own process but the state of transactions has to be shared among processes.

The oldest active is kept individual transaction blocks. However, new transactions get their transaction number from the database header page, so the header page must be written whenever a transaction starts. Since the whole page is written, and since people often want to know the state of transaction activity, the oldest active number is also written on the database header page when a transaction starts.

And, to make bookkeeping easier for a shared server, the database block also tracks the oldest active. But for reasons unclear to me, it tracks both the oldest active, and the transaction that was oldest active most recently, which it calls the oldest snapshot. (I think this is a bug). And that value - oldest snapshot - is also written into the database header page.

In a shared server, the oldest active and the oldest snapshot move along pretty smoothly together, barring some unusual combination of activity - lots of commits of old transactions just before the next transaction starts. In Classic, however, the individual processes have different ideas of what the previous oldest active (aka Oldest Snapshot) was because transactions start in different processes, so different processes will have read the lock table at different times. Given the number of connections and clients, that probably explains the difference you see.

Also, assuming we can improve the transaction movement, what else can we do to improve performance?

Firebird 2 improves the performance of garbage collection in two ways. First, indexes are restructured to eliminate the current ~n*n cost of removing duplicate entries. Second, the garbage collect thread is invoked only if removing old versions requires reading additional pages. On-page garbage collection is done cooperatively - and much faster.

Based on the recent article on the Lock Manager in "The Interbase and Firebird Developer Magazine" we have increased the lock manager settings to the maximum with considerable improvement (hash lengths have gone from min 700, avg 800, and max 900, to min 20, avg 40, or max 60, and the mutex wait is down to 2.5%). Is there anything else?

Lobby the developers to increase the maximum width of the hash table.

Like this post? Share on: TwitterFacebookEmail


Related Articles


Author

Firebird Community

Published

Category

Gems from Firebird Support list

Tags