Use Indices Effectively

Define indices on all columns that you use in joins or sorts. Indices are Balanced-Tree data structures that provide a great improvement in the speed of looking up values in the columns for which the indices are defined. This is especially useful when sorting by a particular column or joining two tables over a particular column they have in common.

Remember that an InterBase index is specific to the direction of sorting (ASCENDING vs. DESCENDING), and the direction of an index must match the direction of sorting to be employed during a sort. If you will be doing both directions of sorting, create two indices on the sort key(s). INSERTing and UPDATEing data requires indices to be updated, which can cause performance to suffer during data INSERT or UPDATE. Some cost incurred while data is entered can result in a big performance win during later data queries.

To minimize the performance hit during INSERT, consider temporarily disabling indices during high-volume INSERTs. This will "turn off" the indices, making them unavailable to help speed up queries, but also making them not be updated by data INSERTs. Then re-enable the indices after INSERTing data. This will update and rebalance the indices once for all the inserted data.

Periodically, rebuild indices by turning them off and on (ALTER INDEX name INACTIVE; followed by ALTER INDEX name ACTIVE;). Recalculate the index selectivity (SET STATISTICS INDEX name;) if a change to the table affects the average distribution of data values. Indices are also rebuilt in this same way during a restore.

Optimize Queries

InterBase will analyze a query and the optimizer will make its best estimate of which indices should be employed to speed up the query. But no automatic optimizer is infallible. For complex queries that will be run frequently, or will be applied to large amounts of data, design the query carefully. Use the SET PLAN option in the ISQL tool to test queries and see their performance and what the optimizer chooses as the query plan. InterBase V4.0, V4.1 and V4.2 has some known bugs such that queries using the JOIN syntax in the ANSI SQL-92 language are not parsed and optimized properly. Avoid this syntax, or if you must use it, specify your own query plan.

Tune InterBase's Cache

InterBase maintains its own cache in the server's RAM of database pages currently in use. The default cache size in V4.0 is 75 database pages. In V4.2, it is 256 database pages (this is shared by all clients who connect to a given database). For a database with 1Kb database pages, this is still only 256Kb. Most servers have a lot of memory. Put it to use! Try tuning the cache up from its default of 256 database pages to as many as 10,000 database pages. At some point, you will find diminishing returns, but some experimentation will reveal that point.

Backup and Restore

There are several benefits to doing periodic backup and restore of an InterBase database using the GBAK utility:

  • rebuild indices;
  • eliminate obsolete record versions ("garbage");
  • defragment database pages;
  • rewrite database tables contiguously; and
  • makes a backup of your data!

Use Client-Server Processing

User Defined Functions (UDFs), Triggers, Stored Procedures and Select Procedures can enable your database to perform a lot of computation on the server, which is probably a much more powerful computer than your client workstations. This technique also avoid unnecessary network traffic. Refer to the documentation (see the InterBase Programmer's Guide and the Data Definition Guide) for details on UDFs, Triggers and Stored Procedures.

Use the Operating System's Disk Caching

Most operating systems offer a cached or buffered disk I/O feature. This allows programs to perform their incremental disk writes to a cache in the server's RAM. The OS will decide to write the contents of the cache to the physical hard disk periodically. Saving up small disk writes and doing them in one larger operation can greatly increase the performance of I/O.

Using the disk cache is sometimes called asynchronous I/O or cached writes. Overriding the cache and forcing I/O operations to write directly to the disk is called synchronous I/O or forced writes. Asynchronous I/O has the benefit of improved performance, but RAM is a volatile storage medium and it can lose data if there is a loss of power or the server crashes or reboots without flushing the cache to disk. Synchronous I/O ensures that nothing will be lost because it is in the cache, but direct disk I/O is many times slower than writing to RAM.

InterBase database I/O allows asynchronous I/O and also synchronous I/O. The GFIX -WRITE SYNC MYDATABASE.GDB command makes I/O on that database use forced writes. The GFIX -WRITE ASYNC MYDATABASE.GDB command makes I/O on that database use cached writes.

Asynchronous I/O can gain a lot of performance benefit for InterBase, but measures should be taken to protect against data loss. For example, use disk mirroring to maintain a duplicate copy of the data. Also, use an Uninterruptible Power Supply (UPS) and surge protector to guarantee a steady supply of power to the server. With measures such as these, asynchronous I/O can provide benefits without sacrificing safety.

Administer the Network Protocol

TCP/IP is a much faster network protocol than either SPX or NetBEUI. Connecting to your NetWare or Windows NT server using TCP/IP will result in better performance.

The NetBEUI service has a low priority on NT servers, and there is a way in the operating system to increase the priority of the NetBEUI service. Some users report improved performance after doing this. Refer to your Windows NT documentation for instructions on how to tune the priority of NetBEUI services.

NOTE: Starting in Windows NT 4.0, the performance of Microsoft's NetBEUI implementation has improved. It is now at least as good as Microsoft's TCP/IP implementation. However, on a busy LAN, NetBEUI is still limited in that it is a connectionless protocol. This means that every host on the LAN must listen to every packet to check the intended recipient. As more hosts broadcast on the LAN, every host must handle an increasing load of "wrong numbers".

Choose Your Operating System

UNIX is faster at multitasking than Windows NT or NetWare, even on the same hardware. If speed is of utmost concern, consider using a UNIX server. If you have an investment in Intel hardware, SCO UNIX can run on most Intel hardware.

Novell NetWare can be used as a file server in addition to an InterBase database server, but when it is, the InterBase NLM takes a back seat in priority to the file services. File services have highest priority, over all other NLMs. The result is that as users read and write files on NetWare volumes belonging to the InterBase server machine, the InterBase performance (and that of all other NLMs on that server) suffers. The solution is to give InterBase its own dedicated server, that does not function as a file server.

On Windows NT Server, by default the server is configured to give priority to filesharing services. You can change this configuration on the server in: Control Panel -> Network -> Installed Network Software -> Configure. Switch this to "balance or database server". This change can result in a dramatic improvement of performance for InterBase.

Consider Garbage Collection

By default, InterBase databases have a built-in function to automatically sweep old record versions when they become too numerous. The drawback of this method is that the client process that is unlucky enough to start the transaction that hits the threshold for this automatic garbage collection has to bear the workload of the database sweep. This client process slows down while the sweep is progressing. Disable automatic garbage collection (using gfix -h 0) in favor of scheduled database sweeps (using gfix -s). This will eliminate the client performance hit.

Doing a backup and restore effectively does the same thing as a full sweep of the database. A backup only backs up the most recent version of each record. A backup never backs up back record versions. Thus when the data is restored, there is only one version of each record restored. There are also other benefits of doing a periodic backup and restore (see the section earlier in this web page).

Speaking of garbage collection, many programmers don't realize they need to START and COMMIT their transactions against an InterBase database. Open transactions prevent the periodic sweeps from completing garbage collection, and performance will suffer progressively over time.

Normalize Your Database

Design your database with proper normalization of data. Records that have lots of repeating groups of fields are larger than they need to be. Large records can increase the cost of sorting, and also cause records to span more pages than is necessary, resulting in more page fragmentation and needlessly large databases.

Code for Concurrency

Design your application so that it resolves transactions as quickly as possible after starting transactions. Often application programmers start a transaction and then display a data entry screen for the user. Instead, get the data from the user first, then start a transaction to INSERT the data, then COMMIT the transaction immediately. The purpose of this is to shorten the duration of transactions, which decreases the likelihood that two concurrent applications will conflict on a record lock. It also lets you to avoid a lot of ROLLBACKs and network traffic, since your application decides when to submit data and when to discard it.

Program with InterBase API

A program written in C using embedded SQL performs better than an InterBase client application written in Paradox, Delphi or other visual client tools. The idea is to program directly to the InterBase API in GDS.DLL instead of adding layers of middleware plus a network protocol on top of it.

Like this post? Share on: TwitterFacebookEmail

Related Articles



Reading Time

~7 min read