dixonepperson wrrote:

I remember reading years ago that Firebird would compile multiple indexes based on the search requirement and that it was better to do indexes that way instead of creating multiple compound indexes.

eg:

Table1 (
field1,
field2,
field3,
)

this is better

Create Index idx1 on Table1 (field1)
Create Index idx2 on Table1 (field2)
Create Index idx3 on Table1 (field3)

than doing it this way

Create Index idx1 on Table1 (field1)
Create Index idx2 on Table1 (field2)
Create Index idx3 on Table1 (field3)

Create Index idx4 on Table1 (field1, field2)
Create Index idx5 on Table1 (field1, field3)

Is that still correct in 2.5?

Sean Leyne answers:

That approach is not true is 100% of cases. It depends on many factors:

  • How many rows are in the table?
  • How what is the selectivity of the columns?
  • Are there common fields/columns (or groups) used in search?

You need to test/evaluate the best approach for your requirements.

In our case, we have a table which has 5-120 millions of rows (depending on the client), for which the most common search is by 2 fields (Date & Channel). There could be between 2000 and 80000 total rows per Date, and there could be between 1 and 40 Channel values (very poor selectivity). But typically there are only 2000 rows per date+channel, so creating a Date+Channel (actually Channel+Date for the best index compression) yields search performance which is orders of magnitude better than using separate indexes. (SELECT * FROM Table WHERE Channel = xxx and Date = yyyy on 119 million rows returns 1160 rows in 1.5secs)

The engine uses that same index is also used when searches for a range of dates (for a single channel) is used (ie. WHERE Channel = xxx and Date BETWEEN xxx and xxxx).

Ann W. Harrison added:

Sean offered excellent suggestions and example. Just to be very clear... Even using separate indexes, Firebird would read only the rows that met both criteria. It would set a lot of bits in a couple of bitmaps as it read a lot of index pages that the compound index would avoid. There's a cost to that, for sure.

Note

The engine uses that same index is also used when searches for a range of dates (for a single channel) is used (ie. WHERE Channel = xxx and Date BETWEEN xxx and xxxx).

Right, That index could not be used if the query were WHERE Date = xxx and Channel between xxx and yyy - but that's not a logical query in Sean's database.

People who learned database design on other databases tend to create more indexes than are needed in Firebird. Those indexes add to the cost of inserting, updating, and deleting data. If you find yourself thinking that you need an index on fields A & B, and B & C, and C & B & A, and B, A, & C, then you should run some tests and see if you can live with one index on A, one on B, and one on C. Remember to include some data changes in the mix.

If your data is never modified or deleted and you can live with the load time, then add all the compound indexes your heart desires.

Like this post? Share on: TwitterFacebookEmail


Related Articles


Author

Firebird Community

Published

Category

Gems from Firebird Support list

Tags