Scott Price wrote:

I remember from Helen Borrie's book that it said that the IN() predicator doesn't use indexes to select the results. However, I know this is old data.

The question is, does this still apply, and if it does are there any plans to change/fix this to use indexes?

If it does still apply, what would be the recommended equivalent approach in SQL to take instead to maintain a useful level of performance?

Helen Borrie answers:

Nope - at least I hope not. NOT IN() can't use indexes, obviously.

Use IN() for a list of constants (within reason) - it resolves to a series of Or'ed equality tests. Try something different (such as an outer join) to do the NOT IN() logic if you possibly can (eliminating matches using WHERE is null). IN (subquery expression) gets resolved to an EXISTS() predicate, which is short logic and, at worst, will walk the whole result. For your NOT IN(), NOT EXISTS() is a near (but not exact) equivalent. Again, the outer join will get a faster result.

Svein Erling Tysvaer answers:

I think IN has always been able to use indexes - mind you when used in the sense 'IN (1, 5, 3)'. NOT IN is a different matter.

One - unfortunately - common way to use IN, is to use IN(SELECT ...), which can never use indexes and used to execute the subselect for every possible row (needless to say, this was SLOW when the subselect returned lots of rows). I think later versions of Firebird try to transform such a construct into WHERE EXISTS(...), but I doubt that is possible in all cases (haven't thought too much about it, but it sounds difficult to cover all theoretical cases).

So, used with constant values, I like IN, whereas used with subselects, I dislike it - EXISTS (SELECT ...) is at least as easy to read and the best thing you can hope for when using IN (SELECT ...) is that Firebird is able to transform it into EXISTS before it executes anything.

Dmitry Yemanov adds:

It was always executed in the manner similar to the EXIST predicate. The recent FB versions just cover more possible cases, like an embedded (into a sub-query) GROUP BY clause or UNION. For such a transformation being possible, IN is internally made correlated, so that:

<value> IN ( SELECT <field> FROM <tab> WHERE <cond> )

becomes a kind of:

EXISTS ( SELECT * FROM <tab> WHERE <cond> AND <field> = <value> )

And, if the injected equality predicate can be matched to an available index, the subquery becomes indexed and it's executed similar to an inner join but with a fixed join order (outer select -> subselect). So, generally, IN always use indices, but it's done for the inner stream -- not for the outer one as many people expect it to. This is not necessarily bad, although sometimes it is.

Like this post? Share on: TwitterFacebookEmail


Related Articles


Author

Firebird Community

Published

Category

Gems from Firebird Support list

Tags