Bugs Found
Replication Error (with BLOBs)
GitHub Issue: 8317 Affected versions: 4.0.5, 5.0.1, 6.0 Initial
We are encountering the following error on the replica side:
ERROR: cannot update old BLOB At segment 14297, offset 48
This error is not recoverable. There is no way to purge the transaction file which contains this error to allow for replication to continue.
Could you please answer whether any of the replicated tables contain multiple blob fields? And if so, is it possible that those multiple blob columns may share the same blob contents for some record(s) (e.g. col1 = :my_blob, col2 = :my_blob)?
Yes, there are dozens of tables and several dozen blob columns. Although I can't say for sure, it is certainly architecturally possible that some blob contents could be the same.
Bugs Fixed
Error: "Invalid clumplet buffer structure..." when using trusted auth
GitHub Issue: 8336 Affected versions: 4.0.5, 5.0.1, 6.0 Initial Fixed for: 4.0.6, 5.0.2, 6.0 Alpha 1
Happens when the user account is a member of a large number of Windows Active Directory groups (a few hundred).
Clean up batches if they were not released explicitly before disconnection
Pull Request: 8341 Affected versions: 4.0, 5.0, 6.0 Initial Fixed for: 5.0.2, 6.0 Alpha 1
This avoids a resource leak (it mostly concerns TempSpace). It is easy to reproduce: open a batch, push a significant number of records through it, and disconnect. You'll hit the assertion fb_assert(m_tempCacheUsage == 0); inside ~GlobalObjectHolder() and may also notice temporary files left behind.
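For illustration, here is a rough reproduction sketch using the Firebird 4+ object-oriented client API. The database path, credentials (expected from ISC_USER/ISC_PASSWORD) and the table T1 are assumptions, not part of the original report.

    // Rough reproduction sketch, Firebird 4+ OO API. Build with:
    //   g++ -std=c++17 batch_leak.cpp -lfbclient
    // Database path, credentials (ISC_USER/ISC_PASSWORD) and table
    // T1 (ID INTEGER, VAL VARCHAR(100)) are assumptions.
    #include <firebird/Interface.h>
    #include <firebird/Message.h>

    using namespace Firebird;

    int main()
    {
        IMaster* master = fb_get_master_interface();
        ThrowStatusWrapper status(master->getStatus());
        IProvider* prov = master->getDispatcher();

        try
        {
            IAttachment* att = prov->attachDatabase(&status, "localhost:/tmp/test.fdb", 0, nullptr);
            ITransaction* tra = att->startTransaction(&status, 0, nullptr);

            // Message layout matching the INSERT parameters
            FB_MESSAGE(InMsg, ThrowStatusWrapper,
                (FB_INTEGER, id)
                (FB_VARCHAR(100), val)
            );
            InMsg msg(&status, master);

            IBatch* batch = att->createBatch(&status, tra, 0,
                "INSERT INTO T1 (ID, VAL) VALUES (?, ?)",
                3 /* SQL dialect */, msg.getMetadata(), 0, nullptr);

            // Push a significant number of records through the batch
            // (tune the count to your batch buffer limits).
            for (int i = 0; i < 100000; ++i)
            {
                msg->id = i;
                msg->idNull = 0;
                msg->val.set("some reasonably long payload");
                msg->valNull = 0;
                batch->add(&status, 1, msg.getData());
            }

            // Intentionally no batch->release() and no execute():
            // just disconnect with the batch still open.
            att->detach(&status);
        }
        catch (const FbException&)
        {
            // error handling omitted in this sketch
        }

        prov->release();
        return 0;
    }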
macOS ARM version requires Rosetta
GitHub Issue: 8334 Affected versions: 5.0.1, 6.0 Initial Fixed for: 5.0.2, 6.0 Alpha 1
Make asynchronous replica re-initialization reliable
Pull Request: 8324 Affected versions: 4.0, 5.0, 6.0 Initial Fixed for: 4.0.6, 5.0.2, 6.0 Alpha 1
Currently, when a physical backup is performed, the journal segment is switched from N to N+1 at the backup start, so that the backup file is guaranteed to contain only data up to sequence N (inclusive). However, some long-running writeable transaction could already have some of its changes stored in segments <= N, while its commit event will be stored in a later segment. After re-initialization on the replica side, we continue with segment N+1 and (a) the older changes are lost and (b) the error "Transaction X is not found" usually appears. It means that the replica is inconsistent and must be re-initialized again. But if the primary is under high load, this may happen over and over.
The solution is not to delete segments <= N immediately, but instead to scan them to find the transactions still active at the end of segment N, calculate the new replication OAT (oldest active transaction), delete everything before the OAT, replay the journal (active transactions only) starting at the OAT, and then proceed normally with segment N+1 and beyond.
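Schematically, the new handling can be illustrated with the toy simulation below. All type names and the in-memory journal layout are hypothetical and only mirror the steps described above; they do not correspond to the actual replication code.

    // Toy, self-contained simulation of the scheme above. All names and the
    // in-memory journal layout are hypothetical; real segments are binary
    // files on disk and the records carry actual change data.
    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <set>
    #include <vector>

    using TraNumber = std::uint64_t;
    using SegmentNo = std::uint64_t;

    enum class Op { Start, Change, Commit };
    struct Record { TraNumber tra; Op op; };

    int main()
    {
        // Journal produced up to the backup point N = 3. Transaction 10 commits
        // inside the range; 11 and 12 are still active at the end of segment 3.
        std::map<SegmentNo, std::vector<Record>> journal = {
            {1, {{10, Op::Start}, {10, Op::Change}, {11, Op::Start}}},
            {2, {{11, Op::Change}, {10, Op::Commit}}},
            {3, {{12, Op::Start}, {12, Op::Change}}},
        };
        const SegmentNo n = 3;

        // 1. Scan segments <= N and collect transactions still active at the end of N.
        std::set<TraNumber> active;
        for (const auto& [seq, records] : journal)
        {
            if (seq > n)
                continue;
            for (const Record& rec : records)
            {
                if (rec.op == Op::Start)
                    active.insert(rec.tra);
                else if (rec.op == Op::Commit)
                    active.erase(rec.tra);
            }
        }

        // 2. The new replication OAT is the oldest transaction that is still active.
        const TraNumber oat = *active.begin(); // 11 in this example

        // 3. Delete every segment older than the one where the OAT started.
        SegmentNo keepFrom = 1;
        for (const auto& [seq, records] : journal)
            for (const Record& rec : records)
                if (rec.tra == oat && rec.op == Op::Start)
                    keepFrom = seq;
        journal.erase(journal.begin(), journal.lower_bound(keepFrom));

        // 4. Replay the remaining journal, applying changes of the still
        //    active transactions only (committed ones are already in the backup).
        for (const auto& [seq, records] : journal)
            for (const Record& rec : records)
                if (active.count(rec.tra) && rec.op == Op::Change)
                    std::cout << "replay change of transaction " << rec.tra
                              << " from segment " << seq << '\n';

        // 5. Then continue normal replication with segment N + 1.
        std::cout << "continue with segment " << n + 1 << '\n';
        return 0;
    }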
Missed records in replicated database
GitHub Issue: 8350 Affected versions: 4.0, 5.0.1, 6.0 Initial Fixed for: 4.0.6, 5.0.2, 6.0 Alpha 1
The issue was reported privately by a user. Investigation showed that the master database contains records with a zero blob_id but the null flag not set.
It is not clear how such records are created. However, gbak and a few other places in the code handle this condition, so it appears to be a known state.
The current issue is that such records are not replicated, i.e. they are not written into the replication log by the master and are therefore missing from the replica.
New features/improvements
Report unique usernames for isc_info_user_names
GitHub Issue: 8353 Applies to: 4.0.6, 5.0.2, 6.0 Alpha 1
For administrators, isc_info_user_names reports the username of each connection. If multiple connections have the same username, that name is repeated. Instead, it should report only the unique usernames; this reduces the amount of information transferred and thus the chance of truncation.
Making this change should not break the API, as it makes no promises beyond reporting the usernames, though it might be good to ensure the username of the current connection remains first, as it does currently.
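For reference, the information can be retrieved and decoded roughly as shown below, using the legacy ISC C API. The connection string is an assumption, and the parsing follows the documented clumplet layout of isc_info_user_names (item tag, 2-byte data length, then a counted username per connection).

    // Sketch of querying isc_info_user_names via the legacy ISC API.
    // Build with: g++ -std=c++17 user_names.cpp -lfbclient
    // The connection string is an assumption; credentials are expected
    // from ISC_USER/ISC_PASSWORD (or trusted authentication).
    #include <ibase.h>
    #include <cstdio>

    int main()
    {
        ISC_STATUS_ARRAY status;
        isc_db_handle db = 0;

        if (isc_attach_database(status, 0, "localhost:employee", &db, 0, nullptr))
        {
            isc_print_status(status);
            return 1;
        }

        const char items[] = { isc_info_user_names, isc_info_end };
        char buffer[8192];

        if (isc_database_info(status, &db, sizeof(items), items, sizeof(buffer), buffer))
        {
            isc_print_status(status);
            return 1;
        }

        // One isc_info_user_names clumplet is returned per connection, so the
        // same name may repeat; with this improvement only unique names (with
        // the current connection's name first) would be listed.
        // Clumplet layout: item tag, 2-byte data length, 1-byte name length, name.
        for (const char* p = buffer; *p != isc_info_end; )
        {
            const char item = *p++;
            if (item == isc_info_truncated)
                break;

            const unsigned short len = (unsigned short) isc_vax_integer(p, 2);
            p += 2;

            if (item == isc_info_user_names)
            {
                const unsigned char nameLen = (unsigned char) p[0];
                std::printf("%.*s\n", nameLen, p + 1);
            }

            p += len;
        }

        isc_detach_database(status, &db);
        return 0;
    }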