box: handle stale multikey tuple fields #11550
Merged
locker merged 1 commit into tarantool:master on Jun 2, 2025
After a multikey index is altered or dropped, tuples that were inserted while it existed still reference the old format with multikey parts. Ideally, that multikey format field would no longer be used once the multikey index is gone, but it still can be. The reason is that the format is arranged as a JSON tree, and this tree has an interesting property. Imagine a field with path '[1][*]' in the tree. Then a lookup of the field '[1][1]' (or '[1]["abc"]'; the second key can be anything) returns the field with path '[1][*]', since '*' acts as a wildcard. Hence, even if the format is stale and the actual format has no multikey fields, a lookup can return a multikey field that was not searched for, and `multikey_idx` will be MULTIKEY_NONE in this case. Let's handle this scenario: simply ignore the offset slot when `multikey_idx` is MULTIKEY_NONE for a multikey field.

This bug could lead to an assertion failure in a Debug build and to UB or even a crash in a Release build: when one builds an index over a field that was covered by the multikey index, the field map of tuples referencing the old format would be incorrect, because it would store the offset of the multikey array instead of the actual field.

Closes tarantool#11291

NO_DOC=bugfix
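The wildcard behaviour and the fix described above can be illustrated with a small, self-contained sketch. It is not the Tarantool code: `sketch_field`, `sketch_lookup`, `sketch_usable_slot`, `SKETCH_MULTIKEY_NONE` and `SKETCH_SLOT_NONE` are hypothetical names invented for this illustration, and the flat array of children is a simplifying stand-in for the real JSON tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants; the values are assumptions for this sketch. */
enum { SKETCH_MULTIKEY_NONE = -1 };
#define SKETCH_SLOT_NONE INT32_MAX

/* Simplified stand-in for a format field stored in the JSON tree. */
struct sketch_field {
	const char *key;     /* "*" marks a multikey (wildcard) node */
	bool is_multikey;    /* true if the node belongs to a multikey path */
	int32_t offset_slot; /* field map slot, or SKETCH_SLOT_NONE */
};

/*
 * Look up a child by key. A "*" child matches any key, which is how a
 * lookup of "[1][1]" can return the stale "[1][*]" field.
 */
static const struct sketch_field *
sketch_lookup(const struct sketch_field *children, size_t count,
	      const char *key)
{
	for (size_t i = 0; i < count; i++) {
		if (strcmp(children[i].key, "*") == 0 ||
		    strcmp(children[i].key, key) == 0)
			return &children[i];
	}
	return NULL;
}

/*
 * Return the offset slot that may be used for the field, or
 * SKETCH_SLOT_NONE if the field must be located by decoding the tuple.
 * A multikey field reached without a multikey index (multikey_idx is
 * SKETCH_MULTIKEY_NONE) is treated as stale and its slot is ignored.
 */
static int32_t
sketch_usable_slot(const struct sketch_field *field, int multikey_idx)
{
	if (field == NULL)
		return SKETCH_SLOT_NONE;
	if (field->is_multikey && multikey_idx == SKETCH_MULTIKEY_NONE)
		return SKETCH_SLOT_NONE;
	return field->offset_slot;
}
```

In this sketch, looking up key "1" among children that contain only a "*" node returns that multikey node; with `SKETCH_MULTIKEY_NONE` its slot is ignored, while with a real multikey index position the slot is used as before.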
nshy
approved these changes
Jun 2, 2025
locker
approved these changes
Jun 2, 2025
lenkis
approved these changes
Jun 2, 2025
Cherry-picked to 3.2, 3.3, 3.4.