box: handle stale multikey tuple fields #11550

Merged
locker merged 1 commit into tarantool:master from drewdzzz:gh_11291
Jun 2, 2025

Contributor

@drewdzzz drewdzzz commented May 29, 2025

After a multikey index is altered or dropped, tuples that were inserted
while it existed still reference the format with multikey parts. That
multikey format field should no longer be used because the multikey
index is gone, but it can still be reached. This happens because the
format is arranged as a JSON tree, and this tree has an interesting
property. Imagine a field with path '[1][*]' in the tree. Then, when
looking up a field '[1][1]' (or '[1]["abc"]', the second key can be
anything), the field with path '[1][*]' is returned since '*' acts as a
wildcard. Hence, even if the format is stale and the actual format has no
multikey fields, a lookup can return a multikey field that was not
searched for, and multikey_idx will be MULTIKEY_NONE in this case. Let's
handle this scenario: simply ignore the offset slot when multikey_idx is
MULTIKEY_NONE for a multikey field.
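
The wildcard lookup and the guard described above can be illustrated with a small toy model. This is only a sketch in Python, not Tarantool's C implementation: `FieldNode`, `lookup`, `usable_offset_slot`, the fallback order and the numeric value of `MULTIKEY_NONE` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

MULTIKEY_NONE = -1  # illustrative value; only the name comes from the description
WILDCARD = "*"

Token = Union[int, str]


@dataclass
class FieldNode:
    """One node of a toy format tree (hypothetical, not Tarantool's struct)."""
    path: str
    is_multikey: bool = False           # True for a node reached through '[*]'
    offset_slot: Optional[int] = None   # slot in the tuple's field map, if any
    children: Dict[Token, "FieldNode"] = field(default_factory=dict)


def lookup(root: FieldNode, tokens: List[Token]) -> Optional[FieldNode]:
    """Walk the tree token by token; a '*' child matches any token."""
    node = root
    for token in tokens:
        child = node.children.get(token)
        if child is None:
            # Assumed fallback: the wildcard child stands in for any key.
            child = node.children.get(WILDCARD)
        if child is None:
            return None
        node = child
    return node


def usable_offset_slot(node: FieldNode, multikey_idx: int) -> Optional[int]:
    """Ignore the slot of a multikey field when no multikey index is given."""
    if node.is_multikey and multikey_idx == MULTIKEY_NONE:
        return None
    return node.offset_slot


# A stale format that still contains the multikey field '[1][*]'.
root = FieldNode(path="")
root.children[1] = FieldNode(path="[1]")
root.children[1].children[WILDCARD] = FieldNode(
    path="[1][*]", is_multikey=True, offset_slot=0)

found = lookup(root, [1, 1])  # the caller asked for '[1][1]'
print(found.path)                                  # [1][*]
print(usable_offset_slot(found, MULTIKEY_NONE))    # None
```

With the guard in place, a caller that gets `None` back would have to locate the field by decoding the tuple instead of trusting the stale slot.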

Note that this bug could lead to an assertion failure in a Debug build
and to UB or even a crash in a Release build: when one builds an index
over a field that was covered by the multikey index, the field map of
tuples referencing the old format would be incorrect, because it would
store the offset of the multikey array instead of the actual field.
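
To make the consequence concrete, here is a toy sketch of a reader that trusts a stale field map slot. Everything in it is an assumption made for illustration: the tuple is modelled as a nested Python list, an "offset" is modelled as a tuple of list indices, and `resolve`, `parse` and `read_field` are hypothetical helpers, not Tarantool functions.

```python
from typing import Dict, List, Tuple


def resolve(data, offset: Tuple[int, ...]):
    """Follow a stored 'offset' (modelled here as a chain of list indices)."""
    for index in offset:
        data = data[index]
    return data


def parse(data, tokens: List[int]):
    """Find the field by walking the tuple itself along a 1-based path."""
    for token in tokens:
        data = data[token - 1]
    return data


def read_field(data, field_map: Dict[int, Tuple[int, ...]], slot: int,
               tokens: List[int], trust_slot: bool):
    """Read a field through the field map slot, or by parsing the tuple."""
    if trust_slot:
        return resolve(data, field_map[slot])
    return parse(data, tokens)


# Field 1 of the tuple is an array that used to be covered by '[1][*]'.
tuple_data = [[10, 20, 30]]
# Field map built for the old format: slot 0 points at the whole array.
stale_field_map = {0: (0,)}

# A new index over '[1][2]' ends up with the same slot.
wrong = read_field(tuple_data, stale_field_map, 0, [1, 2], trust_slot=True)
right = read_field(tuple_data, stale_field_map, 0, [1, 2], trust_slot=False)
print(wrong)  # [10, 20, 30]
print(right)  # 20
```

In the toy model the stale slot yields the whole array where a scalar key was expected, which is the kind of mismatch the description associates with the assertion failure.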

Closes #11291

coveralls commented May 29, 2025

Coverage Status

coverage: 87.501% (-0.01%) from 87.515% when pulling 934dd35 on drewdzzz:gh_11291 into e9a2071 on tarantool:master.

@drewdzzz drewdzzz force-pushed the gh_11291 branch 2 times, most recently from ac069cc to 1fdbf30 May 30, 2025 07:23
@drewdzzz drewdzzz marked this pull request as ready for review May 30, 2025 07:24
@drewdzzz drewdzzz requested a review from a team as a code owner May 30, 2025 07:24
After a multikey index is altered or dropped, tuples that were inserted
while it existed still reference the format with multikey parts. That
multikey format field should no longer be used because the multikey
index is gone, but it can still be reached. This happens because the
format is arranged as a JSON tree, and this tree has an interesting
property. Imagine a field with path '[1][*]' in the tree. Then, when
looking up a field '[1][1]' (or '[1]["abc"]', the second key can be
anything), the field with path '[1][*]' is returned since '*' acts as a
wildcard. Hence, even if the format is stale and the actual format has no
multikey fields, a lookup can return a multikey field that was not
searched for, and `multikey_idx` will be MULTIKEY_NONE in this case.
Let's handle this scenario: simply ignore the offset slot when
`multikey_idx` is MULTIKEY_NONE for a multikey field.

This bug could lead to an assertion failure in a Debug build and to UB or
even a crash in a Release build: when one builds an index over a field
that was covered by the multikey index, the field map of tuples
referencing the old format would be incorrect, because it would store the
offset of the multikey array instead of the actual field.

Closes tarantool#11291

NO_DOC=bugfix
@nshy nshy removed their assignment Jun 2, 2025
@locker locker assigned drewdzzz and unassigned locker Jun 2, 2025
@drewdzzz drewdzzz added the full-ci label (Enables all tests for a pull request) Jun 2, 2025
@drewdzzz drewdzzz assigned locker and unassigned drewdzzz Jun 2, 2025
@locker locker merged commit 58c7b79 into tarantool:master Jun 2, 2025
58 of 59 checks passed
Member

locker commented Jun 2, 2025

Cherry-picked to 3.2, 3.3, 3.4.

Labels

full-ci Enables all tests for a pull request

Development

Successfully merging this pull request may close these issues.

Assertion failure when trying to create a json-path index after deleting a multikey index over the same field

5 participants