memtx: support functional indexes in mvcc#11094
memtx: support functional indexes in mvcc#11094sergepetrenko merged 16 commits intotarantool:masterfrom drewdzzz:mvcc_gc_ro
Conversation
|
@nshy |
There was a problem hiding this comment.
Great drop @drewdzzz! 🔥 I see multikey indexes support for MVCC (#8465) finally just behind the corner 😍
However, to proceed further, I would like to have another look on the functional indexes support for MVCC implementation after you split up the main commit as I propose. Also, I am not sure we can go so fast with the proposed GC and multikey index support changes — maybe we can defer their merge, and focus on merging the functional indexes support before the upcoming release.
These helpers are considered to fully unlink the story. However, they unlink it only from history chains - they doesn't unlink associated transaction statements or trackers. Let's move this logic there. Note that `memtx_tx_story_full_unlink_story_gc_step` isn't affected - it doesn't need to unlink anything except for stories since garbage collector doesn't delete stories with associated statements or trackers. Part of #10771 NO_TEST=refactoring NO_CHANGELOG=refactoring NO_DOC=refactoring
Functions initializing (or deinitializing) subsystems can use some internal functions, for example, to invalidate internal objects. That's exactly that we are going to do. The commit simply moves initializers of memtx mvcc subsystem to the bottom of the file to avoid ugly forward declarations of internal functions. Part of #10771 NO_TEST=refactoring NO_CHANGELOG=refactoring NO_DOC=refactoring
Currently, we call `memtx_tx_manager_free` after `box_free`. The problem is that we free Lua state even earlier, so when `box` is freed, Lua is unavailable. However, when `box_free` destroys space cache, it calls MVCC callback to invalidate deleted spaces and some replaces can happen. If the space has a functional index, replace will call a Lua function and it will crash Tarantool since Lua state is already released. Note that we cannot release Lua state later - we want Lua to release all tuples before disabling the subsystem. To solve the problem, let's initialize transaction manager at the end of `box_init` and free it at the beginning of `box_free`, before space cache is destroyed - it won't do any replaces during invalidation then. Also, the commit slightly updates `memtx_tx_manager_free` - now it also deletes all the stories and unreferences all the tuples. Before the commit, this job was done implicitly in `space_cache_destroy` - it removes all the spaces from space cache, thus, invalidates them (the reason of the crash described above, actually). Note that `memtx_tx_manager_free` doesn't bring spaces up to date - for example, index can contain an already deleted tuple after the function is called. Then, the function doesn't do any replaces - only deletes internal objects. It seems to be OK because it is called right before Tarantool is completely destroyed, so nobody is supposed to use the indexes. Along the way, the commit adds a new test covering shutdown with an active transaction for both `memtx` and `vinyl` engines - just to cover the commit with tests better. Closes #10771 NO_CHANGELOG=later NO_DOC=later
The commit adopts conventional approach with wrappers over virtual methods in mvcc (e.g, calling `tuple_hash` instead of `def->tuple_hash`) to make the code cleaner. Part of #11100 NO_TEST=refactoring NO_CHANGELOG=refactoring NO_DOC=refactoring
In a bunch of functions, related to handling of `count`, we use both `cmp_def` and `key_def` - one for tuple-with-tuple comparison and another one for tuple-with-key comparison. Actually, `cmp_def` can be used for both comparisons, so let's simply drop `key_def` in such places to simplify the code and future refactoring. Part of #11100 NO_TEST=refactoring NO_CHANGELOG=refactoring NO_DOC=refactoring
Sometimes tuple hints are necessary - for example, when index is functional. And we are going to support them in the nearest future, so let's use tuple hint whenever possible. Along the way, let's pass pre-calculated hints to helpers like `memtx_tx_tuple_matches` to calculate them once per tuple when handling `count` requests. Part of #11100 NO_TEST=refactoring NO_CHANGELOG=refactoring NO_DOC=refactoring
The commit encapsulates obtainment of tuple hint in memtx mvcc - the new function `memtx_tx_tuple_hint` should be used for that. In the next commits, it will be used to encapsulate functional indexes (functional keys are passed as hints to functions working with tuples). Part of #11100 NO_TEST=refactoring NO_CHANGELOG=refactoring NO_DOC=refactoring
The commit encapsulates call of `tuple_key_is_excluded` in memtx mvcc, the new function `memtx_tx_tuple_key_is_excluded` should be used instead. In the next commits, it will be used to encapsulate functional indexes (functional keys must be passed to `tuple_key_is_excluded` instead of tuples). Note that the new helpers must be called only for dirty tuples (we will use this property later). That's why the commit moves call of `tuple_key_is_excluded` in `memtx_tx_history_add_insert_stmt` below - we need to call the new helper when the story for the tuple is created because that's when it becomes dirty. Calling `index_replace` for excluded tuples is safe - it's just a no-op actually. Part of #11100 NO_TEST=refactoring NO_CHANGELOG=refactoring NO_DOC=refactoring
|
I forgot to initialize diff --git a/src/box/memtx_tx.c b/src/box/memtx_tx.c
index fc04f3f787..c7b4d70ba7 100644
--- a/src/box/memtx_tx.c
+++ b/src/box/memtx_tx.c
@@ -1126,6 +1126,7 @@ memtx_tx_story_new(struct space *space, struct tuple *tuple)
struct memtx_tx_stats *stats = &txm.story_stats[story->status];
memtx_tx_stats_collect(stats, pool->objsize);
story->tuple_is_retained = false;
+ story->has_func_key = false;
story->index_count = index_count;
story->add_stmt = NULL;
story->add_psn = 0; |
Currently, there are several places where we compare tuples with other tuples, keys, or even calculate their hashes. The problem is if the index is functional we should use the functional key instead of original tuple for such operations. Moreover, an attempt to call `tuple_hash` with functional index definition will fail with an assertion. Firstly, in order to be able to call Lua function inside read methods of a functional index, the commit replaces them with LuaC implementations. There is no mapping from tuple to its functional key for a particular index, so the commit adds one to MVCC - it stores functional keys of dirty tuples (note that amount of dirty tuples is far less than amount of all tuples). However, we can't just obtain all functional keys for a tuple when it becomes dirty - it can happen during a read request to another index, so we can be inside of an FFI call, so we are not allowed to call another Lua function to obtain the functional key. So let's just fill the mapping lazily. Moreover, in this case performance of requests to non-functional indexes won't be affected at all. Note that we could simply call the function of the index to obtain key of a tuple every time we need it, without any mapping. But such approach could dramatically slow things down. For example, if we used `count` often, we would need to call the function for every story on every count! Closes #11100 NO_CHANGELOG=later NO_DOC=later
Currently, when memtx mvcc is enabled, we call function of a functional index twice on each replace. Firstly, it is called under the hood when `index_replace` is called. Secondly, the function is called by MVCC to obtain the functional key - it is needed for collecting possible conflicts. The commit optimizes out the second function call - it makes Tarantool save functional key to MVCC during `index_replace`. Follow-up #11100 NO_CHANGELOG=later NO_DOC=later
All indexes return deleted tuples as a result (in memtx, at least), but by mistake we forgot to set result in functional memtx tree index - the patch fixes the mistake. Note that this small inaccuracy was breaking memtx MVCC since we rely on the replace result there. Closes #11099 NO_CHANGELOG=later NO_DOC=bigfix
In memtx MVCC, we track gap reads right in successors if they are available and then rely on this fact when providing transaction isolation. The problem is, we don't support successors in func indexes, so the isolation works incorrectly. The commit adds successor support to single-key func indexes. Follows up #11099 NO_CHANGELOG=later NO_DOC=later
Current design of memtx MVCC garbage collector produces to major problems. Both problems are caused by implicit index modification - when MVCC deletes the last story of an actually deleted tuple, it has to remove the tuple from the index. The first problem is if there is a functional index in the space, replace will call a Lua function to obtain the key. However, read methods in memtx (e.g. `select`) are called with FFI - it is faster than Lua/C, but have some restrictions, and one of them - Lua call during another FFI call is forbidden. So, using MVCC with function indexes can easily lead to a crash. The second problem is that we have to carefully use memtx handlers because they can modify index and break its iterators, so we always write a large "MEMTX MVCC GC BOUND" notification next to every such call and sometimes manually call the garbage collector. The solution of both problems is simple - just do not call GC in handlers called on read requests! But won't it slow down the garbage collector? Actually, almost nothing will be changed. That's how GC works now - on every creation of a memtx story it accumulates amount of steps for the story GC. Then, later, when the GC is called, it spends all the accumulated steps. The stories are created only in two cases: 1. Insert of a new tuple. Story is always created. 2. Read of an existent non-dirty tuple. Story is created only if there is an active transaction (IOW, only in non-autocommit mode). The commit removes calls of GC in handlers used by read requests, so now it is called when: 1. A write (insert, delete or update) happens. 2. Statements get prepared. 3. Statements get committed. 4. Helper `memtx_tx_clean_txn` is called - actually, it is called for each transaction when it is destroyed. So, after each story creation there will be call of the GC, hence it's useless to call the GC on read requests. Moreover, it leads to lots of problems. Let's just not do this. Closes #7867 Closes #10770 Closes #10775 NO_CHANGELOG=later NO_DOC=later
Helper memtx_tx_gc is implemented with FFI. It's bad because Tarantool isn't allowed to call Lua functions during FFI call, but this function can do it if there is a functional space - it can lead to a crash. Let's rewrite the helper with LuaC - it has no such limitations. Follow-up #10775 NO_TEST=internal helper NO_CHANGELOG=internal NO_DOC=internal
Memtx MVCC does not support multikey indexes, but allows to use them. If one will use them, Tarantool is most likely to crash, and that's the best scenario - otherwise Tarantool will work, but something terribly wrong will be happening under the hood. Let's simply forbid to create such indexes along with MVCC until we support them. Workaround for #6385 NO_CHANGELOG=later NO_DOC=later
The commit adds changelog entries and doc request for the patchset adding support of functional indexes to memtx MVCC. The first changelog entry is about functional indexes themselves. All the issues are considered as bugs and crashes, but since functional indexes were not supported at all, we consider the patch as the feature, so the changelog entry does not reference any issue. The second changelog is about forbidding multikey indexes with MVCC - we should have done it much earlier. NO_TEST=no code changes @TarantoolBot document Title: Memtx MVCC unsupported index types Tarantool with enabled memtx MVCC actually does not support multikey indexes. Since Tarantool 3.4, an error will be thrown is one tries to create such index. In earlier versions, it will lead to a crash or to a panic - we should document it. Note that if one wants to enable MVCC but it has multikey indexes, it has to drop it and do a snapshot so that the definition of the index and data inserted to it will disappear from snapshot and WAL completely. Also, Tarantool with enabled memtx MVCC didn't support functional indexes - it would lead to a crash as well, we should also document it. Since Tarantool 3.4 functional indexes are supported (but only singe-key ones).
|
Another memory leak - we shouldn't save functional keys for indexes that are being built. diff --git a/src/box/memtx_tx.c b/src/box/memtx_tx.c
index c7b4d70ba7..a3752062fe 100644
--- a/src/box/memtx_tx.c
+++ b/src/box/memtx_tx.c
@@ -39,6 +39,7 @@
#include "key_list.h"
#include "schema_def.h"
#include "small/mempool.h"
+#include "space_cache.h"
enum {
/**
@@ -806,9 +807,19 @@ memtx_tx_save_func_key(struct tuple *tuple, struct index *index,
struct tuple *func_key)
{
assert(func_key != NULL);
- if (memtx_tx_manager_use_mvcc_engine &&
- tuple_has_flag(tuple, TUPLE_IS_DIRTY))
- memtx_tx_tuple_func_key_impl(tuple, index, func_key);
+ if (!memtx_tx_manager_use_mvcc_engine ||
+ !tuple_has_flag(tuple, TUPLE_IS_DIRTY))
+ return;
+ /*
+ * Since the stories account only committed schema, we shouldn't
+ * save keys for indexes that are being built. We check that by
+ * looking at the space cache.
+ */
+ struct space *space = space_by_id(index->def->space_id);
+ if (space == NULL || index->dense_id >= space->index_count ||
+ space->index[index->dense_id] != index)
+ return;
+ memtx_tx_tuple_func_key_impl(tuple, index, func_key);
}
/**@CuriousGeorgiy PTAL |
The PR adds support of functional indexes in MVCC. Before the patch, they were silently broken - one could create them, but they wouldn't work as expected and might even crash.
Along the way, the commit reworks MVCC GC a bit. GC itself left intact - but it's usage was revised.
Closes #7867
Closes #10770
Closes #10771
Closes #10775
Closes #11099
Closes #11100