WAL does not respect rollback order when its queue is full #11179

@drewdzzz

Description

When a rollback happens, all transactions must be rolled back in the reverse order of their preparation. WAL handles this for its own queue: on rollback, it sends the queue to TX in reversed order. However, a problem arises when the WAL queue is full and some transactions are waiting for space to join it. These waiting transactions must be rolled back before the ones in the WAL queue (and strictly in reverse order), but they are not.
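
The ordering requirement can be sketched with a toy model in plain Lua. It does not use any Tarantool API or reflect the actual WAL code; the function names are hypothetical, transactions are just numbers in preparation order, and the exact position of the waiters in the incorrect order is an assumption made for illustration.

```lua
-- Toy model (plain Lua, no Tarantool API): transactions are numbered in
-- the order they were prepared. All names here are hypothetical.

-- Return the elements of `list` in reverse order.
local function reversed(list)
    local out = {}
    for i = #list, 1, -1 do
        out[#out + 1] = list[i]
    end
    return out
end

-- Expected rollback order: transactions still waiting for space in the
-- queue are rolled back first (newest first), and only then the ones
-- already in the WAL queue (also newest first).
local function expected_rollback_order(wal_queue, waiters)
    local order = reversed(waiters)
    for _, txn in ipairs(reversed(wal_queue)) do
        order[#order + 1] = txn
    end
    return order
end

-- Incorrect order: only the WAL queue is reversed and the waiters are not
-- rolled back ahead of it. Placing them after the queue in their original
-- order is an assumption, not something the issue states.
local function buggy_rollback_order(wal_queue, waiters)
    local order = reversed(wal_queue)
    for _, txn in ipairs(waiters) do
        order[#order + 1] = txn
    end
    return order
end

local wal_queue = {1, 2, 3}  -- fit into the limited queue
local waiters = {4, 5}       -- waiting for space in the queue

print(table.concat(expected_rollback_order(wal_queue, waiters), ' ')) -- 5 4 3 2 1
print(table.concat(buggy_rollback_order(wal_queue, waiters), ' '))    -- 3 2 1 4 5
```

In the second sequence, transaction 3 is undone while 4 and 5, which were prepared on top of it, are still in place. That is the kind of out-of-order rollback described above.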

That's a very serious problem that has caused several fuzzing crashes.
The issue definitely causes #10802 and #10082 (the second one can also be triggered by other problems, though).
It is also a potential cause of https://github.com/tarantool/tarantool-ee/issues/999 and #10283.

Reproducer

local fiber = require('fiber')
local os = require('os')
-- Cleanup
os.execute('rm 000*')

-- Start Tarantool with limited WAL queue.
box.cfg{wal_queue_max_size = 1024}

s = box.schema.space.create('test')
s:create_index('pk')

-- Set WAL delay and do a bunch of transactions replacing the same key.
-- Note that some transactions won't be enqueued to WAL because of the size limit.
box.error.injection.set('ERRINJ_WAL_DELAY', true)
for i = 1, 1500 do
    fiber.create(function()
        s:replace{1, i}
        -- s:truncate()
    end)
end

-- Inject an error and wait for crash.
box.error.injection.set('ERRINJ_WAL_DELAY', false)
box.error.injection.set('ERRINJ_WAL_WRITE', true)

fiber.sleep(1)

Output:

index.cc:232 E> ER_TUPLE_FOUND: Duplicate key exists in unique index "pk" in space "test" with old tuple - [1, 1500] and new tuple - [1, 16]
Assertion failed: (0), function memtx_engine_rollback_statement, file memtx_engine.cc, line 696.

If you comment out s:replace{1, i} and uncomment s:truncate(), you will hit a different crash:

Assertion failed: (old_space_by_id == old_space), function space_cache_replace, file space_cache.c, line 149.

Labels

2.11 (Target is 2.11 and all newer release/master branches)
3.2 (Target is 3.2 and all newer release/master branches)
3.3 (Target is 3.3 and all newer release/master branches)
bug (Something isn't working)
crash
