This issue tracks sporadic PSYNC2 test failures so that we can investigate them, either to fix test timing issues that yield false positives or to look closer for actual bugs. I'll update the top message with the currently active issues. Failures are found by grepping the log files of my CI setup, which gives me a very large history of test runs and lets me check whether a given failure has been happening since the start or only recently.
Problem 1: Instance #3 x variable is inconsistent
This looks like it started happening only recently:
$ grep 'variable is inconsistent' -R .
./run_174881.html:Instance #3 x variable is inconsistent
./run_175366.html:Instance #1 x variable is inconsistent
./run_174283.html:Instance #0 x variable is inconsistent
./run_174579.html:Instance #0 x variable is inconsistent
./run_171368.html:Instance #1 x variable is inconsistent
./run_171175.html:Instance #2 x variable is inconsistent
./run_171709.html:Instance #1 x variable is inconsistent
It's very rare. Checking the first instance of it, in run_171175, shows that the first commit to exhibit the issue is the following: 365316a
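The grep output above is not ordered, so the earliest failing run has to be picked out by eye. A minimal sketch of how the same search could be sorted by run number follows. It assumes the logs are named `run_<number>.html` as in the output above and that run numbers increase over time; `failing_runs` is a hypothetical helper, not part of the CI setup.

```python
import re
from pathlib import Path

# Matches lines such as "Instance #3 x variable is inconsistent".
FAILURE = re.compile(r"Instance #(\d+) x variable is inconsistent")
# Assumption: run logs are named run_<number>.html, as in the grep output.
RUN_NAME = re.compile(r"run_(\d+)\.html$")


def failing_runs(log_dir):
    """Return (run_number, instance_id) pairs sorted by run number."""
    hits = []
    for path in Path(log_dir).rglob("run_*.html"):
        name = RUN_NAME.search(path.name)
        if name is None:
            continue
        # Logs are HTML and may contain odd bytes; do not fail on them.
        text = path.read_text(errors="replace")
        for match in FAILURE.finditer(text):
            hits.append((int(name.group(1)), int(match.group(1))))
    return sorted(hits)
```

Calling `failing_runs(".")` from the log directory returns the failures in run order, so the first element is the earliest run showing the problem.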
The recent commits older than that commit are the following:
4447ddc8b Keep track of meaningful replication offset in replicas too
36ee294e8 PSYNC2: reset backlog_idx and master_repl_offset correctly
97f1c808c PSYNC2: fix backlog_idx when adjusting for meaningful offset
57fa355e5 PSYNC2: meaningful offset implemented.
So this may be a result of the meaningful offset implementation, which could be causing some problem we do not yet know about.
Note that the following commit was added after the first instance of this failure:
4447ddc8b Keep track of meaningful replication offset in replicas too
So if this failure is caused by the meaningful offset itself, it is due to the part of the meaningful offset implementation that runs when the master switches to replica and trims its offset.