
raft: fix box.ctl.promote hang/crash#11560

Merged
sergepetrenko merged 4 commits into tarantool:master from
mrForza:mrforza/gh-10836-hang-in-box-ctl-promote
Jul 22, 2025

Conversation

@mrForza
Contributor

@mrForza mrForza commented Jun 2, 2025

Before this patch box.ctl.promote could hang. This happened when a
candidate server, during the promote, received a message from a follower
that a leader had already been seen.

The following scenario reproduces the box.ctl.promote hang:

  1. We have a replica set with 3 servers.
  2. server1 loses its upstream connection to server3.
  3. Then we call box.ctl.promote on server1. The term of server1
    is higher than the terms of the other servers.
  4. While server1 sends broadcast messages to the other servers, server3
    starts an election due to the election timeout and becomes the leader.
    Now the terms of all servers are equal.
  5. During the promote, server1 gets a message from the follower
    node server2 that the leader server3 has already been seen.

The hang happened because the raft state machine did not handle
the scenario where a candidate server, during the election, receives a
message from a follower that a leader has already been seen. It is
important to note that the candidate server has no connection to the
newly elected leader. As a result, the candidate keeps believing
that there is no leader and continues sending election broadcast
messages to the other servers.

Since this kind of promote does not disrupt the cluster, we decided
not to change the low-level raft implementation. To avoid hanging
the user's fiber, we introduce a timeout for box.ctl.promote, after
which control is returned to the user. The promote timeout is set to
election_timeout; once it expires, a TimedOut error is raised.

We also reorder the error checks in box_raft_try_promote to keep them
in priority order. This lets us drop an extra constant and make the
promote timeout equal to election_timeout.

Closes #10836

NO_DOC=bugfix

@mrForza mrForza requested a review from a team as a code owner June 2, 2025 10:07
@mrForza mrForza force-pushed the mrforza/gh-10836-hang-in-box-ctl-promote branch 3 times, most recently from a888515 to 1778ed0 Compare June 2, 2025 15:57
@mrForza mrForza force-pushed the mrforza/gh-10836-hang-in-box-ctl-promote branch 4 times, most recently from 6b2b6ed to 002b6dc Compare June 9, 2025 13:18
@coveralls
Copy link

coveralls commented Jun 9, 2025

Coverage Status

coverage: 87.57% (-0.004%) from 87.574% when pulling 8fa68d4 on
mrForza:mrforza/gh-10836-hang-in-box-ctl-promote into e6ec4aa on
tarantool:master.

Contributor

@Serpentian Serpentian left a comment


Thank you for figuring this out! Great work)

@Serpentian Serpentian assigned mrForza and unassigned Serpentian Jun 10, 2025
@mrForza mrForza force-pushed the mrforza/gh-10836-hang-in-box-ctl-promote branch from 002b6dc to 0fd9744 Compare June 11, 2025 09:14
@mrForza mrForza changed the title raft: fix box.ctl.promote hang raft: fix box.ctl.promote hang/crash Jun 11, 2025
@mrForza mrForza force-pushed the mrforza/gh-10836-hang-in-box-ctl-promote branch from 87b1fb5 to 45d482d Compare June 11, 2025 16:08
@mrForza mrForza assigned Serpentian and Gerold103 and unassigned mrForza Jun 11, 2025
@mrForza mrForza requested review from Gerold103 and Serpentian June 11, 2025 17:09
Collaborator

@Gerold103 Gerold103 left a comment


Thanks for the fixes! Nice, I like that it didn't require changing the basic lib raft. And the commit messages are very clear.

@mrForza mrForza force-pushed the mrforza/gh-10836-hang-in-box-ctl-promote branch from 45d482d to bb80783 Compare June 16, 2025 13:43
Contributor

@Serpentian Serpentian left a comment


We're almost done here) Let's discuss the final comments

@Serpentian Serpentian assigned mrForza and unassigned Serpentian Jun 16, 2025
Collaborator

@sergepetrenko sergepetrenko left a comment


Thanks for the patch!
Let's fix a couple of nits in the tests and we'll be good to go.

mrForza added a commit to mrForza/tarantool that referenced this pull request Jul 8, 2025
After the patch tarantool#11560 the `gh-3055-promote-wakeup-crash` test becomes
unnecessary. It no longer checks the assertion after a spurious
wakeup because `ER_NO_ELECTION_QUORUM` will be raised earlier
(the number of nodes in the replica set is less than `quorum`). Let's drop this test.

Part of tarantool#10836

NO_DOC=test

NO_CHANGELOG=test
@mrForza mrForza force-pushed the mrforza/gh-10836-hang-in-box-ctl-promote branch from f458383 to 97fabcf Compare July 8, 2025 15:24
@mrForza mrForza requested a review from sergepetrenko July 9, 2025 07:28
@mrForza mrForza assigned sergepetrenko and unassigned mrForza Jul 9, 2025
Collaborator

@sergepetrenko sergepetrenko left a comment


Unfortunately, the test's still flaky:

[009] not ok 1	replication-luatest.gh_10836_hang_in_box_ctl_promote.test_promote_not_hangs_during_non_leader_message_about_leader
[009] #   ...cation-luatest/gh_10836_hang_in_box_ctl_promote_test.lua:114: expected: 4, actual: 5
[009] #   stack traceback:
[009] #   	...cation-luatest/gh_10836_hang_in_box_ctl_promote_test.lua:114: in function 'replication-luatest.gh_10836_hang_in_box_ctl_promote.test_promote_not_hangs_during_non_leader_message_about_leader'
[009] #   artifacts:
[009] #   	server1 -> /private/tmp/t/009_replication-luatest/artifacts/rs-pkmBfpHeUWnn/server1-HOcSOMoG9klr
[009] #   	server2 -> /private/tmp/t/009_replication-luatest/artifacts/rs-pkmBfpHeUWnn/server2-2X_ZTCWn12B-
[009] #   	server3 -> /private/tmp/t/009_replication-luatest/artifacts/rs-pkmBfpHeUWnn/server3-VkYs0lK7kixt
[009] ok     2	replication-luatest.gh_10836_hang_in_box_ctl_promote.test_node_not_wait_promote_timeout_after_fiber_death

Let's fix it like this:

diff --git a/test/replication-luatest/gh_10836_hang_in_box_ctl_promote_test.lua b/test/replication-luatest/gh_10836_hang_in_box_ctl_promote_test.lua
index b5351ae490..659038ac18 100644
--- a/test/replication-luatest/gh_10836_hang_in_box_ctl_promote_test.lua
+++ b/test/replication-luatest/gh_10836_hang_in_box_ctl_promote_test.lua
@@ -111,8 +111,10 @@ g.test_promote_not_hangs_during_non_leader_message_about_leader = function()
         _G.server1_promote:set_joinable(true)
     end)
 
-    t.assert_equals(g.server1:get_election_term(),
-                    g.server2:get_election_term())
+    t.helpers.retrying({}, function()
+        t.assert_equals(g.server1:get_election_term(),
+                        g.server2:get_election_term())
+    end)
 
     g.proxy_2_to_1:resume()
     g.proxy_1_to_2:resume()

mrForza added a commit to mrForza/tarantool that referenced this pull request Jul 14, 2025
@mrForza mrForza force-pushed the mrforza/gh-10836-hang-in-box-ctl-promote branch from 97fabcf to ed99fc5 Compare July 14, 2025 08:49
Collaborator

@sergepetrenko sergepetrenko left a comment


Sorry, Roman, but the tests are still flaky. We should fix them before merging, or otherwise they'll irritate other developers and mask the real problems in CI, especially on weaker machines, like our aarch64 runners.

Here's how I test for flakiness (the -j flag might be different for you. I have 16 cores, so 32 jobs in parallel works best for me):

./test-run.py $(yes 10836_hang | head -n 512) -j32

I've seen at least 4 different issues, but have saved only 3 of them, so please fix them and make sure there are no others.
1:

[021] replication-luatest/gh_10836_hang_in_box_ctl_p>                                    [ fail ]
[021] Test failed! Output from reject file /tmp/t/rejects/replication-luatest/gh_10836_hang_in_box_ctl_promote.reject:
[021] Tarantool version is 3.5.0-entrypoint-72-ged99fc5c73
[021] TAP version 13
[021] 1..2
[021] # Started on Tue Jul 15 15:12:19 2025
[021] # Starting group: replication-luatest.gh_10836_hang_in_box_ctl_promote
[021] not ok 1	replication-luatest.gh_10836_hang_in_box_ctl_promote.test_promote_not_hangs_during_non_leader_message_about_leader
[021] #   ...cation-luatest/gh_10836_hang_in_box_ctl_promote_test.lua:81: expected: 3, actual: 2
[021] #   stack traceback:
[021] #   	...cation-luatest/gh_10836_hang_in_box_ctl_promote_test.lua:81: in function 'replication-luatest.gh_10836_hang_in_box_ctl_promote.test_promote_not_hangs_during_non_leader_message_about_leader'
[021] #   artifacts:
[021] #   	server3 -> /tmp/t/021_replication-luatest/artifacts/rs-qKGpcKrU_ARD/server3-sbI5o0sXh_S-
[021] #   	server1 -> /tmp/t/021_replication-luatest/artifacts/rs-qKGpcKrU_ARD/server1-xAf3ZnsVmnqi
[021] ok     2	replication-luatest.gh_10836_hang_in_box_ctl_promote.test_node_not_wait_promote_timeout_after_fiber_death
[021] # Ran 2 tests in 2.241 seconds, 1 succeeded, 1 failed
[021] 

2:

[011] Test failed! Output from reject file /tmp/t/rejects/replication-luatest/gh_10836_hang_in_box_ctl_promote.reject:
[011] Tarantool version is 3.5.0-entrypoint-72-ged99fc5c73
[011] TAP version 13
[011] 1..2
[011] # Started on Tue Jul 15 15:18:20 2025
[011] # Starting group: replication-luatest.gh_10836_hang_in_box_ctl_promote
[011] ok     1	replication-luatest.gh_10836_hang_in_box_ctl_promote.test_promote_not_hangs_during_non_leader_message_about_leader
[011] not ok 2	replication-luatest.gh_10836_hang_in_box_ctl_promote.test_node_not_wait_promote_timeout_after_fiber_death
[011] #   ...cation-luatest/gh_10836_hang_in_box_ctl_promote_test.lua:147: expected: false or nil, actual: true
[011] #   stack traceback:
[011] #   	...cation-luatest/gh_10836_hang_in_box_ctl_promote_test.lua:147: in function <...cation-luatest/gh_10836_hang_in_box_ctl_promote_test.lua:134>
[011] #   	...cation-luatest/gh_10836_hang_in_box_ctl_promote_test.lua:134: in function 'replication-luatest.gh_10836_hang_in_box_ctl_promote.test_node_not_wait_promote_timeout_after_fiber_death'
[011] #   artifacts:
[011] #   	server3 -> /tmp/t/011_replication-luatest/artifacts/rs-aD-kD_FYjfVj/server3-TlitABbHamJ7
[011] #   	server2 -> /tmp/t/011_replication-luatest/artifacts/rs-aD-kD_FYjfVj/server2--oz3A9kqKvJB
[011] # Ran 2 tests in 3.635 seconds, 1 succeeded, 1 failed
[011] 

3:

[027] replication-luatest/gh_10836_crash_in_box_ctl_>                                    [ fail ]
[027] Test failed! Output from reject file /tmp/t/rejects/replication-luatest/gh_10836_crash_in_box_ctl_promote.reject:
[027] Tarantool version is 3.5.0-entrypoint-72-ged99fc5c73
[027] TAP version 13
[027] 1..1
[027] # Started on Tue Jul 15 15:20:52 2025
[027] # Starting group: replication-luatest.gh_10836_crash_in_box_ctl_promote
[027] not ok 1	replication-luatest.gh_10836_crash_in_box_ctl_promote.test_node_not_crashes_while_gaining_quorum_during_promote
[027] #   ...ation-luatest/gh_10836_crash_in_box_ctl_promote_test.lua:78: expected: nil, actual: timed out
[027] #   stack traceback:
[027] #   	...ation-luatest/gh_10836_crash_in_box_ctl_promote_test.lua:78: in function <...ation-luatest/gh_10836_crash_in_box_ctl_promote_test.lua:76>
[027] #   	...ation-luatest/gh_10836_crash_in_box_ctl_promote_test.lua:76: in function 'replication-luatest.gh_10836_crash_in_box_ctl_promote.test_node_not_crashes_while_gaining_quorum_during_promote'
[027] #   artifacts:
[027] #   	server1 -> /tmp/t/027_replication-luatest/artifacts/rs-k-_vDCsG_3oY/server1-PvL9_z8OLwlN
[027] #   	server3 -> /tmp/t/027_replication-luatest/artifacts/rs-k-_vDCsG_3oY/server3-EA3KaqFntxb8
[027] # Ran 1 tests in 7.385 seconds, 0 succeeded, 1 failed
[027] 

@mrForza
Contributor Author

mrForza commented Jul 16, 2025

Since my laptop has only 22 available threads, I ran these two tests (gh_10836_hang and gh_10836_crash) in parallel with 22 workers 1024 times. The tests never failed.

To fix the flaky tests I:

  1. Increased the election_timeout parameter so the node has more time
    to perform the promote (send RPC requests and get responses). This
    helps avoid the TimedOut error in the hang:93 and crash:79 tests.
  2. Increased the replication_timeout parameter so the node waits longer
    for responses from other servers during heartbeats. This helps avoid
    the ER_NO_ELECTION_QUORUM error in the hang:128 and crash:79 tests.
    That scenario can occur when replicas' sockets die; we need more time
    to wait for the connection to be restored.
  3. Added a retrying block around the term assertion because, in rare
    situations, information about the leader's term may not reach some
    servers in time. This helps avoid the term assertion failure in the
    hang:81 test.

I hope these issues are fixed!
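As a rough sketch, the three fixes above look something like this in a luatest-style test. The concrete values, the `g` group object, and the server setup are illustrative assumptions, not the exact committed test code:

```lua
-- Hedged sketch of the flakiness fixes; values are illustrative.
local t = require('luatest')

local box_cfg = {
    -- 1. larger election_timeout: more time to finish the promote
    --    (send RPC requests and receive responses)
    election_timeout = 2,
    -- 2. larger replication_timeout: wait longer for heartbeat replies,
    --    avoiding a spurious ER_NO_ELECTION_QUORUM after a socket dies
    replication_timeout = 0.5,
}

-- 3. retry the term assertion: the leader's term may reach the other
--    servers with a delay, so a one-shot assert is racy
t.helpers.retrying({}, function()
    t.assert_equals(g.server1:get_election_term(),
                    g.server2:get_election_term())
end)
```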

Collaborator

@sergepetrenko sergepetrenko left a comment


Roman, thanks for the fixes!

The tests are stable on my machine now.

@sergepetrenko
Collaborator

@mrForza, the ASAN tests fail, PTAL.

@mrForza
Contributor Author

mrForza commented Jul 18, 2025

Memory leaks occurred because we didn't clear the diag container when
box_raft_try_promote ended successfully.


A memory leak could occur in this common scenario:

  1. We have a replica set and we promote e.g. g.server_1.
  2. In some strange way an external error (e.g. ER_LOADING) is set in
    box_raft_try_promote_f's diag container during the promote operation,
    maybe because of a bug in fibers.
  3. Even if the whole promote operation ends with no promote-specific
    errors in the callback box_raft_try_promote_f, we move the external
    error into ctx.diag anyway.
  4. Then we successfully exit box_raft_try_promote (./src/box/raft.c:549)
    and forget to clear the external error from ctx.diag.
  5. A memory leak occurs.

The evidence of the external error in box_raft_try_promote_f's diag:

2025-07-18 13:15:53.619 [313220] main/123/main raft.c:530 I> !  start box_raft_try_promote
2025-07-18 13:15:53.619 [313220] main/123/main raft.c:891 I> RAFT: bump term to 3, follow
2025-07-18 13:15:53.619 [313220] main/123/main raft.c:929 I> RAFT: vote for 2, follow
2025-07-18 13:15:53.620 [313220] main/119/raft_worker raft.c:785 I> RAFT: persisted state {term: 3, vote: 2}
2025-07-18 13:15:53.620 [313220] main/119/raft_worker raft.c:874 I> RAFT: enter candidate state with 1 self vote
2025-07-18 13:15:53.622 [313220] main/114/applier/unix/:/tmp/t/001_replication-luatest/rs-3UlIwDkhRZbU/server2.sock raft.c:507 I> RAFT: message {term: 3, state: follower} from 1
2025-07-18 13:15:53.622 [313220] main/114/applier/unix/:/tmp/t/001_replication-luatest/rs-3UlIwDkhRZbU/server2.sock raft.c:507 I> RAFT: message {term: 3, vote: 2, state: follower} from 1
2025-07-18 13:15:53.622 [313220] main/114/applier/unix/:/tmp/t/001_replication-luatest/rs-3UlIwDkhRZbU/server2.sock raft.c:844 I> RAFT: enter leader state with quorum 2
2025-07-18 13:15:53.622 [313220] main/114/applier/unix/:/tmp/t/001_replication-luatest/rs-3UlIwDkhRZbU/server2.sock raft.c:469 I> ! RAFT_STATE_LEADER
2025-07-18 13:15:53.622 [313220] main/114/applier/unix/:/tmp/t/001_replication-luatest/rs-3UlIwDkhRZbU/server2.sock raft.c:521 I> ! error in fiber's diag: 1
2025-07-18 13:15:53.622 [313220] main/123/main raft.c:557 I> !  error in ctx: 1
2025-07-18 13:15:53.622 [313220] main/123/main raft.c:559 I> !  end of promote (success); ctx.diag.last->code: 116
2025-07-18 13:15:53.622 [313220] main/119/raft_worker box.cc:445 I> box switched to rw

I added custom logs to box_raft_try_promote_f and box_raft_try_promote in order to track the external error in the diag.

The diff:

diff --git a/src/box/raft.c b/src/box/raft.c
index e0b0b65a0..0871b9348 100644
--- a/src/box/raft.c
+++ b/src/box/raft.c
@@ -465,8 +465,10 @@ box_raft_try_promote_f(struct trigger *trig, void *event)
 	assert(raft == box_raft());
 	struct box_raft_watch_ctx *ctx = trig->data;
 
-	if (raft->state == RAFT_STATE_LEADER)
+	if (raft->state == RAFT_STATE_LEADER) {
+		say_info("!\tRAFT_STATE_LEADER");
 		goto done;
+	}
 	/*
 	 * The checking for quorum loss is not placed inside !is_candidate
 	 * check because the quorum may be restored when raft->is_candidate
@@ -513,6 +515,7 @@ box_raft_try_promote_f(struct trigger *trig, void *event)
 	return 0;
 done:
 	ctx->is_done = true;
+	say_info("!\terror in fiber's diag: %d", diag_get()->last != NULL);
 	diag_move(diag_get(), &ctx->diag);
 	fiber_wakeup(ctx->owner);
 	return 0;
@@ -521,6 +524,7 @@ box_raft_try_promote_f(struct trigger *trig, void *event)
 int
 box_raft_try_promote(void)
 {
+	say_info("!\t");
 	struct raft *raft = box_raft();
 	assert(raft->is_enabled);
 	assert(box_election_mode == ELECTION_MODE_MANUAL ||
@@ -547,6 +551,9 @@ box_raft_try_promote(void)
 	trigger_clear(&trig);
 
 	if (raft->state == RAFT_STATE_LEADER) {
+		say_info("!\terror in ctx: %d", ctx.diag.last != NULL);
+		if (ctx.diag.last != NULL)
+			say_info("!\tend of promote (success); ctx.diag.last->code: %d", ctx.diag.last->code);
 		/*
 		 * Since some errors from other functions may be set in
 		 * box_raft_try_promote_f's diag, we should always clear

mrForza added 4 commits July 21, 2025 11:32
Before this patch, errors that occurred in `applier_f` weren't removed from
the diag container after the function completed. These errors appeared in
other callbacks (e.g. `box_raft_try_promote_f`) that used the
`box_raft_on_update` trigger, which was invoked during the execution of
`applier_f`'s nested functions. As a result, this led to memory leaks,
which were detected in the patch tarantool#11560. To fix this, let's clear
the diag container at the end of `applier_f`.

Needed for tarantool#10836

NO_TEST=<leak fix>
NO_DOC=<leak fix>
NO_CHANGELOG=<leak fix>
Before this patch `box.ctl.promote` could hang. This happened when a
candidate server, during the promote, received a message from a follower
that a leader had already been seen.

The following scenario reproduces the `box.ctl.promote` hang:
1) We have a replica set with 3 servers.
2) `server1` loses its upstream connection to `server3`.
3) Then we call `box.ctl.promote` on `server1`. The term of `server1`
is higher than the terms of the other servers.
4) While `server1` sends broadcast messages to the other servers, `server3`
starts an election due to the election timeout and becomes the leader. Now
the terms of all servers are equal.
5) During the promote, `server1` gets a message from the follower
node `server2` that the leader `server3` has already been seen.

The hang happened because the raft state machine did not handle
the scenario where a candidate server, during the election, receives a
message from a follower that a leader has already been seen. It is
important to note that the candidate server has no connection to the
newly elected leader. As a result, the candidate keeps believing
that there is no leader and continues sending election broadcast
messages to the other servers.

Since this kind of promote does not disrupt the cluster, we decided
not to change the low-level raft implementation. To avoid hanging
the user's fiber, we introduce a timeout for `box.ctl.promote`, after
which control is returned to the user. The promote timeout is set to
`election_timeout`; once it expires, a `TimedOut` error is raised.

We also reorder the error checks in `box_raft_try_promote` to keep them
in priority order. This lets us drop an extra constant and make the
promote timeout equal to `election_timeout`.

Part of tarantool#10836

NO_DOC=bugfix
Before this patch `box.ctl.promote` could lead to a crash. It happened
when a candidate server lost quorum before the promote and regained it
during the promote. The reason for the crash was the opposite state of
the `raft->is_candidate` flag. When the node lost its quorum, this flag
was set to false and the trigger function `box_raft_try_promote_f` ended
without `ctx->is_done`. After the node regained quorum by reconnecting
to other nodes, `raft_restore` set `raft->is_candidate` back to true
according to `election_mode`. As a result, all error checks in
`box_raft_try_promote` failed and the assertion `!raft->is_candidate`
fired.

To fix this crash we change the error-checking logic. Now most errors
are set inside `box_raft_try_promote_f` instead of
`box_raft_try_promote`. This lets us track every critical event
(e.g. loss of quorum or reconfiguration) and exit from the callback
immediately when one happens. As a result, if quorum loss is detected
during the promote, `ER_NO_ELECTION_QUORUM` is raised instantly.

Since we change the core error-checking logic, some replication tests
start to fail:
1. The `election_split_vote` test fails for 2 reasons:
  - The first call of box.ctl.promote fails because the replica set
    doesn't have enough time to establish a connection between server1
    and server2. As a result, `ER_NO_ELECTION_QUORUM` is raised.
  - An incorrect error is raised in
    `test_election_off_demote_other_no_leader` because after calling
    `box.ctl.demote` we don't wait until the downstream connection
    changes its status to `follow`.
2. The `gh_6860_election_off_demote` test fails because it hangs while
trying to grep the log for a discovered split vote. The reason for this
behavior is that when the replication between node1 and node2 breaks,
`ER_NO_ELECTION_QUORUM` is raised and `raft_check_split_vote` doesn't
run.

We fix these tests by:
1. adding `wait_for_fullmesh` before all tests and waiting until the
downstream connection status of server1 becomes `follow`.
2. introducing proxies between node1 and node2. This helps us avoid the
quorum-loss error and force the replica set into the split-vote state.

We also change the `gh-3055-promote-wakeup-crash` test because after our
patch `ER_NO_ELECTION_QUORUM` is always raised in this scenario due to
quorum loss. The crash described in gh-3055 can no longer happen because
we exit with an error in the first iteration of
`box_raft_try_promote_f`.

Closes tarantool#10836

NO_DOC=bugfix
After the patch tarantool#11560 the `gh-3055-promote-wakeup-crash` test becomes
unnecessary. It no longer checks the assertion after a spurious
wakeup because `ER_NO_ELECTION_QUORUM` will be raised earlier
(the number of nodes in the replica set is less than `quorum`). Let's drop this test.

Part of tarantool#10836

NO_DOC=test
NO_CHANGELOG=test
@TarantoolBot
Collaborator

Backport failed for release/3.2, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin release/3.2
git worktree add -d .worktree/backport/release/3.2/11560 origin/release/3.2
cd .worktree/backport/release/3.2/11560
git switch --create backport/release/3.2/11560
git cherry-pick -x 482d3ccc153fba0b736bfd316d49befa43dbc7c6 02920c04155bffab0aab29d6dc520bdf3f220e16 d4f9c9c992bc145a3d0f0e8fea2e701ff71f126a d0242af36253b279f0fddd6887dfd89e679c175b

@TarantoolBot
Collaborator

Backport failed for release/3.3, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin release/3.3
git worktree add -d .worktree/backport/release/3.3/11560 origin/release/3.3
cd .worktree/backport/release/3.3/11560
git switch --create backport/release/3.3/11560
git cherry-pick -x 482d3ccc153fba0b736bfd316d49befa43dbc7c6 02920c04155bffab0aab29d6dc520bdf3f220e16 d4f9c9c992bc145a3d0f0e8fea2e701ff71f126a d0242af36253b279f0fddd6887dfd89e679c175b

@TarantoolBot
Collaborator

Successfully created backport PR for release/3.4:

@TarantoolBot
Collaborator

Backport summary


Labels

backport/3.2 Automatically create a 3.2 backport PR backport/3.3 Automatically create a 3.3 backport PR backport/3.4 Automatically create a 3.4 backport PR


Development

Successfully merging this pull request may close these issues.

Hang/crash in box.ctl.promote()

7 participants