Await for consensus before unproxifying #7558

Merged
timvisee merged 1 commit into dev from safer-init-shard-transfer on Nov 19, 2025

@generall
Member

In production, we observed the "Unwrapping proxy shard" log message, shortly followed by a consensus error.

There are several problems with that:

  • The current logic doesn't prevent "Unwrapping proxy shard" if consensus is significantly delayed on the receiver. It checked that the shard was local, but didn't check whether the relevant transfer status had been received.
  • There was no guarantee that no other consensus operations were executed after "Unwrapping proxy shard".
  • There is no clear way to reproduce the original consensus error, but seeing "Unwrapping proxy shard" in an out-of-order operation is suspicious enough to justify this PR.

What this PR does:

  • Instead of unproxifying and then checking the consensus status, we first wait until the receiver accepts the consensus operation with the exact shard transfer we are waiting for.
  • Only after awaiting consensus do we run the checks on local shards, including the check for "Unwrapping proxy shard".
  • "Unwrapping proxy shard" is promoted to an error, as we should not see it anymore, assuming the logic is not compromised somewhere else.
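
The reordering described above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the code from this PR: TransferKey, ReceiverState, InitError, and check_receiver_ready are hypothetical names, and the sequence of observations stands in for whatever mechanism actually awaits consensus on the receiver.

```rust
use std::collections::HashSet;

/// Hypothetical identifier of a shard transfer (not Qdrant's real type).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct TransferKey {
    shard_id: u32,
    from: u64,
    to: u64,
}

/// Hypothetical snapshot of what the receiver currently knows.
#[derive(Clone, Debug)]
struct ReceiverState {
    /// Transfers the receiver has already applied from consensus.
    known_transfers: HashSet<TransferKey>,
    shard_is_local: bool,
    shard_is_proxy: bool,
}

#[derive(Debug, PartialEq)]
enum InitError {
    /// The receiver never confirmed the expected transfer.
    ConsensusTimeout,
    /// The shard is missing on the receiver after consensus was applied.
    ShardNotLocal,
    /// The shard is still a proxy; with the new ordering this is an error.
    UnexpectedProxy,
}

/// `observations` models successive polls of the receiver state.
/// Assumption: running out of observations is treated as a timeout.
fn check_receiver_ready(
    observations: impl IntoIterator<Item = ReceiverState>,
    expected: &TransferKey,
) -> Result<(), InitError> {
    // Step 1: wait until the receiver has applied the exact transfer
    // we are waiting for. No shard checks happen before this point.
    let state = observations
        .into_iter()
        .find(|state| state.known_transfers.contains(expected))
        .ok_or(InitError::ConsensusTimeout)?;

    // Step 2: only now inspect the local shard.
    if !state.shard_is_local {
        return Err(InitError::ShardNotLocal);
    }
    if state.shard_is_proxy {
        // Previously a log message followed by unwrapping; now a hard error.
        return Err(InitError::UnexpectedProxy);
    }
    Ok(())
}
```

In this sketch, a receiver that lags behind consensus yields ConsensusTimeout rather than reaching the proxy check at all, which is the ordering the bullets above describe.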

@generall requested review from agourlay and ffuugoo on November 18, 2025 14:04
)));
};

if replica_set.is_proxy().await {

I like that we use is_proxy() now, which is much clearer.

@timvisee (Member) left a comment

I did not run (extensive) tests on this locally. I trust our CI and chaos testing for this one.

@timvisee merged commit 113e8aa into dev on Nov 19, 2025
16 checks passed
@timvisee deleted the safer-init-shard-transfer branch on November 19, 2025 13:14
@timvisee mentioned this pull request on Nov 25, 2025