
Fix stream SAC coordinator deadlock when deactivating consumer disconnects #15353

Merged
kjnilsson merged 1 commit into main from stream-sac-coordinator-deactivating-consumer-deadlock
Jan 28, 2026

Conversation

@acogoluegnes
Contributor

@acogoluegnes acogoluegnes commented Jan 27, 2026

When a consumer in {connected, deactivating} state had its node
disconnect, it would transition to {disconnected, deactivating}.
This state blocked rebalancing because is_active/1 returned true,
leaving the group with no active consumer indefinitely.
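
The stuck state described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the coordinator's Erlang code: the `Consumer` class, `old_node_disconnected`, and `rebalance` are hypothetical names, and `is_active` is assumed to treat `deactivating` the same as `active`, as the description implies.

```python
from dataclasses import dataclass


@dataclass
class Consumer:
    name: str
    connectivity: str  # "connected" | "disconnected" (model values)
    activity: str      # "active" | "waiting" | "deactivating" (model values)


def is_active(consumer: Consumer) -> bool:
    # Assumption: a deactivating consumer still counts as active,
    # because it holds the active slot until the handshake completes.
    return consumer.activity in ("active", "deactivating")


def old_node_disconnected(consumer: Consumer) -> None:
    # Hypothetical model of the pre-fix behavior: only connectivity changes.
    if consumer.connectivity == "connected":
        consumer.connectivity = "disconnected"


def rebalance(group):
    # Promote the first connected waiting consumer, but only when
    # no consumer in the group counts as active.
    if any(is_active(c) for c in group):
        return None
    for c in group:
        if c.connectivity == "connected" and c.activity == "waiting":
            c.activity = "active"
            return c.name
    return None


group = [
    Consumer("c1", "connected", "deactivating"),
    Consumer("c2", "connected", "waiting"),
]
old_node_disconnected(group[0])  # c1's node goes away mid-handshake
promoted = rebalance(group)      # nobody is promoted: c1 still "is active"
```

In this model `c1` ends up as `("disconnected", "deactivating")` and `c2` stays `waiting`, so the group has no consumer that actually receives messages.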

Changes:

  • handle_group_after_connection_node_disconnected now transitions
    {connected, deactivating} to {disconnected, waiting} and triggers
    rebalancing, since the deactivation handshake cannot complete
  • handle_group_connection_presumed_down applies the same fix as
    defense in depth for {disconnected, deactivating} consumers
  • handle_connection_node_disconnected updated to accumulate effects
    from group processing
  • bump stream coordinator machine version to 6 and handle backward
    compatibility
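
The first change in the list can be sketched as a pure function over consumer states. This is a simplified Python model, not the Erlang implementation: `handle_node_disconnected`, the tuple representation, and the effect strings are assumptions made for illustration, and the promotion rule (first connected waiting consumer) is a stand-in for the coordinator's real rebalancing logic.

```python
from typing import List, Set, Tuple

Status = Tuple[str, str]  # (connectivity, activity), model values only


def is_active(status: Status) -> bool:
    # Assumption: deactivating counts as active, as in the description.
    return status[1] in ("active", "deactivating")


def handle_node_disconnected(
    group: List[Status], affected: Set[int]
) -> Tuple[List[Status], List[str]]:
    """Hypothetical model of the fixed transition plus rebalancing."""
    effects: List[str] = []
    new_group: List[Status] = []
    for index, (connectivity, activity) in enumerate(group):
        if index in affected and connectivity == "connected":
            if activity == "deactivating":
                # The handshake cannot complete, so give up the active slot.
                new_group.append(("disconnected", "waiting"))
            else:
                new_group.append(("disconnected", activity))
        else:
            new_group.append((connectivity, activity))

    # Rebalance: promote one connected waiting consumer if none is active.
    if not any(is_active(status) for status in new_group):
        for index, (connectivity, activity) in enumerate(new_group):
            if connectivity == "connected" and activity == "waiting":
                new_group[index] = ("connected", "active")
                effects.append(f"notify consumer {index}: active")
                break
    return new_group, effects


before = [("connected", "deactivating"), ("connected", "waiting")]
after, effects = handle_node_disconnected(before, affected={0})
```

Returning the effects alongside the new state mirrors the third bullet, where effects from group processing are accumulated rather than dropped.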

Added test cases covering:

  • Node disconnect during deactivation (simple and super streams)
  • Connection down during deactivation
  • Multiple consumers from same connection with one deactivating
  • Reconnection after disconnect during deactivation
  • Presume down with deactivating consumer (simple and super streams)

References rabbitmq/rabbitmq-stream-dotnet-client#447

@acogoluegnes acogoluegnes added this to the 4.3.0 milestone Jan 27, 2026
@acogoluegnes acogoluegnes force-pushed the stream-sac-coordinator-deactivating-consumer-deadlock branch from b457865 to de2ea02 Compare January 27, 2026 17:19
@acogoluegnes acogoluegnes force-pushed the stream-sac-coordinator-deactivating-consumer-deadlock branch from de2ea02 to cd87226 Compare January 28, 2026 09:21
@acogoluegnes acogoluegnes marked this pull request as ready for review January 28, 2026 13:13
@kjnilsson
Contributor

NB: this cannot be backported to 4.1.x, as the next series (4.2) is already released.

@kjnilsson kjnilsson merged commit adab2b1 into main Jan 28, 2026
575 of 577 checks passed
@kjnilsson kjnilsson deleted the stream-sac-coordinator-deactivating-consumer-deadlock branch January 28, 2026 13:57
acogoluegnes added a commit that referenced this pull request Jan 28, 2026
Fix stream SAC coordinator deadlock when deactivating consumer disconnects (backport #15353)