Federation: parallel shutdown; disconnect links before stopping (backport #15271)#15283

Merged
michaelklishin merged 12 commits into v4.2.x from mergify/bp/v4.2.x/pr-15271
Jan 16, 2026

Conversation


mergify bot commented Jan 16, 2026

Technical design pair: @ansd.

Proposed Changes

This PR makes the most expensive part of federation link shutdown — closing AMQP 0-9-1 connections to the upstream — parallel, by notifying links in the prep_stop shutdown callback.

This yields very significant efficiency gains with hundreds or thousands of links, all without changing the supervisor structure.
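A minimal sketch of the idea in Erlang, with hypothetical module, function, and message names (the actual implementation in the federation plugins differs in detail): in `prep_stop/1` the application's supervision tree is still fully alive, so every link can be told to drop its upstream connection concurrently, before the supervisor tears the processes down one by one.

```erlang
%% Illustrative sketch only: rabbit_federation_status:link_pids/0 and the
%% disconnect_upstream message are hypothetical names, not the real API.
prep_stop(State) ->
    %% The supervision tree is still intact here, so all links can be
    %% asked to close their upstream AMQP 0-9-1 connections in parallel.
    Links = rabbit_federation_status:link_pids(),
    _ = [Pid ! disconnect_upstream || Pid <- Links],
    State.
```

Because the notification is asynchronous, the expensive network round trips to the upstream overlap instead of running back to back.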

Why Not Use simple_one_for_one?

Indeed, the simple_one_for_one OTP supervisor restart strategy would shut down all child processes for us automatically. However, it would require changing the child identity (key) to an Erlang PID, which in turn would require intrusive changes that are painful to test (such as an ETS table that maps link PIDs to their current identities and back).

Throttling to Avoid Overwhelming the Upstream

To avoid overwhelming the upstream schema data store (which could be a 7-9 node cluster on 3.x with Mnesia), we limit the degree of parallelism and add batching with configurable throttling delays into the process.

The entire link shutdown process is now capped at 180 seconds (by default), and should not meaningfully exceed that time period even on nodes with many thousands of links.

By default we close up to 128 links per batch, with a 50 ms delay, and a 180 second hard cap (timeout) for the entire link termination operation.
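The batching loop can be sketched like this (hypothetical function and message names; the 180-second hard cap would be enforced by the caller, e.g. with a timer around the whole operation):

```erlang
%% Illustrative sketch: disconnect links in batches of 128 with a 50 ms
%% pause between batches, so the upstream is not flooded with closures.
-define(BATCH_SIZE, 128).
-define(BATCH_DELAY_MS, 50).

disconnect_in_batches([]) ->
    ok;
disconnect_in_batches(Links) ->
    {Batch, Rest} = split_batch(Links),
    _ = [Pid ! disconnect_upstream || Pid <- Batch],
    timer:sleep(?BATCH_DELAY_MS),
    disconnect_in_batches(Rest).

split_batch(Links) when length(Links) =< ?BATCH_SIZE ->
    {Links, []};
split_batch(Links) ->
    lists:split(?BATCH_SIZE, Links).
```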

Data Safety Considerations

Federation uses publisher confirms by default, and most users never change that setting, so aggressive connection closures are safe and acceptable.

In addition, the user can set `resource-cleanup-mode` to `never` to make sure that the upstream resources (e.g. internal queues used by exchange federation) are never deleted by the links running in the downstream cluster.

Show Me The Benchmark Data

Microbenchmarks (Supervisor Child Process Termination)

Below are some microbenchmarks that measure everything except the actual AMQP 0-9-1 connection termination step, run on an 8-core aarch64 CPU from 2022:

┌───────┬──────────┬────────────┬─────────┐
│ Links │ Parallel │ Sequential │ Speedup │
├───────┼──────────┼────────────┼─────────┤
│ 1,000 │ 16ms     │ 6,401ms    │ ~400x   │
├───────┼──────────┼────────────┼─────────┤
│ 5,000 │ 30ms     │ 32,689ms   │ ~1,000x │
└───────┴──────────┴────────────┴─────────┘

Worst Case Scenario Calculations

If we consider the worst case scenario, where every link connection hits its timeout, 1K links would take about 83 minutes to shut down with the sequential (status quo) version, and about 5.6 seconds (see below) with these changes.

Real World Federation Links with Outgoing Connections

With a throttling delay of 0, the time it takes to shut down N links to a remote upstream cluster looks like this:

┌───────┬─────────────────┐
│ Links │      Time       │
├───────┼─────────────────┤
│ 10    │ 100ms           │
├───────┼─────────────────┤
│ 50    │ 580ms           │
├───────┼─────────────────┤
│ 100   │ 1,067ms         │
├───────┼─────────────────┤
│ 1,000 │ 5,579ms         │
└───────┴─────────────────┘

Maintenance Mode Integration

Maintenance mode integration of these changes needs to be done with care: since maintenance mode stops all client connection listeners, we run the risk of stopping the listeners
before this part of the federation shutdown has a chance to do its job as designed.

For that reason, we have to special-case the federation plugins in the core: first trigger their termination, then stop the listeners.

When the node is revived (the maintenance mode is rolled back), all links are restarted.


This is an automatic backport of pull request #15271 done by Mergify.

michaelklishin and others added 10 commits January 16, 2026 17:26
This yields very significant efficiency gains
with hundreds or thousands of links.

To avoid overwhelming the upstream schema data store
(which could be a 7-9 node cluster on 3.x with Mnesia),
we limit the degree of parallelism and add configurable
throttling delays into the process.

Technical design pair: @ansd.

(cherry picked from commit 1ab7393)
Without it, the new keys (or rather, their defaults) will spill into the `config_schema_SUITE`s of other plugins.

(cherry picked from commit 4ff6b2a)
We implement the `revive/0` part for symmetry. As with the revive command in general, it serves as a last resort available for rollback.

Usually nodes put into maintenance mode are shortly stopped for upgrading or reconfiguration.

(cherry picked from commit 283aa0e)
Previously, the following three supervisors used the wrong `shutdown`
and wrong `type`:
* rabbit_exchange_federation_sup
* rabbit_federation_sup
* rabbit_queue_federation_sup

For `shutdown` Erlang/OTP recommends:
"If the child process is another supervisor, the shutdown time must be
set to infinity to give the subtree ample time to shut down. Setting the
shutdown time to anything other than infinity for a child of type supervisor
can cause a race condition where the child in question unlinks its own children,
but fails to terminate them before it is killed."

For `type` Erlang/OTP recommends:
"type specifies if the child process is a supervisor or a worker.
The type key is optional. If it is not specified, it defaults to
worker."

This commit fixes the wrong child spec by using a timeout of `infinity`
and type `supervisor`.
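Concretely, a child spec following the quoted recommendations would look roughly like this (the `restart` value shown is an assumption, not taken from the actual code):

```erlang
%% A supervisor-type child must use shutdown => infinity and
%% type => supervisor, per the Erlang/OTP documentation quoted above.
#{id       => rabbit_federation_sup,
  start    => {rabbit_federation_sup, start_link, []},
  restart  => transient,            %% assumed; not from the actual code
  shutdown => infinity,             %% give the subtree ample time
  type     => supervisor,           %% not the default `worker`
  modules  => [rabbit_federation_sup]}
```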

(cherry picked from commit cfcf6cf)
(cherry picked from commit e40387e)
## What?

Federation links started in the federation plugins are put
under the `rabbit` app supervision tree (unfortunately).

This commit ensures that the entire federation supervision hierarchies
(including all federation links) are stopped **before** stopping app
`rabbit` when stopping RabbitMQ.

## Why?

Previously, we've seen cases where hundreds of federation links are
stopped during the shutdown procedure in app `rabbit` leading to
federation link restarts happening in parallel to vhosts being stopped.
In one case, the shutdown of app `rabbit` even got stuck (although there
is no evidence that federation was the problem).

Either way, the cleaner approach is to gracefully stop all federation
links, i.e. the entire supervision hierarchy under
`rabbit_exchange_federation_sup` and `rabbit_queue_federation_sup`
when stopping the federation apps, i.e. **before** proceeding to stop
app `rabbit`.

## How?

The boot step cleanup steps for the federation plugins are skipped when
stopping RabbitMQ.

Hence, this commit ensures that the supervisors are stopped in the
stop/1 application callback.
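In outline (the parent supervisor name here is a hypothetical, shown only to illustrate where the shutdown now happens):

```erlang
%% Illustrative sketch: stop the plugin's entire supervision hierarchy,
%% and therefore all of its federation links, in the application stop
%% callback, that is, before app `rabbit` continues its own shutdown.
stop(_State) ->
    _ = supervisor:terminate_child(rabbit_sup, rabbit_exchange_federation_sup),
    ok.
```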

This commit does something similar to #14054
but uses a simpler approach.

(cherry picked from commit 8bffa58)
(cherry picked from commit 512553e)

# Conflicts:
#	deps/rabbitmq_federation_common/src/rabbit_federation_pg.erl
when the core now interacts with a part of the
supervision tree owned by this plugin for
more efficient shutdown.

(cherry picked from commit 807e186)
(cherry picked from commit 19bb842)
(cherry picked from commit 59e9f7a)
mergify bot commented Jan 16, 2026

Cherry-pick of 512553e has failed:

On branch mergify/bp/v4.2.x/pr-15271
Your branch is ahead of 'origin/v4.2.x' by 6 commits.
  (use "git push" to publish your local commits)

You are currently cherry-picking commit 512553e09.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   deps/rabbitmq_exchange_federation/src/rabbit_exchange_federation_app.erl
	modified:   deps/rabbitmq_exchange_federation/src/rabbit_exchange_federation_sup.erl
	modified:   deps/rabbitmq_queue_federation/src/rabbit_queue_federation_app.erl
	modified:   deps/rabbitmq_queue_federation/src/rabbit_queue_federation_sup.erl

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   deps/rabbitmq_federation_common/src/rabbit_federation_pg.erl

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@michaelklishin michaelklishin added this to the 4.2.3 milestone Jan 16, 2026
@michaelklishin michaelklishin merged commit cc2d8d0 into v4.2.x Jan 16, 2026
576 of 577 checks passed
@michaelklishin michaelklishin deleted the mergify/bp/v4.2.x/pr-15271 branch January 16, 2026 20:00
@michaelklishin (Collaborator) commented:

Will backport to v4.1.x manually because the federation plugins structure is different there (it's a single plugin).

@michaelklishin michaelklishin changed the title Federation: disconnect links before stopping, in parallel (backport #15271) Federation: parallel shutdown; disconnect links before stopping (backport #15271) Jan 24, 2026
