libn/networkdb: stop forging tombstone entries by corhere · Pull Request #50342 · moby/moby

corhere · 2025-07-07T16:34:04Z

- What I did

Fixed "NetworkDB does not always reliably converge" (#47728).

- How I did it
When a node leaves a network, all entries owned by that node are implicitly deleted. The other NetworkDB nodes handle the leave by setting the deleted flag on the entries owned by the departed node in their local stores. This behaviour is problematic as it can result in two conflicting entries with the same Lamport timestamp propagating through the cluster.

Consider two NetworkDB nodes, A and B, which are both joined to some network. Node A in quick succession leaves the network, immediately rejoins it, then creates an entry. If Node B processes the entry-creation event before the network-leave event, it will add the entry to its local store and then set the deleted flag on it upon processing the network-leave. No matter how many times B bulk-syncs with A, B will ignore the live entry because it has the same timestamp as B's local tombstone entry. Once this situation occurs, the only way to recover is for A to update the entry with a new timestamp.
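The scenario above can be reproduced with a toy model. This is only a sketch: the `entry` and `store` types, the `apply` and `forgeTombstones` helpers, and the "accept only strictly newer timestamps" rule are simplified assumptions for illustration, not the actual NetworkDB types or API.

```go
package main

import "fmt"

// entry is a hypothetical, simplified stand-in for a NetworkDB table entry.
type entry struct {
	owner    string
	ltime    uint64 // Lamport timestamp assigned by the owner
	deleting bool   // true when the entry is a tombstone
	value    string
}

// store models one node's local table, keyed by entry key.
type store map[string]entry

// apply merges a received entry. It is ignored unless it is strictly newer
// than the stored one (an assumed simplification of the merge rule).
func (s store) apply(key string, e entry) bool {
	if cur, ok := s[key]; ok && e.ltime <= cur.ltime {
		return false
	}
	s[key] = e
	return true
}

// forgeTombstones models the old leave handling: flag every entry owned by
// the departed node as deleted while keeping its original timestamp.
func (s store) forgeTombstones(owner string) {
	for k, e := range s {
		if e.owner == owner {
			e.deleting = true
			s[k] = e
		}
	}
}

func main() {
	b := store{} // node B's local store
	live := entry{owner: "A", ltime: 5, value: "10.0.0.2"}

	// B processes A's post-rejoin entry creation first, then the stale leave.
	b.apply("ep1", live)
	b.forgeTombstones("A")

	// A bulk-sync resends the same live entry, but B rejects it because the
	// forged tombstone carries an equal timestamp.
	accepted := b.apply("ep1", live)
	fmt.Println(accepted, b["ep1"].deleting) // prints "false true"
}
```

In this model the entry stays stuck as deleted on B until A republishes it with a timestamp greater than 5.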

There is no need for a node to store forged tombstones for another node's entries. All nodes will purge the entries naturally when they process the network-leave or node-leave event. Simply delete the non-owned entries from the local store so there is no inconsistent state to interfere with convergence when nodes rejoin a network. Have nodes update their local store with tombstones for entries when leaving a network so that after a rapid leave-then-rejoin the entry deletions propagate to nodes which may have missed the leave event.
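A minimal sketch of the two halves of this change, using the same kind of toy model. All names here (`entry`, `store`, `apply`, `purgeOwnedBy`, `tombstoneOwn`) are hypothetical, and stamping the owner's tombstones with a fresh clock value is an assumption of the sketch, not something the description states.

```go
package main

import "fmt"

// entry is a hypothetical, simplified stand-in for a NetworkDB table entry.
type entry struct {
	owner    string
	ltime    uint64 // Lamport timestamp assigned by the owner
	deleting bool   // true when the entry is a tombstone
}

// store models one node's local table, keyed by entry key.
type store map[string]entry

// apply accepts an entry only if it is strictly newer than the stored one
// (an assumed simplification of the merge rule).
func (s store) apply(key string, e entry) bool {
	if cur, ok := s[key]; ok && e.ltime <= cur.ltime {
		return false
	}
	s[key] = e
	return true
}

// purgeOwnedBy models a peer handling another node's leave: the departed
// node's entries are dropped outright, so no forged tombstone remains.
func (s store) purgeOwnedBy(owner string) {
	for k, e := range s {
		if e.owner == owner {
			delete(s, k)
		}
	}
}

// tombstoneOwn models the leaving node itself: it marks its own entries as
// deleted so the deletion can still reach peers that missed the leave.
// Using a fresh timestamp `now` is an assumption of this sketch.
func (s store) tombstoneOwn(self string, now uint64) {
	for k, e := range s {
		if e.owner == self {
			e.deleting = true
			e.ltime = now
			s[k] = e
		}
	}
}

func main() {
	// Peer side: B saw the create before the stale leave, as before.
	b := store{}
	live := entry{owner: "A", ltime: 5}
	b.apply("ep1", live)
	b.purgeOwnedBy("A")
	fmt.Println(b.apply("ep1", live)) // prints "true": bulk sync restores it

	// Owner side: A tombstones its own pre-leave entry.
	a := store{"ep0": {owner: "A", ltime: 3}}
	a.tombstoneOwn("A", 4)
	fmt.Println(a["ep0"].deleting, a["ep0"].ltime) // prints "true 4"
}
```

Because only the owner ever writes a tombstone in this model, two different versions of an entry never share a timestamp.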

- How to verify it
With a new unit test, TestLeaveRejoinOutOfOrder.

- Human readable description for the release notes

- Fix a bug in NetworkDB which would sometimes cause entries to get stuck deleted on some of the nodes, leading to connectivity issues between containers on overlay networks.


Signed-off-by: Cory Snider <csnider@mirantis.com>

Copilot

Pull Request Overview

Improves NetworkDB’s handling of node leaves by deleting non-owned entries instead of forging tombstones and adds a regression test for out-of-order leave/rejoin events.

- Switches deleteNodeNetworkEntries and deleteNodeTableEntries to remove remote-owned entries outright and only mark local entries for deletion.
- Updates LeaveNetwork to inline the new deletion logic, removing obsolete tombstone-forging behavior.
- Adds TestLeaveRejoinOutOfOrder to verify entries don’t get stuck deleted when leave/join events and table events arrive interleaved.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File	Description
tableevent_test.go	New regression test for leave/join/table-event order
networkdb.go	Overhauled entry-deletion logic on node/network leave
delegate.go	Simplified delegate to call updated deletion logic

Comments suppressed due to low confidence (3)

libnetwork/networkdb/networkdb.go:517

The doc comment should be updated to reflect that this now only deletes entries owned by the specified node, rather than handling local vs remote branches.

func (nDB *NetworkDB) deleteNodeNetworkEntries(nid, node string) {

libnetwork/networkdb/networkdb.go:540

[nitpick] The parameter node is ambiguous; consider renaming it to ownerNode or targetNode to clarify that it refers to the entry owner.

func (nDB *NetworkDB) deleteNodeTableEntries(node string) {

libnetwork/networkdb/tableevent_test.go:227

The new test covers remote entry creation out of order; consider adding a complementary test to verify that local entries are correctly tombstoned and propagated when the local node leaves and rejoins.

func TestLeaveRejoinOutOfOrder(t *testing.T) {

robmry

LGTM - not merging as it's maintainer spotlighted, maybe there's something more to discuss later today?

corhere added this to the 29.0.0 milestone Jul 7, 2025

corhere requested review from akerouanton, Copilot, robmry, thaJeztah and vvoland July 7, 2025 16:34

corhere added the area/networking, process/cherry-pick, area/swarm, kind/bugfix, area/networking/d/overlay, process/cherry-pick/25.0, process/cherry-pick/28.x and impact/changelog labels Jul 7, 2025

Copilot AI reviewed Jul 7, 2025

corhere added this to 🔦 Maintainer spotlight Jul 8, 2025

github-project-automation bot moved this to New in 🔦 Maintainer spotlight Jul 8, 2025

akerouanton approved these changes Jul 8, 2025

robmry approved these changes Jul 10, 2025

corhere removed this from 🔦 Maintainer spotlight Jul 10, 2025

corhere merged commit 0059929 into moby:master Jul 10, 2025
176 of 184 checks passed

corhere deleted the libn/fix-networkdb-tombstone-bug branch July 10, 2025 17:04

corhere mentioned this pull request Jul 25, 2025

[25.0] libnetwork/networkdb: backport all the fixes #50511

Merged

dmcgowan mentioned this pull request Aug 6, 2025

Prepare release notes for v2.0.0-alpha.0 #50651

Closed

vvoland removed the process/cherry-pick/28.x label Aug 13, 2025

dmcgowan mentioned this pull request Sep 5, 2025

Prepare release notes for v2.0.0-beta.0 #50918

Merged

YuryHrytsuk mentioned this pull request Feb 2, 2026

(Bulk) sync networks race condition leads to bulk sync timeouts (dns resolution is delayed) #51701

Open

corhere merged 1 commit into moby:master from corhere:libn/fix-networkdb-tombstone-bug