Skip to content

cmd/docker-proxy: UDP: fix race & aggressive GC#49649

Merged
akerouanton merged 3 commits intomoby:masterfrom
akerouanton:proxy-concurrent-write-close
Mar 18, 2025
Merged

cmd/docker-proxy: UDP: fix race & aggressive GC#49649
akerouanton merged 3 commits intomoby:masterfrom
akerouanton:proxy-concurrent-write-close

Conversation

@akerouanton
Copy link
Member

@akerouanton akerouanton commented Mar 17, 2025

- What I did

cmd/docker-proxy: UDP: thread-safe Write and Close

The UDP proxy used by cmd/docker-proxy is executing Write and Close in two separate goroutines, such that a Close could interrupt an in-flight Write.

Introduce a connTrackEntry that wraps a net.Conn and a sync.Mutex to ensure that Write and Close are serliazed.

cmd/docker-proxy: do not eagerly GC one-sided UDP conns

The UDP proxy is setting a deadline of 90 seconds when reading from the backend. If no data is received within this interval, it reclaims the connection.

This means, the backend would see a different connection every 90 seconds if the backend never sends back any reply to a client.

This change prevents the proxy from eagerly GC'ing such connections by taking into account the last time a datagram was proxyed to the backend.

- How I did it

- How to verify it

The race can't be reproduced easily. The 2nd fix has a unit test.

- Human readable description for the release notes

- Fix a bug that could cause `docker-proxy` to stop forwarding UDP datagrams to containers
- Fix a bug that was causing `docker-proxy` to close UDP connections to containers eagerly and resulting in the source address to change needlessly

Copy link
Contributor

@robmry robmry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good I think - just checking my understanding ...

}
cte.mu.Lock()
proxy.connTrackLock.Unlock()
cte.conn.SetWriteDeadline(time.Now().Add(UDPConnTrackTimeout))
Copy link
Contributor

@robmry robmry Mar 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the write deadline is new - is it meant to be in the first commit (fixing the write/close race)?

If it is, and there's a concern about a write blocking for such a long time, should the reply-loop's write have a deadline too? (Maybe I've missed the point?!)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before that change, if a Write was blocking for 90+ seconds, it'd be interrupted by the Close made by the replyLoop. Now that the Write and Close operations are serialized, there's no way Close can interrupt a blocking Write.

I'm pretty sure it's not needed, but I added this SetWriteDeadline out of caution to make sure we don't introduce a new failure mode. Obviously if a UDP write blocks for so long, there has to be something really bad going on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, ta.

// and [net.Conn.Close] operations.
type connTrackEntry struct {
conn *net.UDPConn
// Never lock mu without locking UDPPorxy.connTrackLock first.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits ...

Maybe worth explicitly saying the lock needs to be held for writes and close, but not for reads?

Typo Porxy, also serliazed in the commit comment.

for _, conn := range proxy.connTrackTable {
conn.Close()
for _, cte := range proxy.connTrackTable {
cte.conn.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this doesn't bother with cte.conn.mu - because if there's a write happening it's just a shutdown race - and the issue you're fixing is about packets getting dropped during normal running? (Maybe worth a comment to explain that?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and also updated Close's GoDoc.

Copy link
Contributor

@vvoland vvoland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// If the UDP connection is one-sided (i.e. the backend never sends
// replies), the connTrackEntry should not be GC'd until no writes
// happen for proxy.connTrackTimeout.
if errors.Is(err, os.ErrDeadlineExceeded) && cte.lastWrite() < proxy.connTrackTimeout {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

big nit: lastWrite sounds more like it would return a timestamp instead of duration.

Moving the time.Since out of the lastWrite would read a little bit nicer:

Suggested change
if errors.Is(err, os.ErrDeadlineExceeded) && cte.lastWrite() < proxy.connTrackTimeout {
if errors.Is(err, os.ErrDeadlineExceeded) && time.Since(cte.lastWrite()) < proxy.connTrackTimeout {

The UDP proxy used by cmd/docker-proxy is executing Write and Close in
two separate goroutines, such that a Close could interrupt an in-flight
Write.

Introduce a `connTrackEntry` that wraps a `net.Conn` and a `sync.Mutex`
to ensure that Write and Close are serialized.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The UDP proxy is setting a deadline of 90 seconds when reading from the
backend. If no data is received within this interval, it reclaims the
connection.

This means, the backend would see a different connection every 90
seconds if the backend never sends back any reply to a client.

This change prevents the proxy from eagerly GC'ing such connections by
taking into account the last time a datagram was proxyed to the backend.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
@akerouanton akerouanton force-pushed the proxy-concurrent-write-close branch from 3d35538 to 4276f33 Compare March 17, 2025 16:54
@vvoland vvoland added this to the 28.0.2 milestone Mar 17, 2025
@akerouanton akerouanton self-assigned this Mar 17, 2025
@akerouanton akerouanton merged commit 7cdd1b5 into moby:master Mar 18, 2025
149 checks passed
@akerouanton akerouanton deleted the proxy-concurrent-write-close branch March 18, 2025 07:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants