cmd/docker-proxy: UDP: fix race & aggressive GC#49649
cmd/docker-proxy: UDP: fix race & aggressive GC#49649akerouanton merged 3 commits intomoby:masterfrom
Conversation
80477b2 to
3d35538
Compare
robmry
left a comment
There was a problem hiding this comment.
Looks good I think - just checking my understanding ...
cmd/docker-proxy/udp_proxy_linux.go
Outdated
| } | ||
| cte.mu.Lock() | ||
| proxy.connTrackLock.Unlock() | ||
| cte.conn.SetWriteDeadline(time.Now().Add(UDPConnTrackTimeout)) |
There was a problem hiding this comment.
Looks like the write deadline is new - is it meant to be in the first commit (fixing the write/close race)?
If it is, and there's a concern about a write blocking for such a long time, should the reply-loop's write have a deadline too? (Maybe I've missed the point?!)
There was a problem hiding this comment.
Before that change, if a Write was blocking for 90+ seconds, it'd be interrupted by the Close made by the replyLoop. Now that the Write and Close operations are serialized, there's no way Close can interrupt a blocking Write.
I'm pretty sure it's not needed, but I added this SetWriteDeadline out of caution to make sure we don't introduce a new failure mode. Obviously if a UDP write blocks for so long, there has to be something really bad going on.
cmd/docker-proxy/udp_proxy_linux.go
Outdated
| // and [net.Conn.Close] operations. | ||
| type connTrackEntry struct { | ||
| conn *net.UDPConn | ||
| // Never lock mu without locking UDPPorxy.connTrackLock first. |
There was a problem hiding this comment.
nits ...
Maybe worth explicitly saying the lock needs to be held for writes and close, but not for reads?
Typo Porxy, also serliazed in the commit comment.
| for _, conn := range proxy.connTrackTable { | ||
| conn.Close() | ||
| for _, cte := range proxy.connTrackTable { | ||
| cte.conn.Close() |
There was a problem hiding this comment.
I guess this doesn't bother with cte.conn.mu - because if there's a write happening it's just a shutdown race - and the issue you're fixing is about packets getting dropped during normal running? (Maybe worth a comment to explain that?)
There was a problem hiding this comment.
Done, and also updated Close's GoDoc.
cmd/docker-proxy/udp_proxy_linux.go
Outdated
| // If the UDP connection is one-sided (i.e. the backend never sends | ||
| // replies), the connTrackEntry should not be GC'd until no writes | ||
| // happen for proxy.connTrackTimeout. | ||
| if errors.Is(err, os.ErrDeadlineExceeded) && cte.lastWrite() < proxy.connTrackTimeout { |
There was a problem hiding this comment.
big nit: lastWrite sounds more like it would return a timestamp instead of duration.
Moving the time.Since out of the lastWrite would read a little bit nicer:
| if errors.Is(err, os.ErrDeadlineExceeded) && cte.lastWrite() < proxy.connTrackTimeout { | |
| if errors.Is(err, os.ErrDeadlineExceeded) && time.Since(cte.lastWrite()) < proxy.connTrackTimeout { |
The UDP proxy used by cmd/docker-proxy is executing Write and Close in two separate goroutines, such that a Close could interrupt an in-flight Write. Introduce a `connTrackEntry` that wraps a `net.Conn` and a `sync.Mutex` to ensure that Write and Close are serialized. Signed-off-by: Albin Kerouanton <albinker@gmail.com>
Signed-off-by: Albin Kerouanton <albinker@gmail.com>
The UDP proxy is setting a deadline of 90 seconds when reading from the backend. If no data is received within this interval, it reclaims the connection. This means, the backend would see a different connection every 90 seconds if the backend never sends back any reply to a client. This change prevents the proxy from eagerly GC'ing such connections by taking into account the last time a datagram was proxyed to the backend. Signed-off-by: Albin Kerouanton <albinker@gmail.com>
3d35538 to
4276f33
Compare
- What I did
cmd/docker-proxy: UDP: thread-safe Write and Close
The UDP proxy used by cmd/docker-proxy is executing Write and Close in two separate goroutines, such that a Close could interrupt an in-flight Write.
Introduce a
connTrackEntrythat wraps anet.Connand async.Mutexto ensure that Write and Close are serliazed.cmd/docker-proxy: do not eagerly GC one-sided UDP conns
The UDP proxy is setting a deadline of 90 seconds when reading from the backend. If no data is received within this interval, it reclaims the connection.
This means, the backend would see a different connection every 90 seconds if the backend never sends back any reply to a client.
This change prevents the proxy from eagerly GC'ing such connections by taking into account the last time a datagram was proxyed to the backend.
- How I did it
- How to verify it
The race can't be reproduced easily. The 2nd fix has a unit test.
- Human readable description for the release notes