completion: fix memory leak in waitGroup #43953
Conversation
/test
func NewWaitGroup(ctx context.Context) (*WaitGroup, context.CancelFunc) {
	ctx2, cancel := context.WithCancel(ctx)
-	return &WaitGroup{ctx: ctx2, cancel: cancel}
+	return &WaitGroup{ctx: ctx2, cancel: cancel}, cancel
Another alternative I'll look into is effectively doing something like an atomic.Bool, but I'm not sure it's worth the rewrite, to be honest.
Since this will need to be backported, it is worth keeping the memory leak fix minimal and leaving any cleanups, clarifications, etc. for another PR against the main branch. On this note, it would be a bit less invasive to not add a second return parameter here, but to add a new Cancel() function instead, e.g.,
// Cancel releases resources held for the WaitGroup
func (wg *WaitGroup) Cancel() {
wg.cancel()
// Wait to clear completions, does not actually wait since we cancelled above
wg.Wait()
}
I'd also add an explicit wg.cancel() call to the end of Wait():
@@ -107,6 +119,7 @@ Loop:
}
}
wg.pendingCompletions = nil
+ wg.cancel()
return err
}
Then make use of the new Cancel() at prepareForProxyUpdates:
diff --git a/pkg/endpoint/regenerationcontext.go b/pkg/endpoint/regenerationcontext.go
index b90714d137..0bd1e6cef4 100644
--- a/pkg/endpoint/regenerationcontext.go
+++ b/pkg/endpoint/regenerationcontext.go
@@ -76,7 +76,11 @@ type datapathRegenerationContext struct {
func (ctx *datapathRegenerationContext) prepareForProxyUpdates(parentCtx context.Context) {
completionCtx, completionCancel := context.WithTimeout(parentCtx, EndpointGenerationTimeout)
- ctx.proxyWaitGroup = completion.NewWaitGroup(completionCtx)
+ wg := completion.NewWaitGroup(completionCtx)
+ ctx.proxyWaitGroup = wg
ctx.completionCtx = completionCtx
- ctx.completionCancel = completionCancel
+ ctx.completionCancel = func() {
+ completionCancel()
+ wg.Cancel()
+ }
}
I'm mostly worried about ^ since it somewhat hides that this has to be closed to avoid leaking memory, and people are not used to calling .Close() on a Go waitGroup. For a backport it's probably fine, though. I'm still not sure this will really be easier to backport, since we still need to be very careful not to leak memory or introduce correctness bugs. I can give it a stab in a separate PR and see how it works.
I'm not that familiar with this library, but couldn't we just call cancel when we finish waiting (i.e. after we iterate all the pending completions and set wg.pendingCompletions = nil) as an easy way to have a backportable fix?
Force-pushed 9598c37 to 7a764f7
/test

/test (1 similar comment)
Seems like mostly flakes.
/ci-l3-l4

/ci-clustermesh

Seems like #43949 broke the tests, so I'll have to rebase.

Got busy with other work this week, so I'll try to rebase and update the tests tomorrow!
This waitGroup implementation creates a new WithCancel context for each wait group, but never closes it. The actual behavior of the leak depends on the underlying context, but either way it is a leak, and it is especially bad during heavy churn. The number of bytes per second is tiny, but over hours and hours of uptime it can grow drastically. Since we use context.WithCancel for the context passed in here, we end up adding a separate map entry to that parent context for each iteration. For the usages where the parent context is cancelled this is fine, but that is not the case everywhere, hence the leak. I've manually inspected the 'children' field of the 'cancelCtx' struct used in 'UpdatePolicyMap' and can indeed see its length increasing over time.

Fixes: 551377f ("envoy: Make NACK cancel the WaitGroup")
Signed-off-by: Odin Ugedal <odin@ugedal.com>
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
The current API is a bit ambiguous around reuse of these wait groups. This commit tries to document the current behavior and create tests for it. As far as I can see there is no real reuse of these, so it should be mostly fine. We could change the API, but I think it's fine for now to keep it as-is, since there might be some reason for it to work like this.

Signed-off-by: Odin Ugedal <odin@ugedal.com>
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
Force-pushed 7a764f7 to 6f4ef74
/test
This does seem like a flake.

This also.
/ci-clustermesh

/ci-l7

/ci-kpr

Also cc @jrajahalme since you have been working on this stuff.

/ci-kpr
if err := mgr.waitForProxyCompletions(proxyWaitGroup); err != nil {
	mgr.logger.Warn("Failed to apply L7 proxy policy changes. These will be re-applied in future updates.", logfields.Error, err)
}
cancel()
If we add a cancel() call to the WaitGroup.Wait() it does not need to be added here.
wait, it is actually needed since waitForProxyCompletions() skips calling proxyWaitGroup.Wait() if proxyWaitGroup's context has errored:
diff --git a/pkg/endpoint/endpoint.go b/pkg/endpoint/endpoint.go
index 6b68a7cbc0..d08df9375b 100644
--- a/pkg/endpoint/endpoint.go
+++ b/pkg/endpoint/endpoint.go
@@ -614,6 +614,7 @@ func (e *Endpoint) waitForProxyCompletions(proxyWaitGroup *completion.WaitGroup)
err := proxyWaitGroup.Context().Err()
if err != nil {
+ proxyWaitGroup.Cancel()
return fmt.Errorf("context cancelled before waiting for proxy updates: %w", err)
}
diff --git a/pkg/endpointmanager/manager.go b/pkg/endpointmanager/manager.go
index d8625fa758..c14d967f1a 100644
--- a/pkg/endpointmanager/manager.go
+++ b/pkg/endpointmanager/manager.go
@@ -178,6 +178,7 @@ func (mgr *endpointManager) WithPeriodicEndpointRegeneration(ctx context.Context
func (mgr *endpointManager) waitForProxyCompletions(proxyWaitGroup *completion.WaitGroup) error {
err := proxyWaitGroup.Context().Err()
if err != nil {
+ proxyWaitGroup.Cancel()
return fmt.Errorf("context cancelled before waiting for proxy updates: %w", err)
}
> If we add a cancel() call to the WaitGroup.Wait() it does not need to be added here.

Yes, this is indeed the place where we see the most leaks.
> wait, it is actually needed since waitForProxyCompletions() skips calling proxyWaitGroup.Wait() if proxyWaitGroup's context has errored:

Yeah, that's why I don't fully like the idea of putting the cancel inside Wait(), since that would then require every single completion.WaitGroup to run Wait() (or separately Close()) to avoid leaking.
Adding the
Thanks for the comments @jrajahalme and @tommyp1ckles. Agree that a simpler approach is probably just as sane, and also much easier to backport. I opened #44328 so that we can do that first, and then potentially think about this solution later 😄 I'll just close this one out.
Initial fix merged in #44328 now. Waiting for that to be backported before looking at this again.
I'll close this in favor of #44731
See commit message for more info. This fixes a gnarly memory leak. I'm not sure this is the best way to fix it, and we need to look into the guarantees and write better docs, but it's a start. I also want to run the tests to see if anything depends on this in a different way.
I also partially bisected this down to where the WithCancel was introduced, and didn't spend more time digging into that. This most likely affects all versions of cilium, although in practice it mostly affects environments with a lot of churn.
I've also updated tests to "document" the existing behavior and how the new cancel works together with it.