[Filebeat] Fix reference leak in TCP and Unix socket inputs#19459
[Filebeat] Fix reference leak in TCP and Unix socket inputs#19459andrewkroh merged 4 commits intoelastic:masterfrom
Conversation
86c05b5 to
11d959c
Compare
|
Pinging @elastic/integrations-services (Team:Services) |
|
Good find. Please backport to 7.8 as well before FF. |
urso
left a comment
There was a problem hiding this comment.
LGTM.
I consider context and WaitGroup quite low level primitives. They require you to 'track' usage and even free resources (cancel functon MUST be called). We've got some helpers here: https://pkg.go.dev/github.com/elastic/go-concert@v0.0.3/ctxtool?tab=doc, and some kind of with cancelation here: https://pkg.go.dev/github.com/elastic/go-concert?tab=doc#OnceSignaler
The context in Listener could be replaced with OnceSignaler and the stop could be replaced handled via 'WithFunc'.
For higher level primitives managing ownership of groups of go-routines check out the Group interface here: https://pkg.go.dev/github.com/elastic/go-concert@v0.0.3/unison?tab=doc#Group
We've got a sample implementation with TaskGroup. If you keep StopOnError false and do not return an error the implementation perfectly fits this use-case. starting the handler routine would look like this:
l.handlerGroup.Go(func(canceler unison.Canceler) error {
ctx, cancel := ctxtool.WithFunc(ctxtool.FromCanceller(canceler), func() {
conn.Close()
})
defer cancel()
defer logp.Recover(...)
...
return nil
})
The group combines shutdown signaling and the WaitGroup when calling 'Stop'. Once 'Stop' has been initiated any attempt to start another handler via Go will return an error, because we are about to shutdown.
There was a problem hiding this comment.
The handler go-routine is started always. Let's move cancellation setup local to the that go-routine. Consider to use ctxtool.WithFunc.
There was a problem hiding this comment.
Nice, I updated it to use ctxtool.WithFunc.
The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map.
The inputsource/common.Closer was managing bidirectional links between parents and children. Anytime you closed an instance it would close all of its children and also remove itself from its parents list of children (this is where the bug was). Every instance has its own mutex. While recursively closing children it was easy to run into a deadlock because the parent holds a lock while closing its children and then the child must edit the parent to remove itself so it also tries to acquire the parent lock. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code.
96a4155 to
b524da0
Compare
|
fyi @vjsamuel this might affect you? |
|
Thanks. I will need to redo our internal input during the next rebase. Moving to context is the right thing to do however. |
…ne-beats * upstream/master: (105 commits) ci: enable packaging job (elastic#19536) ci: disable upstream trigger on PRs for the packaging job (elastic#19490) Implement memlog on-disk handling (elastic#19408) fix go.mod for PR elastic#19423 (elastic#19521) [MetricBeat] add param `aws_partition` to support aws-cn, aws-us-gov regions (elastic#19423) Input v2 stateless manager (elastic#19406) Input v2 compatibility layer (elastic#19401) [Elastic Agent] Fix artifact downloading to allow endpoint-security to be downloaded (elastic#19503) fix: ignore target changes on scans (elastic#19510) Add more helpers to pipeline/testing package (elastic#19405) Report dependencies in CSV format (elastic#19506) [Filebeat] Fix reference leak in TCP and Unix socket inputs (elastic#19459) Cursor input skeleton (elastic#19378) Add changelog. (elastic#19495) [DOC] Typo in Kerberos (elastic#19265) Remove accidentally commited unused NOTICE template (elastic#19485) [Elastic Agent] Support the install, control, and uninstall of Endpoint (elastic#19248) [Filebeat][httpjson] Add split_events_by config setting (elastic#19246) ci: disabling packaging job until we fix it (elastic#19481) Fix golang.org/x/tools to release1.13 (elastic#19478) ...
…19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 61f4846)
…19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 61f4846)
…19501) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 61f4846)
…nix socket inputs (#19502) * [Filebeat] Fix reference leak in TCP and Unix socket inputs (#19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 61f4846) * Update vendor and notice
…19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code.
…P and Unix socket inputs (elastic#19502) * [Filebeat] Fix reference leak in TCP and Unix socket inputs (elastic#19459) The tcp and unix input sources were leaking references causing a memory leak. When an accepted connection ended inputsource/common.Closer was supposed to delete the pointer that it held to the connection, but due to a code error `delete` was being called on the wrong map. Instead of modifying the common.Closer I replaced it with a cancellable context.Context which is designed to propagate signals from parent to children and requires less code. (cherry picked from commit 8eb8ed7) * Update vendor and notice
What does this PR do?
The tcp and unix input sources were leaking references causing a memory leak.
When an accepted connection ended inputsource/common.Closer was
supposed to delete the pointer that it held to the connection, but due to a code
error
deletewas being called on the wrong map.Why is it important?
This fixes a memory leak with the tcp and unix inputs.
Checklist
CHANGELOG.next.asciidocorCHANGELOG-developer.next.asciidoc.