Skip to content

Add Closer on Runner interface#5527

Merged
arkodg merged 8 commits intoenvoyproxy:mainfrom
mathetake:runnerclose
Mar 20, 2025
Merged

Add Closer on Runner interface#5527
arkodg merged 8 commits intoenvoyproxy:mainfrom
mathetake:runnerclose

Conversation

@mathetake
Copy link
Copy Markdown
Member

@mathetake mathetake commented Mar 17, 2025

What type of PR is this?

chore: add Close method interface on Runner interface

What this PR does / why we need it:

Previously, the runner "shudown" was not existent, so the main goroutine doesn't wait for the cleanup that is not synchronous to the context closure or channel. That includes virtually all the operations after the context cancelation running in non main goroutines. One notable example is that when running with the host provider, Envoy processes are spawned and managed from non-main goroutines, so without the "synchronous" closure hook on Runner, it is impossible to ensure the exit of Envoy processes. Without this fix, for example, standalone mode will leave all the Envoy processes as orphans after the CLI exits with SIGINT or SIGTERM.

As a side effect, this also fixes a bug in the host provider which is that its CreateOrUpdateProxyInfra blocks forever until the Envoy process is running. That is not the expected semantics of infra.Manager interface.

Release Notes: No

@mathetake
Copy link
Copy Markdown
Member Author

obviously i need to write a test

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 18, 2025

Codecov Report

Attention: Patch coverage is 39.02439% with 25 lines in your changes missing coverage. Please review.

Project coverage is 65.18%. Comparing base (7b8bed0) to head (0743eaa).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
internal/infrastructure/host/proxy_infra.go 55.17% 12 Missing and 1 partial ⚠️
internal/cmd/server.go 0.00% 4 Missing ⚠️
internal/infrastructure/runner/runner.go 0.00% 2 Missing ⚠️
internal/gatewayapi/runner/runner.go 0.00% 1 Missing ⚠️
internal/globalratelimit/runner/runner.go 0.00% 1 Missing ⚠️
internal/infrastructure/kubernetes/infra.go 0.00% 1 Missing ⚠️
internal/provider/runner/runner.go 0.00% 1 Missing ⚠️
internal/xds/server/runner/runner.go 0.00% 1 Missing ⚠️
internal/xds/translator/runner/runner.go 0.00% 1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (39.02%) is below the target coverage (60.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5527      +/-   ##
==========================================
- Coverage   65.19%   65.18%   -0.02%     
==========================================
  Files         213      213              
  Lines       34059    34080      +21     
==========================================
+ Hits        22204    22214      +10     
- Misses      10529    10538       +9     
- Partials     1326     1328       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
@mathetake mathetake marked this pull request as ready for review March 18, 2025 23:55
@mathetake mathetake requested a review from a team as a code owner March 18, 2025 23:55
@mathetake
Copy link
Copy Markdown
Member Author

ok ready

Comment on lines +254 to +258
for _, r := range runners {
if err := r.runner.Close(); err != nil {
cfg.Logger.Error(err, "failed to close runner", "name", r.runner.Name())
}
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the main change on the main routine besides the host infra closure

@mathetake
Copy link
Copy Markdown
Member Author

/retest

@mathetake
Copy link
Copy Markdown
Member Author

oh man the command exists here as well!

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Comment on lines +99 to +101
// This is ad-hoc sleep to wait for the Envoy process to start up to avoid canceling the context too early,
// which also results in the Envoy process being orphaned.
time.Sleep(1 * time.Second)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this seems an issue of func-e, not here and should be fixed there

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
@mathetake
Copy link
Copy Markdown
Member Author

lol we encountered a race in func-e

==================
WARNING: DATA RACE
Read at 0x00c000580218 by goroutine 251:
  os.(*ProcessState).ExitCode()
      /opt/hostedtoolcache/go/1.24.1/x64/src/os/exec_posix.go:145 +0xbf2
  github.com/tetratelabs/func-e/internal/envoy.(*Runtime).Run()
      /home/runner/go/pkg/mod/github.com/tetratelabs/func-e@v1.1.5-0.20250318224823-b69d1c096c7a/internal/envoy/run.go:95 +0xbb4
  github.com/tetratelabs/func-e/internal/cmd.NewRunCmd.func2()
      /home/runner/go/pkg/mod/github.com/tetratelabs/func-e@v1.1.5-0.20250318224823-b69d1c096c7a/internal/cmd/run.go:87 +0x129d
  github.com/urfave/cli/v2.(*Command).Run()
      /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.27.6/command.go:276 +0x15a1
  github.com/urfave/cli/v2.(*Command).Run()
      /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.27.6/command.go:269 +0x1464
  github.com/urfave/cli/v2.(*App).RunContext()
      /home/runner/go/pkg/mod/github.com/urfave/cli/v2@v2.27.6/app.go:333 +0x1228
  github.com/tetratelabs/func-e/api.Run.func1()
      /home/runner/go/pkg/mod/github.com/tetratelabs/func-e@v1.1.5-0.20250318224823-b69d1c096c7a/api/run.go:115 +0x84

Previous write at 0x00c000580218 by goroutine 256:
  os.(*Process).pidfdWait()
      /opt/hostedtoolcache/go/1.24.1/x64/src/os/pidfd_linux.go:116 +0x5aa
  os.(*Process).wait()
      /opt/hostedtoolcache/go/1.24.1/x64/src/os/exec_unix.go:27 +0x4d
  os.(*Process).Wait()
      /opt/hostedtoolcache/go/1.24.1/x64/src/os/exec.go:358 +0xaf
  os/exec.(*Cmd).Wait()
      /opt/hostedtoolcache/go/1.24.1/x64/src/os/exec/exec.go:922 +0x8f
  github.com/tetratelabs/func-e/internal/envoy.(*Runtime).Run.func1()
      /home/runner/go/pkg/mod/github.com/tetratelabs/func-e@v1.1.5-0.20250318224823-b69d1c096c7a/internal/envoy/run.go:64 +0x6a

@mathetake mathetake marked this pull request as draft March 19, 2025 02:00
@mathetake
Copy link
Copy Markdown
Member Author

marking as a draft for now

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
go.mod Outdated
github.com/stretchr/testify v1.10.0
github.com/telepresenceio/watchable v0.0.0-20220726211108-9bb86f92afa7
github.com/tetratelabs/func-e v1.1.5-0.20240822223546-c85a098d5bf0
github.com/tetratelabs/func-e v1.1.5-0.20250319205758-07e93e08850f
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is needed to have tetratelabs/func-e#458

@mathetake
Copy link
Copy Markdown
Member Author

great, passing with the latest func-e tetratelabs/func-e@ca8702e

@mathetake mathetake marked this pull request as ready for review March 19, 2025 21:16
Copy link
Copy Markdown
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks !

@arkodg arkodg requested review from a team and shawnh2 March 20, 2025 00:04
@arkodg arkodg merged commit 896ca89 into envoyproxy:main Mar 20, 2025
23 of 25 checks passed
@mathetake mathetake deleted the runnerclose branch March 20, 2025 00:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants