fix: race in gatewaapi runner#8037
Conversation
✅ Deploy Preview for cerulean-figolla-1f9435 canceled.
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8037 +/- ##
==========================================
+ Coverage 73.70% 73.75% +0.04%
==========================================
Files 237 237
Lines 35703 35709 +6
==========================================
+ Hits 26316 26338 +22
+ Misses 7529 7515 -14
+ Partials 1858 1856 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
internal/gatewayapi/runner/runner.go
Outdated
| r.Logger = r.Logger.WithName(r.Name()).WithValues("runner", r.Name()) | ||
|
|
||
| go r.startWasmCache(ctx) | ||
| r.done.Add(2) |
There was a problem hiding this comment.
can you add a comment here, outlining why 2 is needed here
0314fcc to
9535661
Compare
| keyCache *KeyCache | ||
|
|
||
| // Goroutine synchronization | ||
| done sync.WaitGroup |
There was a problem hiding this comment.
nit: name it as wg? done.Done() was confusing at first glance.
internal/gatewayapi/runner/runner.go
Outdated
| go r.startWasmCache(ctx) | ||
| // Add 2 to the WaitGroup: one for the WASM cache server goroutine and one for the | ||
| // subscribeAndTranslate goroutine that handles resource translation | ||
| r.done.Add(2) |
There was a problem hiding this comment.
Would we want to add 2 right away or Add(1) just before calling each routine? We wouldn't want to wait on routines that might not have started for some reason. Also any new routine that we might add here will follow the same pattern.
// Increment by 1 specifically for the WASM cache
r.done.Add(1)
go func() {
defer r.done.Done()
r.startWasmCache(ctx)
}()
// If Subscribe crashes or returns an error, the WaitGroup
// won't be stuck waiting for a goroutine that never started.
c := r.ProviderResources.GatewayAPIResources.Subscribe(ctx)
// Increment by 1 specifically for the translation handler
r.done.Add(1)
go func() {
defer r.done.Done()
r.subscribeAndTranslate(c)
}()
There was a problem hiding this comment.
I have no strong opinion on this.
There was a problem hiding this comment.
+1 to keeping the Add close to the go func()
There was a problem hiding this comment.
we could simply it with r.done.Go()
| // t.Output() while goroutines are still active. | ||
| // | ||
| // Run with: go test -race -run TestRunnerGoroutineRace -count=100 ./internal/cmd/ | ||
| func TestRunnerGoroutineRace(t *testing.T) { |
There was a problem hiding this comment.
I see another test in runner_race_test, do we need this as well? Which one would we need to detect a race if someone spawns another routine without WG?
There was a problem hiding this comment.
this's in case we have another one unkonwn race, TBH it's hard to reproduce here.
1f73f66 to
80a7096
Compare
|
/retest |
80a7096 to
6ea93a3
Compare
6ea93a3 to
2d84080
Compare
* add testcase Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * simply Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com>
* add testcase Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * simply Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
* fix: extproc is discarded with failOpen is enabled for wasm (#7956) * fix: extproc is discarded with failOpen is enabled for wasm Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * add test Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * polish code Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * add test Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: sanitize control plane config dump (#7901) * mask secrets Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * address comments Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: server run race (#7964) * add test Signed-off-by: zirain <zirain2009@gmail.com> * fix race Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * use Semaphore instead of WaitGroup Signed-off-by: zirain <zirain2009@gmail.com> * comments Signed-off-by: zirain <zirain2009@gmail.com> * lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * callback Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * run hook sequentially Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * rename to cfgMux Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: wrong cluster type with mixed FQDN backend and service backend refs (#7994) * fix: wrong cluster type with mixed FQDN backend and service backend refs Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * fix mirror cluster endpoint type Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * simplify the test Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * update comment Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: fail fast when unrecoverable discovery errors happens on checking optional CRDs (#7872) * fail fast when unrecoverable discovery errors happens Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * only retry transient errors Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * fix potenial dead lock Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * address comments Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * minor wording Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * create discovery client once Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * fix lint Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * address comments Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * remove redundant logging Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * add e2e test Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * fix test Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com> * fix test Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: merge route match rule with match all route (#8011) Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: do not set autoHTTPConfig when used mixed(HTTP + HTTPS) backends (#7950) * fix: do not set autoHTTPConfig when used mixed backend Signed-off-by: zirain <zirain2009@gmail.com> * release notes Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * add e2e Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: backend tls default namespace (#7987) Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: race in gatewaapi runner (#8037) * add testcase Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * simply Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * [release/v1.6] v1.6.3 release notes (#8054) Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * v1.6.3 version Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix gen-check Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix lint Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> Signed-off-by: zirain <zirain2009@gmail.com> Co-authored-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Co-authored-by: zirain <zirain2009@gmail.com>
* add testcase Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * simply Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Sadmi Bouhafs <sadmibouhafs@gmail.com>
fixes: #8035