Conversation
✅ Deploy Preview for cerulean-figolla-1f9435 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
failed as excepted: |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #7964 +/- ##
==========================================
+ Coverage 72.85% 73.53% +0.68%
==========================================
Files 237 237
Lines 35623 35631 +8
==========================================
+ Hits 25952 26203 +251
+ Misses 7824 7567 -257
- Partials 1847 1861 +14 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
nice! |
|
can we enable the -race flag before releasing this |
|
Hi @zirain is it easy to hit this? |
the testcase is reproducible. |
it's enabled, the problem is lack of test case. |
9699599 to
6483c24
Compare
internal/cmd/server.go
Outdated
| }, | ||
| ) | ||
| return server(cmd.Context(), cmd.OutOrStdout(), cmd.ErrOrStderr(), runnerErrors) | ||
| started := &atomic.Bool{} |
There was a problem hiding this comment.
It doesn't look like this is being read or used anywhere, can this be removed?
There was a problem hiding this comment.
this's useful to allow us konw when the server is ready and when to trigger a config update.
we can update this to a callback.
There was a problem hiding this comment.
Could we structure the tests differently if that's the only place it's being used?
There was a problem hiding this comment.
Only a minor nit so non-blocking
a5fad1d to
de06900
Compare
|
this is the 3rd fix in this area, can we take a step back and redesign what server/runner initialization, handling partial failures, and config restarts should look like ? |
Signed-off-by: zirain <zirain2009@gmail.com>
the main reason of the regssion is lack of reason.
I think the new test case cover them. |
|
can you elaborate on what the logic is / should be for
|
AIGW required #7880 fix the leak, but there's a race when using WaitGroup with multi goroutinues. This PR use Update above to PR description. |
I agree with the sentiment here, but I also think that given we're close to the desired date for 1.7 release, and the fact that this should be ideally cherry-picked to previous versions, it is convenient to merge the fix (it's safer to cherry-pick a small fix than a refactoring), unblock the projects that depend on this, and then plan and work for a proper redesign. |
|
|
||
| // TODO: we need to make sure that all runners are stopped, before we start the new ones | ||
| // Otherwise we might end up with error listening on:8081 | ||
| time.Sleep(3 * time.Second) |
There was a problem hiding this comment.
Should we check whether the ports used by the goaway runners are available before starting new runners?
There was a problem hiding this comment.
we didn't need if we run sequentially.
Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: zirain <zirain2009@gmail.com>
|
/retest |
* add test Signed-off-by: zirain <zirain2009@gmail.com> * fix race Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * use Semaphore instead of WaitGroup Signed-off-by: zirain <zirain2009@gmail.com> * comments Signed-off-by: zirain <zirain2009@gmail.com> * lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * callback Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * run hook sequentially Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * rename to cfgMux Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com>
* add test Signed-off-by: zirain <zirain2009@gmail.com> * fix race Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * use Semaphore instead of WaitGroup Signed-off-by: zirain <zirain2009@gmail.com> * comments Signed-off-by: zirain <zirain2009@gmail.com> * lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * callback Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * run hook sequentially Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * rename to cfgMux Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
* fix: fail fast when unrecoverable discovery errors happens on checking optional CRDs (#7872) * fail fast when unrecoverable discovery errors happens Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * only retry transient errors Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * fix potenial dead lock Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * address comments Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * minor wording Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * create discovery client once Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * fix lint Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * address comments Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * remove redundant logging Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * add e2e test Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * fix test Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com> * fix test Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * fix: extproc is discarded with failOpen is enabled for wasm (#7956) * fix: extproc is discarded with failOpen is enabled for wasm Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * add test Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * polish code Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * add test Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * fix: sanitize control plane config dump (#7901) * mask secrets Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * address comments Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * fix: server run race (#7964) * add test Signed-off-by: zirain <zirain2009@gmail.com> * fix race Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * use Semaphore instead of WaitGroup Signed-off-by: zirain <zirain2009@gmail.com> * comments Signed-off-by: zirain <zirain2009@gmail.com> * lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * callback Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * run hook sequentially Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * rename to cfgMux Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> * fix: wrong cluster type with mixed FQDN backend and service backend refs (#7994) * fix: wrong cluster type with mixed FQDN backend and service backend refs Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * fix mirror cluster endpoint type Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * simplify the test Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * update comment Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * fix: merge route match rule with match all route (#8011) Signed-off-by: zirain <zirain2009@gmail.com> * fix gen Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * fix for golang 11.24 Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * fix watch CRD version Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: zirain <zirain2009@gmail.com> Co-authored-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
* fix: extproc is discarded with failOpen is enabled for wasm (#7956) * fix: extproc is discarded with failOpen is enabled for wasm Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * add test Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * polish code Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * add test Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: sanitize control plane config dump (#7901) * mask secrets Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * address comments Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: server run race (#7964) * add test Signed-off-by: zirain <zirain2009@gmail.com> * fix race Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * use Semaphore instead of WaitGroup Signed-off-by: zirain <zirain2009@gmail.com> * comments Signed-off-by: zirain <zirain2009@gmail.com> * lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * callback Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * run hook sequentially Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * rename to cfgMux Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: wrong cluster type with mixed FQDN backend and service backend refs (#7994) * fix: wrong cluster type with mixed FQDN backend and service backend refs Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * fix mirror cluster endpoint type Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * simplify the test Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> * update comment Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: fail fast when unrecoverable discovery errors happens on checking optional CRDs (#7872) * fail fast when unrecoverable discovery errors happens Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * only retry transient errors Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * fix potenial dead lock Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * address comments Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * minor wording Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * create discovery client once Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * fix lint Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * address comments Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * remove redundant logging Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * add e2e test Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com> * fix test Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com> * fix test Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: merge route match rule with match all route (#8011) Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: do not set autoHTTPConfig when used mixed(HTTP + HTTPS) backends (#7950) * fix: do not set autoHTTPConfig when used mixed backend Signed-off-by: zirain <zirain2009@gmail.com> * release notes Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * add e2e Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: backend tls default namespace (#7987) Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix: race in gatewaapi runner (#8037) * add testcase Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * simply Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * [release/v1.6] v1.6.3 release notes (#8054) Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * v1.6.3 version Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix gen-check Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> * fix lint Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> --------- Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com> Signed-off-by: zirain <zirain2009@gmail.com> Co-authored-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com> Co-authored-by: zirain <zirain2009@gmail.com>
* add test Signed-off-by: zirain <zirain2009@gmail.com> * fix race Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * use Semaphore instead of WaitGroup Signed-off-by: zirain <zirain2009@gmail.com> * comments Signed-off-by: zirain <zirain2009@gmail.com> * lint Signed-off-by: zirain <zirain2009@gmail.com> * fix Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * callback Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * run hook sequentially Signed-off-by: zirain <zirain2009@gmail.com> * fix lint Signed-off-by: zirain <zirain2009@gmail.com> * rename to cfgMux Signed-off-by: zirain <zirain2009@gmail.com> --------- Signed-off-by: zirain <zirain2009@gmail.com> Signed-off-by: Sadmi Bouhafs <sadmibouhafs@gmail.com>
xref: envoyproxy/ai-gateway#1770 (comment) #7880
AIGW required
Wait for the runners to close, see #5560.There's a problem is that
runnersDoneis blocked untilctx.Done(), which cause goroutine leak when config reloading.#7880 fix the leak, but there's a race when using WaitGroup with multi goroutinues.
This PR use
semaphorewith weight 2, to handle all of these cases. seeLoader.Wait().