Skip to content

fix: skip provision when IR Infra is invalid#7754

Merged
zirain merged 9 commits intoenvoyproxy:mainfrom
zirain:fix/7735
Feb 4, 2026
Merged

fix: skip provision when IR Infra is invalid#7754
zirain merged 9 commits intoenvoyproxy:mainfrom
zirain:fix/7735

Conversation

@zirain
Copy link
Copy Markdown
Member

@zirain zirain commented Dec 17, 2025

fixes: #7735

This patch follow up #7009, contains two fixes:

  1. Gateway with valid EnvoyPorxy and GatewayClass with invalid EnvoyProxy should be provisioned
  2. Failed gateways should not trigger IR deletion for Setting invalid values in EnvoyProxy deletes Deployment all together #7735

@zirain zirain marked this pull request as ready for review December 17, 2025 03:27
@zirain zirain requested a review from a team as a code owner December 17, 2025 03:27
@codecov
Copy link
Copy Markdown

codecov bot commented Dec 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.71%. Comparing base (79af9fe) to head (422b0d6).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7754      +/-   ##
==========================================
+ Coverage   73.67%   73.71%   +0.03%     
==========================================
  Files         241      241              
  Lines       36561    36569       +8     
==========================================
+ Hits        26937    26957      +20     
+ Misses       7712     7704       -8     
+ Partials     1912     1908       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

// For https://github.com/envoyproxy/gateway/issues/7735, if a gateway/gatewayclass
// referenced an invalid EnvoyProxy, it's part of failed gateways, we shouldn't delete
// the IR.
for _, gtw := range result.Gateways {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feel like a workaround, can we address this in a better way in the translator to populate Status w/o trckling down error to downstream callers

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about add a invalid/valid into Infra?

type Infra struct {
	Invalid *bool
	// Proxy defines managed proxy infrastructure.
	Proxy *ProxyInfra `json:"proxy" yaml:"proxy"`
}

@github-actions
Copy link
Copy Markdown
Contributor

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. Please feel free to give a status update now, ping for review, when it's ready. Thank you for your contributions!

@netlify
Copy link
Copy Markdown

netlify bot commented Jan 20, 2026

Deploy Preview for cerulean-figolla-1f9435 canceled.

Name Link
🔨 Latest commit 422b0d6
🔍 Latest deploy log https://app.netlify.com/projects/cerulean-figolla-1f9435/deploys/6982dbb68258be0008aa23ff

@zirain zirain changed the title fix: do not trigger IR deletion when EnvoyProxy is invalid fix: skip provision when IR Infra is invalid Jan 20, 2026
@zirain zirain requested a review from arkodg January 20, 2026 11:10
@zirain zirain force-pushed the fix/7735 branch 2 times, most recently from 03445f7 to 93c06e9 Compare January 20, 2026 12:31
}
} else {
// Manage the proxy infra.
// Skip creating or updating infra if the Infra IR is invalid.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trying to better understand why this is needed

  • a failedGateway in the gateway API layer impacts Status of the Gateway
  • if we fail, we dont insert any invalid values in the IR
  • we have validations in the infra layer deal with errors

we are trying to allow partial valid IRs to go through

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we dont insert any invalid values in the IR

if we do this, there will be a delete event in Infra layer.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you elaborate here

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#7735 (comment) showed the delete event.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats because it most likely deleted the infra IR instead of creating an Infra IR with unset stat val

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's what I did in internal/gatewayapi/translator.go L474.

@arkodg arkodg added this to the v1.7.0 Release milestone Jan 28, 2026
type Infra struct {
// Invalid indicates whether the IR is invalid.
// This is an optional bool so that we won't update all existing files.
Invalid *bool `json:"invalid,omitempty" yaml:"invalid,omitempty"`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this needed anymore ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@zirain zirain force-pushed the fix/7735 branch 2 times, most recently from 8fceb39 to e5af1ce Compare January 30, 2026 05:37

// InitIRs checks if mergeGateways is enabled in EnvoyProxy config and initializes XdsIR and InfraIR maps with adequate keys.
func (t *Translator) InitIRs(gateways []*GatewayContext) (map[string]*ir.Xds, map[string]*ir.Infra) {
func (t *Translator) InitIRs(acceptedGateways, failedGateways []*GatewayContext) (map[string]*ir.Xds, map[string]*ir.Infra) {
Copy link
Copy Markdown
Member

@zhaohuabing zhaohuabing Feb 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the two loops inside this method are exact the same and can be a bit confusing when reading the code(I expected there are different behaviors when I reviewed the code at the first time).

It can be simplified as:

func (t *Translator) InitIRs(gateways []*GatewayContext) (map[string]*ir.Xds, map[string]*ir.Infra) {

with one loop.

Also, it would be good to add a line of comment to the caller to explain why the Infra IRs of failed Gateways also need to be created.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we shouldn't merge them, in most of the cases, we just need acceptedGateways.
failedGateways will be removed in the next iteration.

zhaohuabing
zhaohuabing previously approved these changes Feb 3, 2026
Copy link
Copy Markdown
Member

@zhaohuabing zhaohuabing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: zirain <zirain2009@gmail.com>
}
gCtx.attachEnvoyProxy(resources, envoyproxyMap)

if gateway.Spec.GatewayClassName != t.GatewayClassName {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this need to be moved down

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move back.

port: 8080
protocol: HTTP
status:
conditions:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow up, should this be Accepted:True and ResolvedRefs:False ?
cc @zhaohuabing

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use RefNotPermitted here?

	// This condition indicates whether the controller was able to resolve all
	// the object references for the Gateway that are not part of a specific
	// Listener configuration, and also provides a positive-polarity summary of
	// Listener's "ResolvedRefs" condition. This condition does not directly
	// impact the Gateway's Accepted or Programmed conditions.
	//
	// Possible reasons for this condition to be True are:
	//
	// * "ResolvedRefs"
	//
	// Possible reasons for this condition to be False are:
	//
	// * "RefNotPermitted"
	// * "InvalidClientCertificateRef"
	// * "ListenersNotResolved"
	//
	// Controllers may raise this condition with other reasons, but should
	// prefer to use the reasons listed above to improve interoperability.
	//
	// Note: This condition is considered Experimental and may change in future
	// releases of the API.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably ListenersNotResolved, from https://github.com/kubernetes-sigs/gateway-api/blob/8ecfe98081f74aec9eb8c44ef265380e2edb9094/apis/v1/gateway_types.go#L1323
RefNotPermitted seems more scoped to ReferenceGrant issues

Signed-off-by: zirain <zirain2009@gmail.com>
@zirain
Copy link
Copy Markdown
Member Author

zirain commented Feb 4, 2026

/retest

@zirain zirain requested a review from zhaohuabing February 4, 2026 08:13
Copy link
Copy Markdown
Member

@cnvergence cnvergence left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@zirain zirain merged commit 93da4ca into envoyproxy:main Feb 4, 2026
78 of 82 checks passed
@zirain zirain deleted the fix/7735 branch February 4, 2026 09:58
cnvergence pushed a commit to cnvergence/gateway that referenced this pull request Feb 5, 2026
* fix: do not trigger IR deletion when EnvoyProxy is invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add Invalid to ir.Infra

Signed-off-by: zirain <zirain2009@gmail.com>

* fix gen

Signed-off-by: zirain <zirain2009@gmail.com>

* add e2e

Signed-off-by: zirain <zirain2009@gmail.com>

* remove invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add comments

Signed-off-by: zirain <zirain2009@gmail.com>

* update

Signed-off-by: zirain <zirain2009@gmail.com>

* merge loop

Signed-off-by: zirain <zirain2009@gmail.com>

* move back

Signed-off-by: zirain <zirain2009@gmail.com>

---------

Signed-off-by: zirain <zirain2009@gmail.com>
cnvergence pushed a commit to cnvergence/gateway that referenced this pull request Feb 5, 2026
* fix: do not trigger IR deletion when EnvoyProxy is invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add Invalid to ir.Infra

Signed-off-by: zirain <zirain2009@gmail.com>

* fix gen

Signed-off-by: zirain <zirain2009@gmail.com>

* add e2e

Signed-off-by: zirain <zirain2009@gmail.com>

* remove invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add comments

Signed-off-by: zirain <zirain2009@gmail.com>

* update

Signed-off-by: zirain <zirain2009@gmail.com>

* merge loop

Signed-off-by: zirain <zirain2009@gmail.com>

* move back

Signed-off-by: zirain <zirain2009@gmail.com>

---------

Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>
cnvergence pushed a commit to cnvergence/gateway that referenced this pull request Feb 5, 2026
* fix: do not trigger IR deletion when EnvoyProxy is invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add Invalid to ir.Infra

Signed-off-by: zirain <zirain2009@gmail.com>

* fix gen

Signed-off-by: zirain <zirain2009@gmail.com>

* add e2e

Signed-off-by: zirain <zirain2009@gmail.com>

* remove invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add comments

Signed-off-by: zirain <zirain2009@gmail.com>

* update

Signed-off-by: zirain <zirain2009@gmail.com>

* merge loop

Signed-off-by: zirain <zirain2009@gmail.com>

* move back

Signed-off-by: zirain <zirain2009@gmail.com>

---------

Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>
cnvergence added a commit that referenced this pull request Feb 5, 2026
* chore(docs): Update Azure Entra link in OIDC guide (#8167)

Update Azure Entra link in OIDC guide

Signed-off-by: Guy Daich <guy.daich@sap.com>

* fix: continue processing the remaining xDS with invalid EnvoyPatchPolicies (#8153)

continue processing the remaining xDS with invalid EnvoyPatchPolicies

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>

* build(deps): bump the actions group across 1 directory with 2 updates (#8178)

Bumps the actions group with 2 updates in the / directory: [docker/login-action](https://github.com/docker/login-action) and [github/codeql-action](https://github.com/github/codeql-action).

Updates `docker/login-action` from 3.6.0 to 3.7.0
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](docker/login-action@5e57cd1...c94ce9f)

Updates `github/codeql-action` from 4.32.0 to 4.32.1
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@b20883b...6bc82e0)

---
updated-dependencies:
- dependency-name: docker/login-action
  dependency-version: 3.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: actions
- dependency-name: github/codeql-action
  dependency-version: 4.32.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Isaac Wilson <10012479+jukie@users.noreply.github.com>
Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>

* fix: skip provision when IR Infra is invalid (#7754)

* fix: do not trigger IR deletion when EnvoyProxy is invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add Invalid to ir.Infra

Signed-off-by: zirain <zirain2009@gmail.com>

* fix gen

Signed-off-by: zirain <zirain2009@gmail.com>

* add e2e

Signed-off-by: zirain <zirain2009@gmail.com>

* remove invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add comments

Signed-off-by: zirain <zirain2009@gmail.com>

* update

Signed-off-by: zirain <zirain2009@gmail.com>

* merge loop

Signed-off-by: zirain <zirain2009@gmail.com>

* move back

Signed-off-by: zirain <zirain2009@gmail.com>

---------

Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>

* docs: add HTTP header and method based authentication task (#7990)

* docs: add HTTP header and method based authentication task

Signed-off-by: Aditya Sanskar Srivastav <161202916+Aditya7880900936@users.noreply.github.com>

* docs: replace api-key examples with user header

Signed-off-by: Aditya Sanskar Srivastav <161202916+Aditya7880900936@users.noreply.github.com>

* docs: format header and method authentication examples

Signed-off-by: Aditya Sanskar Srivastav <161202916+Aditya7880900936@users.noreply.github.com>

* docs: add header and method based authorization examples

Signed-off-by: Aditya Sanskar Srivastav <161202916+Aditya7880900936@users.noreply.github.com>

---------

Signed-off-by: Aditya Sanskar Srivastav <161202916+Aditya7880900936@users.noreply.github.com>
Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>

* fix: Validation of XListenerSet certificateRefs (#8168)

Previously, validateTerminateModeAndGetTLSSecrets would always use the
namespace of the listener's gateway when verifying a cross-namespace
ref.

This meant that if the listener were from an XListenerSet, whether or
not the Secret associated with the certificateRef was in the same
namespace as the XListenerSet, it would not be permitted.

Additionally, and relatedly, this fixes an issue where an XListenerSet
could reference a Secret in the gateway's namespace without a
ReferenceGrant being present.

With this change we add a new GetNamespace() method to
gatewayapi.ListenerContext which returns the listener's gateway's
namespace for a listener added directly to the gateway, or the
XListenerSet's namespace otherwise. This is similar to some of the other
methods that were added to ListenerContext in support of XListenerSets.

The new method is used when creating the `crossNamespaceFrom` to
determine if the certificateRef is permitted. If the Secret and
XListenerSet are in the same namespace, it is permitted. If that is not
the case a ReferenceGrant from the XListenerSet to the Secret will be
properly searched for.

Signed-off-by: krishicks <kris@krishicks.com>
Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>

* fix: Remove whitespace for nodeSelector in deployment YAML - helm chart change (#8185)

Remove whitespace for nodeSelector in deployment YAML

Signed-off-by: Jess Belliveau <jess.belliveau@gmail.com>
Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>

* [release/v1.7.0] release notes (#8188)

Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>

---------

Signed-off-by: Guy Daich <guy.daich@sap.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Karol Szwaj <karol.szwaj@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: Aditya Sanskar Srivastav <161202916+Aditya7880900936@users.noreply.github.com>
Signed-off-by: krishicks <kris@krishicks.com>
Signed-off-by: Jess Belliveau <jess.belliveau@gmail.com>
Co-authored-by: Guy Daich <guy.daich@sap.com>
Co-authored-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Isaac Wilson <10012479+jukie@users.noreply.github.com>
Co-authored-by: zirain <zirain2009@gmail.com>
Co-authored-by: Aditya Sanskar Srivastav <161202916+Aditya7880900936@users.noreply.github.com>
Co-authored-by: krishicks <kris@krishicks.com>
Co-authored-by: Jess Belliveau <jess.belliveau@gmail.com>
Inode1 pushed a commit to Inode1/gateway that referenced this pull request Feb 23, 2026
* fix: do not trigger IR deletion when EnvoyProxy is invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add Invalid to ir.Infra

Signed-off-by: zirain <zirain2009@gmail.com>

* fix gen

Signed-off-by: zirain <zirain2009@gmail.com>

* add e2e

Signed-off-by: zirain <zirain2009@gmail.com>

* remove invalid

Signed-off-by: zirain <zirain2009@gmail.com>

* add comments

Signed-off-by: zirain <zirain2009@gmail.com>

* update

Signed-off-by: zirain <zirain2009@gmail.com>

* merge loop

Signed-off-by: zirain <zirain2009@gmail.com>

* move back

Signed-off-by: zirain <zirain2009@gmail.com>

---------

Signed-off-by: zirain <zirain2009@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Setting invalid values in EnvoyProxy deletes Deployment all together

4 participants