Skip to content

chore: improve HTTPRoute status errors#5747

Merged
zhaohuabing merged 12 commits intoenvoyproxy:mainfrom
zhaohuabing:status-errors
Apr 24, 2025
Merged

chore: improve HTTPRoute status errors#5747
zhaohuabing merged 12 commits intoenvoyproxy:mainfrom
zhaohuabing:status-errors

Conversation

@zhaohuabing
Copy link
Copy Markdown
Member

@zhaohuabing zhaohuabing commented Apr 17, 2025

This PR includes the following improvements:

  • Fixes tests inputs in the Gateway API translator tests. Some tests were missing services and endpointSlices resources, which led to 500 errors in the output files. This PR adds the missing services and endpointSlices to ensure the expected behavior.
  • Improve the HTTPRoute status: Translation errors of all the HTTPRouteRules are now surfaced to the HTTPRoute status. This makes it easier for users to find out which parts of their config are invalid.
  • Consolidate all the HTTPRoute translation error handling in a single location (processHTTPRouteParentRefs) rather than updating status throughout the code. This simplifies the HTTPRoute translation logic and allows us to aggregate all errors into a single status message for better clarity. This is not totally done yet, will clean them up in a follow-up PR after handling TCP/TLS/UDP route statuses.

To keep the scope managed and make this easier to review, this PR only modifies the HTTPRoute. If this makes sense, I'll raise a follow-up PR to also clean up the status for other routes.

@zhaohuabing zhaohuabing requested a review from a team as a code owner April 17, 2025 00:54
@zhaohuabing zhaohuabing marked this pull request as draft April 17, 2025 00:54
@zhaohuabing zhaohuabing changed the title Status errors chore: improve HTTPRoute status errors Apr 17, 2025
@zhaohuabing zhaohuabing force-pushed the status-errors branch 3 times, most recently from 18a6b48 to 4267457 Compare April 17, 2025 01:07
@zhaohuabing zhaohuabing marked this pull request as ready for review April 17, 2025 01:08
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 17, 2025

Codecov Report

Attention: Patch coverage is 62.03390% with 112 lines in your changes missing coverage. Please review.

Project coverage is 65.15%. Comparing base (096cb8d) to head (19fdade).
Report is 33 commits behind head on main.

Files with missing lines Patch % Lines
internal/gatewayapi/status/error.go 0.00% 88 Missing ⚠️
internal/gatewayapi/route.go 86.73% 12 Missing and 1 partial ⚠️
internal/gatewayapi/validate.go 92.50% 5 Missing and 1 partial ⚠️
internal/gatewayapi/backend.go 80.95% 4 Missing ⚠️
internal/gatewayapi/listener.go 50.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5747      +/-   ##
==========================================
- Coverage   65.19%   65.15%   -0.05%     
==========================================
  Files         214      217       +3     
  Lines       34321    34865     +544     
==========================================
+ Hits        22377    22717     +340     
- Misses      10591    10753     +162     
- Partials     1353     1395      +42     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@zhaohuabing zhaohuabing requested a review from arkodg April 17, 2025 01:17
@zhaohuabing zhaohuabing force-pushed the status-errors branch 3 times, most recently from 07edb2a to 536d226 Compare April 17, 2025 01:28
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
@zhaohuabing zhaohuabing requested a review from shawnh2 April 17, 2025 04:36
mergeSlashes: true
port: 10080
routes:
- directResponse:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesnt look right

Copy link
Copy Markdown
Member Author

@zhaohuabing zhaohuabing Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is actually irrelevant to the status: I found that httproute-with-multiple-gateways-from-different-ns.in.yaml is invalid, missing the backend, so I added it.

The purpose of this test is to verify we have the right namespace for the parent Gateway in the route status, not 500 direct response, so it should have valid backends in it.

see: #4592

@arkodg
Copy link
Copy Markdown
Contributor

arkodg commented Apr 17, 2025

@zhaohuabing can we try to make sure only the status messages are updated in the test file and not the reasons or conditions, we'll need to revisit the entire status spec again if changes are made In that area

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
message: 'Failed to process route rule 0 backendRef 0: no ca found in configmap
no-ca-cmap.'
reason: InvalidBackendTLS
status: "False"
Copy link
Copy Markdown
Member Author

@zhaohuabing zhaohuabing Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should set ResolvedResf to false here, since according to the Gateway API, "ResolvedRefs which is set to true when all references to other resources have been successfully resolved."

Surfacing the TLS-related failures to Route status will help users - it gives them a single place to understand why a Route isn't working, rather than having to dig through all the related resources.

Reference: https://gateway-api.sigs.k8s.io/geps/gep-1364/#all-conditions-positive

secret for client auth: envoy-gateway-system/client-auth-not-found specified
in EnvoyProxy envoy-gateway-system/test.'
reason: InvalidBackendTLS
status: "False"
Copy link
Copy Markdown
Member Author

@zhaohuabing zhaohuabing Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

- lastTransitionTime: null
message: 'backend Service validation failed: TCP Port 8081 not found on Service
default/service-1'
message: TCP Port 8081 not found on Service default/service-1
Copy link
Copy Markdown
Member Author

@zhaohuabing zhaohuabing Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to handle this in a follow-up PR, and use the same message format as HTTPRoute.

message: Resolved all the Object references for the Route
reason: ResolvedRefs
status: "True"
type: ResolvedRefs
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behavior follows the existing logic: Even though the Accepted condition is false, the ResolveRefs can be true if all the Backends are successfully resolved.

Existing tests:

- conditions:
- lastTransitionTime: null
message: There are no ready listeners for this parent ref
reason: NoReadyListeners
status: "False"
type: Accepted
- lastTransitionTime: null
message: Resolved all the Object references for the Route
reason: ResolvedRefs
status: "True"
type: ResolvedRefs
controllerName: gateway.envoyproxy.io/gatewayclass-controller

@zhaohuabing zhaohuabing requested a review from arkodg April 18, 2025 03:13
Secret is not located in the same namespace as Envoyproxy. Secret namespace:
envoy-gateway-user-ns does not match Envoyproxy namespace: envoy-gateway-system.'
reason: InvalidBackendTLS
status: "False"
Copy link
Copy Markdown
Member Author

@zhaohuabing zhaohuabing Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

type: Accepted
- lastTransitionTime: null
message: Mixed endpointslice address type between backendRefs is not supported
reason: ResolvedRefs
Copy link
Copy Markdown
Member Author

@zhaohuabing zhaohuabing Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

type: Accepted
- lastTransitionTime: null
message: mixed endpointslice address type for the same backendRef is not supported
reason: ResolvedRefs
Copy link
Copy Markdown
Member Author

@zhaohuabing zhaohuabing Apr 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
type: Accepted
- lastTransitionTime: null
message: Mixed endpointslice address type between backendRefs is not supported
reason: ResolvedRefs
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

reason: UnsupportedValue
status: "False"
type: Accepted
- lastTransitionTime: null
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
@zhaohuabing
Copy link
Copy Markdown
Member Author

zhaohuabing commented Apr 18, 2025

@zhaohuabing can we try to make sure only the status messages are updated in the test file and not the reasons or conditions, we'll need to revisit the entire status spec again if changes are made In that area

@arkodg I’ve reverted the standard status reasons back to their original values. A few non-standard ones have been updated to better reflect the specific errors — but if you prefer keeping the older ones for consistency, I’m happy to revert those as well.

@zhaohuabing zhaohuabing requested review from a team April 18, 2025 07:29
name: policy-btls/envoy-gateway-ca
sni: example.com
weight: 1
directResponse:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did this change ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes tests inputs in the Gateway API translator tests. Some tests were missing services and endpointSlices resources, which led to 500 errors in the output files. This PR adds the missing services and endpointSlices to ensure the expected behavior.

type: Accepted
- lastTransitionTime: null
message: Service envoy-gateway/service-1 not found
reason: BackendNotFound
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like you are intentionally cleaning up / fixing these tests ? are these tests for the happy path or error path ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this same as #5747 (comment) ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I fixed these tests, which are missing services and endpoints in their inputs.

Fixes tests inputs in the Gateway API translator tests. Some tests were missing services and endpointSlices resources, which led to 500 errors in the output files. This PR adds the missing services and endpointSlices to ensure the expected behavior.

mergeSlashes: true
port: 10080
routes:
- directResponse:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did these get removed ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes tests inputs in the Gateway API translator tests. Some tests were missing services and endpointSlices resources, which led to 500 errors in the output files. This PR adds the missing services and endpointSlices to ensure the expected behavior.

Copy link
Copy Markdown
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks

@arkodg arkodg requested review from a team April 23, 2025 19:27
Copy link
Copy Markdown
Contributor

@guydc guydc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zhaohuabing zhaohuabing merged commit a80f5f8 into envoyproxy:main Apr 24, 2025
25 checks passed
@zhaohuabing zhaohuabing deleted the status-errors branch April 24, 2025 00:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants