Skip to content

feat(translator): Implement BTP Timeout API#2454

Merged
arkodg merged 1 commit intoenvoyproxy:mainfrom
guydc:impl-btp-timeout
Jan 29, 2024
Merged

feat(translator): Implement BTP Timeout API#2454
arkodg merged 1 commit intoenvoyproxy:mainfrom
guydc:impl-btp-timeout

Conversation

@guydc
Copy link
Copy Markdown
Contributor

@guydc guydc commented Jan 16, 2024

What this PR does / why we need it:
Implement the BackendTrafficPolicy Timeout API

Which issue(s) this PR fixes:
Fixes #2401

@guydc guydc requested a review from a team as a code owner January 16, 2024 23:53
@codecov
Copy link
Copy Markdown

codecov bot commented Jan 16, 2024

Codecov Report

Attention: 84 lines in your changes are missing coverage. Please review.

Comparison is base (8d19d77) 64.70% compared to head (ed5ca57) 64.61%.
Report is 1 commits behind head on main.

Files Patch % Lines
internal/ir/zz_generated.deepcopy.go 0.00% 61 Missing and 1 partial ⚠️
internal/gatewayapi/backendtrafficpolicy.go 86.41% 7 Missing and 4 partials ⚠️
internal/gatewayapi/route.go 57.69% 9 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2454      +/-   ##
==========================================
- Coverage   64.70%   64.61%   -0.09%     
==========================================
  Files         116      116              
  Lines       17613    17797     +184     
==========================================
+ Hits        11396    11500     +104     
- Misses       5486     5560      +74     
- Partials      731      737       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use lower case here to make the func private - setBackend....

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer if this was renamed to Timeout and we moved the existing Timeout field into it and renamed it to RequestTimeout under HTTP

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two duration should be checked (whether is nil ptr) before assign values toIdleTimeout and MaxConnectionDuration, nor it will raise a nil pointer panic.

http:
- name: "first-listener"
  #...
  routes:
  - name: "first-route"
    backendTimeout:
      tcp:
        connectTimeout: "31s"
      http:
        connectionIdleTimeout: "32s"
        #maxConnectionDuration: "33s" # paniced
    #...

Copy link
Copy Markdown
Contributor

@shawnh2 shawnh2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks

@zirain zirain requested a review from arkodg January 23, 2024 03:08
Copy link
Copy Markdown
Member

@zhaohuabing zhaohuabing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation LGTM, thanks!

A minor comment on the name of Timeout structure in the API: we may want to change it to BackendTimeout because the TCP ConnectTimeout can't be reused for the future client-side Timeouts.

// BackendTimeout defines configuration for timeouts related to the backend.
type BackendTimeout struct {
	// Timeout settings for TCP.
	//
	// +optional
	TCP *TCPTimeout `json:"tcp,omitempty"`

	// Timeout settings for HTTP.
	//
	// +optional
	HTTP *HTTPTimeout `json:"http,omitempty"`
}

@guydc
Copy link
Copy Markdown
Contributor Author

guydc commented Jan 23, 2024

A minor comment on the name of Timeout structure in the API: we may want to change it to BackendTimeout because the TCP ConnectTimeout can't be reused for the future client-side Timeouts.

Can we translate this to transport socket timeout for downstream connections? If so, it can be reused...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: policy is computed/translated post route translation, so we can assume its empty here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to this to be safe in case there's a re-arrangement of the computation order. If you feel that we can make this assumption, i'll update the implementation accordingly.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, if thats the case, can you also add a comment here, so the next dev can understand why this done this way similar to the comment you have in policy around request timeout

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should log error in status and continue processing instead of returning early here

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happened here, why did 5s get deleted ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test is broken, probably needs below config

timeout:
  http:
    requestTimeout: 5s

@zhaohuabing
Copy link
Copy Markdown
Member

A minor comment on the name of Timeout structure in the API: we may want to change it to BackendTimeout because the TCP ConnectTimeout can't be reused for the future client-side Timeouts.

Can we translate this to transport socket timeout for downstream connections? If so, it can be reused...

Not sure about it, they look similar but not exactly the same. Are there any other possible different settings between client and backend timeouts?

@arkodg @zirain WDYT?

@arkodg
Copy link
Copy Markdown
Contributor

arkodg commented Jan 26, 2024

A minor comment on the name of Timeout structure in the API: we may want to change it to BackendTimeout because the TCP ConnectTimeout can't be reused for the future client-side Timeouts.

Can we translate this to transport socket timeout for downstream connections? If so, it can be reused...

Not sure about it, they look similar but not exactly the same. Are there any other possible different settings between client and backend timeouts?

@arkodg @zirain WDYT?

the go struct name can be changed anytime w/o breaking YAML API compatibility so no strong comments for now from my side

@guydc
Copy link
Copy Markdown
Contributor Author

guydc commented Jan 26, 2024

Regarding E2E test for this feature: echo server does support a /delay endpoint. However, since these settings are per-connection and not per-request, timed-out connections will not terminate active delayed requests. In other words, it's difficult to demonstrate a connection timeout on a specific request.

  • idle timeout: "The idle timeout is defined as the period in which there are no active requests."
  • max connection duration: "When max_connection_duration is reached and if there are no active streams, the connection will be closed"

One way to verify timeout settings with an e2e test is to trigger some upstream connections, wait for timeouts to occur and then check the relevant envoy cluster metrics.

I propose to handle the e2e test in a dedicated issue.

@arkodg
Copy link
Copy Markdown
Contributor

arkodg commented Jan 26, 2024

sure @guydc lets raise seperate GH issues for e2e and docs and tackle those in different PRs

arkodg
arkodg previously approved these changes Jan 26, 2024
Copy link
Copy Markdown
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks !

Signed-off-by: Guy Daich <guy.daich@sap.com>
@guydc
Copy link
Copy Markdown
Contributor Author

guydc commented Jan 29, 2024

/retest

Copy link
Copy Markdown
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks !

@arkodg arkodg merged commit f1a0a42 into envoyproxy:main Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support upstream HTTP connection timeouts

4 participants