Skip to content

router: fixing inline write fails for upstream connections#19221

Merged
alyssawilk merged 3 commits intoenvoyproxy:mainfrom
alyssawilk:upstrea_socket_fail
Jan 4, 2022
Merged

router: fixing inline write fails for upstream connections#19221
alyssawilk merged 3 commits intoenvoyproxy:mainfrom
alyssawilk:upstrea_socket_fail

Conversation

@alyssawilk
Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk commented Dec 8, 2021

Changing the router code such that we do deferred delete of the upstream request any time we might be under the stack of the upstream request. the upstream request is still deleted inline for timeouts and other alarms.
Note only HTTP/3 currently does inline writes, so this fixes a currently-http3-only issue

Risk Level: high (changes lifetime of upstream requests)
Testing: new integration test
Docs Changes: n/a
Release Notes: n/a
Runtime guard: envoy.reloadable_features.allow_upstream_inline_write

@alyssawilk
Copy link
Copy Markdown
Contributor Author

/wait

@alyssawilk alyssawilk self-assigned this Dec 8, 2021
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk alyssawilk changed the title test: adding upstream socket fail test router: fixing inline write fails for upstream connections Dec 9, 2021
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk alyssawilk assigned snowp and unassigned alyssawilk Dec 13, 2021
@alyssawilk
Copy link
Copy Markdown
Contributor Author

Hey Snow, I'm throwing this your way as Matt's out.
I am hoping to land this PR this week because it has some test infra I want to use elsewhere, but given IMO the lifetime change is pretty high risk and we're not sure if we're cutting the release in under 2 weeks, I'm inclined to get an LGTM (making sure all changes pass on all CI with the flag on) then flip the default false right before landing with a TODO to flip true after the release.

Copy link
Copy Markdown
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just so I understand, what is the motivation for this change? The runtime flag talks about inline writes so I'm guessing it's related to that?

The code seems right to be if we're trying to eagerly call the clean up logic behind the upstream requests, just curious why we want this :)

@alyssawilk
Copy link
Copy Markdown
Contributor Author

Ah, so when TCP does writes, it puts data in the connection buffer, waits for an event, and then writes on the next epoll loop. for UDP we just write directly to the socket. If the write fails, we close the connection under the stack of the write, and here we delete the upstream request while doing work in the upstream request, resulting in a use-after-free. This adds a (really synthetic) test of writing fail due to an injected error, and fixes up the code so we don't have a use-after-free.
Some day I'd like to fix TCP to do inline writes because from a perf perspective it's pretty helpful (cc @wbpcode) but before we think of doing that we need to actually handle inline writes correctly which this PR does (and regression tests)

Copy link
Copy Markdown
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the explanation!

@alyssawilk alyssawilk merged commit 84e97d7 into envoyproxy:main Jan 4, 2022
joshperry pushed a commit to joshperry/envoy that referenced this pull request Feb 13, 2022
…y#19221)

Changing the router code such that we do deferred delete of the upstream request any time we might be under the stack of the upstream request. the upstream request is still deleted inline for timeouts and other alarms.
Note only HTTP/3 currently does inline writes, so this fixes a currently-http3-only issue

Risk Level: high (changes lifetime of upstream requests)
Testing: new integration test
Docs Changes: n/a
Release Notes: n/a
Runtime guard: envoy.reloadable_features.allow_upstream_inline_write
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Josh Perry <josh.perry@mx.com>
@alyssawilk alyssawilk deleted the upstrea_socket_fail branch August 4, 2022 01:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants