Skip to content

hcm: log before onDestroy#12850

Merged
snowp merged 5 commits intoenvoyproxy:masterfrom
kyessenov:log_before_destroy
Aug 31, 2020
Merged

hcm: log before onDestroy#12850
snowp merged 5 commits intoenvoyproxy:masterfrom
kyessenov:log_before_destroy

Conversation

@kyessenov
Copy link
Copy Markdown
Contributor

@kyessenov kyessenov commented Aug 27, 2020

Signed-off-by: Kuat Yessenov kuat@google.com

Commit Message: call log() before onDestroy(), as well as populate final stream info details at the time of log() rather than the deferred destruction.

Additional Description:
Risk Level: low (might affect timings of events)
Testing: unit
Docs Changes: none
Release Notes: none
Fixes #12616
Fixes #12840

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
if (connection_manager_.codec_->protocol() < Protocol::Http2) {
// For HTTP/2 there are still some reset cases where details are not set.
// For HTTP/1 there shouldn't be any. Regression-proof this.
ASSERT(filter_manager_.streamInfo().responseCodeDetails().has_value());
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snowp @alyssawilk I suspect there is some race here with onEvent(RemoteClose) and deferred delete. We may not be able to detect remote close if we log immediately at the end of the stream. Any thoughts?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to move logs around, think I'd like this assert removed to before we log (we want to make sure details are set before stream info is shipped to access logs)

I don't think we should have to set DownstreamRemoteDisconnect in two places, and I'm not sure what race there would be. If we get a normal end_stream set, the stream should finish inline and not be lingering around to pick up and I'd think any subsequent disconnects wouldn't be registered?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting test failures if I don't update stream info in the destructor. Might be a test issue again. Is it possible to get onEvent(RemoteClose) once stream is scheduled to be defer-deleted? If not, then it is likely just another unrealistic test.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I'd guess it's the tests being badly ordered but you should be able to check with gdb or envoy logs.

Copy link
Copy Markdown
Contributor Author

@kyessenov kyessenov Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean with the race is the following sequence:

  1. Socket is read ready, data read, calling onData() and and encodeHeaders() in HCM.
  2. encodeHeaders() ends stream, triggers log().
  3. Socket is remote-closed.

It seems that we need to process the file events before triggering log() but I'm not sure how to hold onData() for that.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, in that case in production I think the details would basically be set on encodeHeaders (like the router will when proxying from upstream, or the HCM will when sendLocalReply is called) so maybe switch to that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining. That works indeed.

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
@kyessenov kyessenov marked this pull request as ready for review August 28, 2020 16:14
@kyessenov
Copy link
Copy Markdown
Contributor Author

Windows failure looks like a flake in ext_authz test. Can someone re-kick windows CI?

@snowp
Copy link
Copy Markdown
Contributor

snowp commented Aug 28, 2020

/retest

should work for Azure pipelines now

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines, to retry CircleCI checks, use /retest-circle.
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #12850 (comment) was created by @snowp.

see: more, trace.

Copy link
Copy Markdown
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks!
Tagging Matt for cross-company approval and cc @snowp as this might help the HCM ownership tangle (at least would in the original PR)

@alyssawilk
Copy link
Copy Markdown
Contributor

Oh actually already tagged for Snow, so I'll leave that as is :-P

Copy link
Copy Markdown
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@snowp snowp merged commit 4f98ea6 into envoyproxy:master Aug 31, 2020
clarakosi pushed a commit to clarakosi/envoy that referenced this pull request Sep 3, 2020
call log() before onDestroy(), as well as populate final stream info details at the time of log() rather than the deferred destruction.

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Clara Andrew-Wani <candrewwani@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants