Skip to content

event: Reduce potential for lock contention while executing dispatcher post callbacks.#14289

Merged
antoniovicente merged 9 commits intoenvoyproxy:masterfrom
antoniovicente:deferred_delete_locking
Dec 9, 2020
Merged

event: Reduce potential for lock contention while executing dispatcher post callbacks.#14289
antoniovicente merged 9 commits intoenvoyproxy:masterfrom
antoniovicente:deferred_delete_locking

Conversation

@antoniovicente
Copy link
Copy Markdown
Contributor

Risk Level: low, minor code refactor
Testing: n/a, minimal functional changes expected.
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]

…r post callbacks.

Signed-off-by: Antonio Vicente <avd@google.com>
@jmarantz jmarantz self-assigned this Dec 4, 2020
Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks great; just a couple of nits. Did we wind up finding any evidence of contention?

if (post_callbacks_.empty()) {
return;
}
callbacks = std::move(post_callbacks_);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think std::move defined clearly what state post_callbacks_ would be in after this statement. So I was thinking maybe swap would be better. WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post_callbacks_ is a list, so I think this is well defined. I could add post_callbacks_.clear(); after the move.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that seems like it would de-risk it. From https://en.cppreference.com/w/cpp/utility/move

Unless otherwise specified, all standard library objects that have been moved from are placed in a valid but unspecified state. That is, only the functions without preconditions, such as the assignment operator, can be safely used on the object after it was moved from: ... str.clear(); // OK, clear() has no preconditions

is that better than swap? I'm fine either way.

I did not see anything about std::list's move semantics in https://en.cppreference.com/w/cpp/container/list

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::swap is defined as 3 move assignments. Added the clear and a comment.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the clear(), I think it would be better to ASSERT(post_callbacks_.empty()). Then if something gets messed up and it somehow became a copy instead of a move (due to c++-shenanigans or something) we'll notice quickly, instead of callbacks not getting run.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just jumping in here... what about:

std::list<std::function<void()>> callbacks = [this]() {
  Thread::LockGuard lock(post_lock_);
  return std::exchange(post_callbacks_, {});
}();

or some variation using std::exchange. The pattern of using exchange is mentioned in reference to an event Dispatcher class in an article on fluentcpp.com.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the article. I think the most relevant section is the "Why not just move?". The issue of expressing the "empty after move" constraint is addressed here by means of the ASSERT(post_callbacks_,empty());

}
// It is important that the execution and deletion of the callback happen while post_lock_ is not
// held. Either the invocation or destructor of the callback can call post() on this dispatcher.
while (!callbacks.empty()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there some reason not to write this as:

for (auto& callback : callbacks) {
  callback();
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Differences in deletion order for the callback objects.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. add that to the comment?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment and test.

I also wanted to test yield behavior, but it turns out apparently ordering of schedulable callback scheduling is not deterministic so the yield behavior is not testable.

Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
jmarantz
jmarantz previously approved these changes Dec 5, 2020
Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@envoyproxy/senior-maintainers ptal

…ocking

Signed-off-by: Antonio Vicente <avd@google.com>
Copy link
Copy Markdown
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the change; much better to not lock/unlock the mutex for each callback separately.

if (post_callbacks_.empty()) {
return;
}
callbacks = std::move(post_callbacks_);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the clear(), I think it would be better to ASSERT(post_callbacks_.empty()). Then if something gets messed up and it somehow became a copy instead of a move (due to c++-shenanigans or something) we'll notice quickly, instead of callbacks not getting run.

// callbacks execute. Callbacks added after this transfer will re-arm post_cb_ and will execute
// later in the event loop.
Thread::LockGuard lock(post_lock_);
if (post_callbacks_.empty()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this case can be deleted; the function behaves correctly without it AFAICT.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

I was thinking of it as a micro optimization but didn't realize that most post_callbacks_ is usually non-empty when we get here.

@ggreenway
Copy link
Copy Markdown
Member

/wait

Signed-off-by: Antonio Vicente <avd@google.com>
Copy link
Copy Markdown
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changed the behavior that if a posted callback post another callback, it will be in next event cycle instead of the current one. Is that intended?

// It is important that the execution and deletion of the callback happen while post_lock_ is not
// held. Either the invocation or destructor of the callback can call post() on this dispatcher.
while (!callbacks.empty()) {
auto& callback = callbacks.front();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be in oneline: callbacks.front().callback()?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the one line version would be:

callbacks.front()();

or

callbacks.front().operator();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, either way, just style nit.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

while (!callbacks.empty()) {
auto& callback = callbacks.front();
callback();
// pop the front so that the destructor of the callback that just executed runs before the next
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: capitalize Pop.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Signed-off-by: Antonio Vicente <avd@google.com>
Copy link
Copy Markdown
Contributor Author

@antoniovicente antoniovicente left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changed the behavior that if a posted callback post another callback, it will be in next event cycle instead of the current one. Is that intended?

Yes, there is a change in post loop behavior. The loop yields after the group of callbacks present is executed. Post callbacks added by the first set of post callbacks execute in the same event loop iteration (thanks to post_cb_->scheduleCallbackCurrentIteration()) but after other loop events have a chance to execute.

This change in behavior is desirable. I wanted to add a test to cover this behavior but I ran into an issue with ordering of post callbacks vs schedulable callbacks. I thought order was deterministic but apparently it isn't. The yield behavior will be testable as a consequence of #14293

// It is important that the execution and deletion of the callback happen while post_lock_ is not
// held. Either the invocation or destructor of the callback can call post() on this dispatcher.
while (!callbacks.empty()) {
auto& callback = callbacks.front();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the one line version would be:

callbacks.front()();

or

callbacks.front().operator();

while (!callbacks.empty()) {
auto& callback = callbacks.front();
callback();
// pop the front so that the destructor of the callback that just executed runs before the next
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

…ocking

Signed-off-by: Antonio Vicente <avd@google.com>
Signed-off-by: Antonio Vicente <avd@google.com>
ggreenway
ggreenway previously approved these changes Dec 8, 2020
Copy link
Copy Markdown
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please wait until @lizan responds before merging.

lizan
lizan previously approved these changes Dec 8, 2020
Copy link
Copy Markdown
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to double-check the behavior change.

No strong preference on the nit, up to you.

// It is important that the execution and deletion of the callback happen while post_lock_ is not
// held. Either the invocation or destructor of the callback can call post() on this dispatcher.
while (!callbacks.empty()) {
auto& callback = callbacks.front();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, either way, just style nit.

Signed-off-by: Antonio Vicente <avd@google.com>
@antoniovicente antoniovicente dismissed stale reviews from lizan and ggreenway via ab34d08 December 9, 2020 00:01
@antoniovicente antoniovicente merged commit bed0262 into envoyproxy:master Dec 9, 2020
mpuncel added a commit to mpuncel/envoy that referenced this pull request Dec 11, 2020
* master:
  buffer: Optimize the layout of Slices in Buffer::OwnedImpl by removing subclassing and storing slice info directly in the SliceDeque (envoyproxy#14282)
  gRPC client to be used by ext_proc filter (envoyproxy#14283)
  http2: Add integration tests for PRIORITY frame flood mitigation for upstream servers (envoyproxy#14328)
  event: touch watchdog before execution of each post callback and before deferred deletion (envoyproxy#14339)
  stale: more allowed ops (envoyproxy#14345)
  stale: more changes (envoyproxy#14344)
  test: TODO fixup making enable_half_close private envoyproxy#14330)
  event: Reduce potential for lock contention while executing dispatcher post callbacks. (envoyproxy#14289)
  stale: fix config (envoyproxy#14337)
  metrics service sink: generalize the sink and grpc streamer for external use (envoyproxy#13919)
  wasm: update V8 to v8.8.278.8. (envoyproxy#14298)
  repo: switch to actions based stale bot (envoyproxy#14335)
  buffer: Use WatermarkFactory to create most WatermarkBuffer instances (envoyproxy#14256)

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants