worker: provide removeFilterChain interface #10528
mattklein123 merged 72 commits into envoyproxy:master
Conversation
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Please let me know when this is ready for review. Also, this is still a huge and complicated PR, so if it can be split further please do. If it can't be split, the description should be updated to be about 10x as long as it is so I understand what this PR does. Thanks!
/wait

@mattklein123 It's ready for review. I added a description of what exactly this PR addresses. It's still long, but it's not that hard to review.
mattklein123 left a comment
Looks good at a high level, thanks. Flushing out a first round of comments. Will take another pass once these are actioned on. Thank you!
/wait
    return;
  }
}
ASSERT(false, absl::StrCat("fail to replace tcp listener ", config.name(), " tagged ",
I would replace with NOT_REACHED if the code should actually not reach here. Should this ever hit?
I'm not sure... I was thinking about a listener update on top of a listener socket bind error, where the worker cannot listen.
I feel the removal and the listener update are not ordered.
WDYT?
}
// since is_deleting_ is on, we need to manually remove the map value and drive the iterator

// Defer delete connection container to avoid race condition in destroying connection
What race condition? Do you mean a stack unwind issue? Clarify?
Why does this feel like déjà vu...
Same reason as in ActiveTcpListener::removeConnection(ActiveTcpConnection& connection)?
- The connection is defer-deleted.
- The connection has a reference to the connection container.
So the connection container needs to be defer-deleted as well.
Thanks! Will update later today.
lambdai left a comment
Thanks! Addressed most of the comments.
I will update the deferred task class in this round.
// Fallback to iterate over all listeners. The reason is that the target listener might have begun
// another update and the previous tag is lost.
The drain takes time, so there might be 100 removeFilterChains() calls scheduled, each with its own listener tag (1 to 100).
However, all 100 removals could refer to the same ActiveTcpListener, which now has listener tag 101.
When the filter chains scheduled under listener tag 1 are being removed in this function, although the listener's tag is now 101, we still need to remove those filter chains from listener 101.
Does it make sense?
void ConnectionHandlerImpl::ActiveTcpListener::removeFilterChains(
    const std::list<const Network::FilterChain*>& draining_filter_chains) {
  // Need to recover the original deleting state
  bool was_deleting = is_deleting_;
while (!connections.empty()) {
  connections.front()->connection_->close(Network::ConnectionCloseType::NoFlush);
}
// since is_deleting_ is on, we need to manually remove the map value and drive the iterator
void ConnectionHandlerImpl::removeFilterChains(
    const Network::DrainingFilterChains& draining_filter_chains, std::function<void()> completion) {
  for (auto& listener : listeners_) {
    // TODO(lambdai): merge the optimistic path and the pessimistic locking.
The optimistic lookup hopes to locate the listener by tag. The pessimistic path goes through each listener and executes the removal. The idea is that the listener tag in the argument could be a stale value: upon filter chain removal, that listener may have a new listener tag.
The idea is that the filter chain (and its connections) are defer-deleted.
We don't want to invoke the completion while a connection is merely in the defer-delete list but not yet destroyed.
As a hint: the completion will delete the filter chain and the factory context, and the connection would then hold a reference to a destroyed factory context.
mattklein123 left a comment
Sorry for the delay. In general this LGTM but I have a few more comments/questions. Thank you!
/wait
Kind of, but this is pretty confusing as I need to page back in how all of the tagging stuff works, when a tag changes, etc. Can you add quite a few more comments here to see if that helps me? Should the tag perhaps not update if we are doing a filter-chain-only update? I think that would make things a lot simpler?
lambdai left a comment
@mattklein123 Thank you! I will update the PR.
I replied to the comment. Let me know if it helps!
Could we leave the two loops here for now with a TODO, and resolve it in the next PR?
If the listener manager reuses the tag value, the pessimistic loop below won't be hit.
Personally I feel the listener tag is helpful when I triage listener updates.
COUNTER(listener_modified) \
COUNTER(listener_removed) \
COUNTER(listener_stopped) \
GAUGE(total_filter_chains_draining, NeverImport) \
Please make a note to document this when you do the docs, release notes, etc. in the final change. Thank you!
* worker: provide removeFilterChain interface (envoyproxy#10528)
* cherry-pick in-place filter chain update
* fix conflicts
Description:
Follow up on #10491.
By adding removeFilterChain and a parameter in addListenerToWorker, Worker and ConnectionHandler now provide the functionality required by the future ListenerManager and ListenerImpl for the intelligent listener update path.
The background is that #10491 enabled shared filter chains among ListenerConfigs, and this PR is how ConnectionHandler can manipulate a tcp listener at filter chain granularity.
A typical flow is:
1. addListenerToWorker
2. Now the new connections will see the new filter chain set.
3. removeFilterChain
In this PR the added flow can only be triggered by two helper functions. Those two helper functions will be deleted in the next PR when a new shape of ListenerImpl is introduced.
Risk Level: LOW
Testing:
Docs Changes:
Release Notes: