worker: provide removeFilterChain interface#10528

Merged
mattklein123 merged 72 commits into envoyproxy:master from lambdai:fcd-on-ctx
Apr 16, 2020

Conversation

@lambdai
Contributor

@lambdai lambdai commented Mar 26, 2020

Description:
Follow up on #10491

By adding removeFilterChain and a new parameter to addListenerToWorker,
Worker and ConnectionHandler now provide the functionality required
by the future ListenerManager and ListenerImpl for an intelligent listener update path.

The background is that #10491 enabled sharing filter chains among ListenerConfigs, and this PR
is how ConnectionHandler can manipulate a TCP listener at filter chain granularity.

A typical flow is

  1. The ListenerConfig with the new filter chain set replaces the previous one through addListenerToWorker; new connections now see the new filter chain set.
  2. The filter chains that need to be destroyed are calculated and start draining.
  3. At the drain timeout, the filter chains are force-destroyed through removeFilterChain.
  4. Finally, the old ListenerConfig is destroyed.

In this PR the added flow can only be triggered by

ListenerManagerImpl::addListenerToWorkerForTest
ListenerManagerImpl::drainFilterChainsForTest

Those two helper functions will be deleted in the next PR, when a new shape of ListenerImpl is introduced.

Risk Level: LOW

  1. The newly added removeFilterChain is not used except in test cases.
  2. addListenerToWorker gains a parameter (an optional overridden listener), but production code always passes nullopt, which falls through to the original code path.

Testing:
Docs Changes:
Release Notes:
[Optional Fixes #Issue]
[Optional Deprecated:]

lambdai added 30 commits January 8, 2020 19:59
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
@mattklein123
Member

Please let me know when this is ready for review. Also, this is still a huge and complicated PR, so if it can be split further please do. If it can't be split, the description should be updated to be about 10x as long as it is so I understand what this PR does. Thanks!

/wait

@lambdai
Contributor Author

lambdai commented Apr 7, 2020

@mattklein123 It's ready for review. I added a description of what exactly this PR addresses.

It's still long, but it's not that hard to review.
First, most of the one-line fragment changes just reflect the type change of config_ from a reference to a pointer, plus the new overridden_listener parameter.
Then there are the two independent flows: addListener and removeFilterChain.
Finally, DrainingFilterChainsImpl provides the scheduled listener destroy on the master thread and the filter chain destruction on the worker threads.

Member

@mattklein123 mattklein123 left a comment


Looks good at a high level, thanks. Flushing out a first round of comments. Will take another pass once these are actioned on. Thank you!

/wait

return;
}
}
ASSERT(false, absl::StrCat("fail to replace tcp listener ", config.name(), " tagged ",
Member


I would replace with NOT_REACHED if the code should actually not reach here. Should this ever hit?

Contributor Author


I am not sure... I was thinking about a listener update on top of a listener socket bind error, where the worker cannot listen.
I feel the removal and the listener update are not ordered.
WDYT?

}
// since is_deleting_ is on, we need to manually remove the map value and drive the iterator

// Defer delete connection container to avoid race condition in destroying connection
Member


What race condition? Do you mean a stack unwind issue? Clarify?

Contributor Author


Why do I feel déjà vu...
Same reason as in ActiveTcpListener::removeConnection(ActiveTcpConnection& connection)?

  1. The connection is defer-deleted.
  2. The connection holds a reference to the connection container,
    so the connection container needs to be defer-deleted as well.

@lambdai
Contributor Author

lambdai commented Apr 10, 2020

Thanks! Will update later today

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Contributor Author

@lambdai lambdai left a comment


Thanks! Addressed most of the comments.
I will update the deferred task class in the next round.

Comment on lines +82 to +83
// Fallback to iterate over all listeners. The reason is that the target listener might have begun
// another update and the previous tag is lost.
Contributor Author


The drain takes time, so there might be 100 removeFilterChains() calls scheduled, each with its own listener tag (1 to 100).
However, all 100 removals could refer to the same ActiveTcpListener, which by now has listener tag 101.

When the filter chains scheduled under listener tag 1 are being removed in this function, even though the listener tag is now 101, we still need to remove those filter chains from listener 101.

Does it make sense?

void ConnectionHandlerImpl::ActiveTcpListener::removeFilterChains(
const std::list<const Network::FilterChain*>& draining_filter_chains) {
// Need to recover the original deleting state
bool was_deleting = is_deleting_;
Contributor Author


Got it.


while (!connections.empty()) {
connections.front()->connection_->close(Network::ConnectionCloseType::NoFlush);
}
// since is_deleting_ is on, we need to manually remove the map value and drive the iterator
Contributor Author


Ahaha...

void ConnectionHandlerImpl::removeFilterChains(
const Network::DrainingFilterChains& draining_filter_chains, std::function<void()> completion) {
for (auto& listener : listeners_) {
// TODO(lambdai): merge the optimistic path and the pessimistic locking.
Contributor Author


The optimistic lookup hopes to locate the listener by tag. The pessimistic path goes through each listener and executes the removal. The idea is that the listener tag in the argument could be a stale value: by the time the filter chain removal runs, that listener may have a new listener tag.

void ConnectionHandlerImpl::removeFilterChains(
const Network::DrainingFilterChains& draining_filter_chains, std::function<void()> completion) {
for (auto& listener : listeners_) {
// TODO(lambdai): merge the optimistic path and the pessimistic locking.
Contributor Author


The idea is that the filter chain (and its connections) are defer-deleted.
We don't want to invoke the completion while a connection is merely in the defer-delete list but not yet destroyed.
As a hint, the completion will delete the filter chain and the factory context; the connection would then hold a reference to a destroyed factory context.

lambdai added 3 commits April 14, 2020 00:49
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Member

@mattklein123 mattklein123 left a comment


Sorry for the delay. In general this LGTM but I have a few more comments/questions. Thank you!

/wait

Comment on lines +82 to +83
// Fallback to iterate over all listeners. The reason is that the target listener might have begun
// another update and the previous tag is lost.
Member


Kind of, but this is pretty confusing as I need to page back in how all of the tagging stuff works, when a tag changes, etc. Can you add quite a bit more comments here to see if that helps me? Should the tag perhaps not update if we are doing a filter chain only update? I think that would make things a lot simpler?

Contributor Author

@lambdai lambdai left a comment


@mattklein123 Thank you! I will update the PR.
I replied to the comments. Let me know if it helps!

Comment on lines +82 to +83
// Fallback to iterate over all listeners. The reason is that the target listener might have begun
// another update and the previous tag is lost.
Contributor Author


Could we leave the two loops here for now with a TODO, and resolve it in the next PR?
If the listener manager reuses the tag value, the pessimistic loop below won't be hit.
Personally, I feel the listener tag is helpful when I triage listener updates.

Signed-off-by: Yuchen Dai <silentdai@gmail.com>
Member

@mattklein123 mattklein123 left a comment


Thanks, nice work!

COUNTER(listener_modified) \
COUNTER(listener_removed) \
COUNTER(listener_stopped) \
GAUGE(total_filter_chains_draining, NeverImport) \
Member


Please make a note to document this when you do the docs, release notes, etc. in the final change. Thank you!

Contributor Author


ack.
Thank you!

@mattklein123 mattklein123 merged commit ce5953b into envoyproxy:master Apr 16, 2020
lambdai added a commit to lambdai/envoy-dai that referenced this pull request May 1, 2020
Signed-off-by: Yuchen Dai <silentdai@gmail.com>
istio-testing pushed a commit to istio/envoy that referenced this pull request May 12, 2020
* worker: provide removeFilterChain interface (envoyproxy#10528)

Signed-off-by: Yuchen Dai <silentdai@gmail.com>

* cherry-pick in place filter chain update

Signed-off-by: Yuchen Dai <silentdai@gmail.com>

* fix conflicts

Signed-off-by: Yuchen Dai <silentdai@gmail.com>