Skip to content

v1.8 backports 2020-07-23#12627

Merged
christarazi merged 42 commits intov1.8from
pr/v1.8-backport-2020-07-23
Jul 23, 2020
Merged

v1.8 backports 2020-07-23#12627
christarazi merged 42 commits intov1.8from
pr/v1.8-backport-2020-07-23

Conversation

@pchaigno
Copy link
Copy Markdown
Member

@pchaigno pchaigno commented Jul 23, 2020

Once this PR is merged, you can update the PR labels via:

$ for pr in 12589 12568 12607 12556 12615 12618 12623 12605 12606; do contrib/backporting/set-labels.py $pr done 1.8; done

pchaigno and others added 30 commits July 23, 2020 16:13
[ upstream commit b9496e1 ]

7796070 removed bpf_lb.o and any LB state transfers across nodes for
DSR.

Fixes: 7796070 ("bpf: Remove bpf_lb.c")
Reported-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 7930f74 ]

This small change should clarify that the identity transfer in overlay
mode is an optimization and not a requirement. We have had several users
confused by this paragraph asking how security identity was obtained in
direct routing mode.

Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 0591d63 ]

Adding the "datapath" to --debug-verbose enables debug messages produced
by the datapath.

Previously, it could have been enabled for new endpoints by invoking
"cilium config Debug{,LB}=Enabled" on each node, and for existing -
by "cilium config $EID Debug{,LB}=Enabled".

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 1173f18 ]

Add a note to the upgrade guide how to enable datapath debug messages
which were disabled by [1] (released in v1.8.0).

[1]: 583e67c

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 599c3af ]

In certain environments, the argument list might become large enough so
`make bench` or `make bench-privileged` fail with:

  make: execvp: /bin/bash: Argument list too long

Follow the approach from commit cfa665d ("Explicitly break
unit-tests foreach into for loop.") for the `bench` and
`bench-privileged` targets as well to fix this.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 46818da ]

This is a first step that will allow implementing unit tests for the
relay package. A new interface, `peerSyncer`, is defined and implemented
by a `syncer` struct. Having the interface will allow creating a mock
for testing purposes.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 207da71 ]

This provides a better abstraction as the client that calls `List()` is
expected to receive a list of peers that are "ready" to connect to and
does not have to manage re-connect.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 5d9eec6 ]

This commit is an in-depth reimplementation of the peers
synchronization/connections handling logic. All logic regarding
management of peers and connections to them has been moved to a separate
package with a clearly defined interface for the consumer
(`pool.PeerManager`).

The goal here is threefold:

  1. Separation of concerns (by hiding all internal peer handling logic
     into a dedicated package)
  2. Increased testability
  3. Improved connections handling

The consumer of the `pool` package does not have to care whether a peer
is connected or not anymore. If a peer is not connected, it's `Conn`
attribute will be `<nil>`. If `Conn` is not `<nil>` but the peer cannot
be reached, the consumer can signal it by using the manager's
`ReportOffline(name string)` method.

The new `pool.Manager` (which implements the `pool.PeerManager`
interface) also implements a better mechanism to ensure peers connection
are healthy by using a dedicated routine that monitors for
offline/reported offline peers and attempts to (re-)connect to them.
This improves the situation as with the older implementation,
re-connecting to a peer required a new call to the Hubble Relay service
(and the peers that were "woken up" were not available for the request).

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit e4e5fcd ]

Such interface are useful to create good abstractions with code that
require a Peer service client. They can also easily be mocked which is
essential to write good unit tests.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit c61fb77 ]

LocalClientBuilder creates a new Client which is suitable when
the gRPC connection to the Peer service is local (typically a Unix
Domain Socket).
This scenario is the most common one and is directly useful to Hubble
Relay.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 5168499 ]

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit f2a8f31 ]

This commits defines an interface for a gRPC client connection builder.
When instantiating a new pool.Manager, the caller may specify a grpc
client connection builder implementation to be used to create connection
to Hubble peers. This allows for fine grained control over the client
connection parameters by the caller. In addition, as this is an
interface, it can easily be mocked which is useful when writing unit
tests.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 5be8aca ]

With the rework of peer connections handling, a goroutine periodically
checks whether peer connections are healthy or not. This check is
necessary as there are scenarios where a connection can be lost without
any peer service notification or client request (for instance, a node
reboot).

This commit improves reconnection attempts by implementing a backoff
mechanism so attempts are not too frequent. There is a default backoff
mechanism that can be configured by caller when instantiating a new
pool.Manager.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 3120dc1 ]

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 24f65cc ]

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 35471ca ]

On peer change notifications or call to
`pool.Manager.ReportOffline(name string)`, the backoff interval should
be ignored to re-attempt a connection. This change typically addresses
the following scenario:

  1. Node Foo is offline
  2. The connectivity checker attempts to reconnect several time; each
    time the backoff is increased.
  3. Node Foo becomes available; a peer change notification is received
  4. Oops, no connection attempt is made to Foo because of a large
     backoff duration

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 08e7f3d ]

This is not necessary as the connect function closes pre-existing
connections.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 2064bdc ]

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit bee30fa ]

This is necessary to avoid import cycles when implementing fakes for the
peer interfaces in the `testutils` package.

Note that the `types` sub-package is commonly found throughout Cilium's
codebase so this change follows the convention.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit df12d9e ]

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 304bced ]

This will allow mocking gRPC clients which is useful for unit testing.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 7b42b5f ]

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 065ec3d ]

These unit tests are not exhaustive although they still provide a decent
coverage. However, it provides a good structure to implement more tests
in the future such as complex peer connectivity issue scenarios or
simply regression tests for bugs that are fixed.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 7b7845a ]

This commit removes `Target()` from the peer `ClientBuilder` interface
and, instead, adds the target as a parameter to `Client()`.

As a consequence, Hubble Relay code is adapted to the change. The peer
pool manager now has a new option to provide the address of the peer
gRPC service.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 029ca1f ]

The debug option was only used to set the log level of the logger. As
the caller now has control over the logger to be used, this option is no
longer necessary.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 4297a79 ]

This change makes tests more reliable and is cleaner at the same time as
once the function returns, the pool manager is really shut down.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 4d557d9 ]

This commit tests the pool manager logger output for certain tests. This
change revealed some flakiness in the tests that have been addressed.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 6404664 ]

The peers map is expected to be more frequently read than written to,
especially with the `List()` method being exposed and used by Hubble
Relay for every request it handles. Therefore, a RWLock is more
approriate than a regular lock.

Suggested-by: André Martins <andre@cilium.io>
Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 9fdcca2 ]

The name `pool` clashes with the package `pool`. It is not a problem per
se but creates a bit of confusion so rename it to `pm`.

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 37af0ea ]

The `access-log` option was removed in 1.8, so move it from the
"Deprecated options" section to the correct "Removed options" section in
the upgrade guide.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Paul Chaignon <paul@cilium.io>
tgraf and others added 7 commits July 23, 2020 16:15
[ upstream commit 794c34b ]

Commit 19ad311 incorrectly changed the return value to nil instead of
returning an error.

Fixes: 19ad311 ("kvstore: Fix Watch() to return when client is closed or context is cancelled")
Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit bf8e432 ]

The firstSession channel has been used to allow blocking until the initial
session has been established. However, because callers have not been handling
errors, it was not possible to close the channel if the etcd client was shut
down before the session was ever established and thus the channel was leaked.

This was even more problematic for etcd operations performed without a context
with a timeout as most operations currently block on firstSession or context.

Consolidate the errChan and firstSession as they effectively served the same
purpose. The only slight change is that the firstSession channel is closed
*after* the etcd version has been verified which is the right behavior anyway.

Having the error condition returned via the firstSession channel allows to
return it to owning controllers as well when renewing sessions for better
visibility.

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit e959c9f ]

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 589bdf3 ]

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 02a1810 ]

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit 0262854 ]

The context passed into NewSession() was supposed to enforce a timeout on the
NewSession() operation which is triggering a Grant() and KeepAlive()
instruction. However, the context passed into NewSession() will also be
associated with the resulting lease. As per etcd documentation, this will
result in:

> If the context is canceled before Close() completes, the session's lease will
> be abandoned and left to expire instead of being revoked.

Because of this, any session renewal triggering a new session would create a
session that is immediately closed again due to the context passed into
NewSession being cancelled when the controller run ends successfully. This
resulted in any renewed session to have an effective lifetime of 10
milliseconds before requiring renewal again.

Fixes: #12619
Fixes: 8524fca ("kvstore: Add session renew backoff")

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Paul Chaignon <paul@cilium.io>
[ upstream commit fff58ef ]

Follow-up to #12305, where we raised the timeout from 100ms to 500ms.
This seemed to suppress most of the flakes reported in #12042, but we
saw one again recently: Try restoring the timeout value to its original
value of 1 second.

Most of the time the RTT time for the exchange is way below 100ms anyway
and we won't have a difference on tests duration. In the worst and very
unlikely case where all DNS TCP exchanges are super-slow, we only have 5
exchanges in the tests and cannot spend more than a total 5 seconds on
them (or one would timeout and the test fail).

Fixes: #12042
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno pchaigno requested a review from a team as a code owner July 23, 2020 14:18
@pchaigno pchaigno added backport/1.8 kind/backports This PR provides functionality previously merged into master. labels Jul 23, 2020
Copy link
Copy Markdown
Member

@qmonnet qmonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • #12606 -- fqdn/dnsproxy/proxy_test: increase again timeout for DNS TCP exchanges (@qmonnet)

Looks good 👍

Copy link
Copy Markdown
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for my changes

Copy link
Copy Markdown
Member

@rolinh rolinh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for my changes as well 👍

Copy link
Copy Markdown
Member

@brb brb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for my changes.

As per the discussions from #12493,
we should not increment the v1.8 schema version to 1.22 as this would
conflict with early currently v1.9 cycle. Fix it to 1.21.1 instead.

Fixes: 336791f ("Fix small CRD issue with toGroups")
Signed-off-by: André Martins <andre@cilium.io>
@pchaigno
Copy link
Copy Markdown
Member Author

test-backport-1.8

@christarazi christarazi self-assigned this Jul 23, 2020
@aanm
Copy link
Copy Markdown
Member

aanm commented Jul 23, 2020

test-upstream-k8s

@pchaigno pchaigno removed the request for review from tgraf July 23, 2020 21:09
@maintainer-s-little-helper maintainer-s-little-helper Bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jul 23, 2020
@christarazi christarazi merged commit 2c595e3 into v1.8 Jul 23, 2020
@christarazi christarazi deleted the pr/v1.8-backport-2020-07-23 branch July 23, 2020 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/backports This PR provides functionality previously merged into master. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants