Update fork from envoyproxy/envoy master#1

Merged
larrywest merged 165 commits into larrywest:master from envoyproxy:master
Apr 23, 2019

@larrywest
Owner

Thanks to @KirstieJane for the steps to do this: https://github.com/KirstieJane/STEMMRoleModels/wiki/Syncing-your-fork-to-the-original-repository-via-the-browser

For an explanation of how to fill out the fields, please see the relevant section
in PULL_REQUESTS.md

Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
[Optional Fixes #Issue]
[Optional Deprecated:]

Dan Rosen and others added 30 commits March 22, 2019 15:48
Introduce a new "safe" init manager, to replace the existing one that's prone to use-after-free issues (see e.g. #6116). Users of the existing init manager will be upgraded one-by-one in subsequent PRs if this design is approved. See also previous false starts in PRs #6136 and #6245.

Risk Level: Low, no existing users of the existing init manager are changed in this PR.
Testing: New unit tests added.
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Dan Rosen <mergeconflict@google.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
We need to think about whether we want to have all of these somehow
reference some type of environment variable that would point to the
right image in the context of the tree the user is looking at, but
given that the trunk documentation may require a master build, this
is more correct.

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Previously, we incremented rq_total and upstream_rq_total in the HTTP/1
conn pool even if the request ended up being circuit broken. The stats
were not incremented for HTTP/2 requests. This change no longer
increments the stats for HTTP/1 circuit broken requests for consistency
between the two.

Signed-off-by: Spencer Lewis <slewis@squareup.com>
Address one TODO in that file: (D)CHECK is not explicitly listed in the platform API, but is supposed to be defined in some impl. Defining them in quic_logging_impl.h seems appropriate.

Risk Level: low, not in use
Part of #2557

Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Yuval Kohavi <yuval.kohavi@gmail.com>
Update some documentation comments in api/envoy/service/auth/v2/*.proto to
more accurately describe the *current* behavior (without making any
judgment on whether that behavior is "correct" or desirable).

Signed-off-by: Luke Shumaker <lukeshu@datawire.io>
This filter decodes the ZooKeeper wire protocol and emits
stats & metadata about requests, responses and events.

This wire protocol parsing is based on:

https://github.com/twitter/zktraffic
https://github.com/rgs1/zktraffic-cpp

The actual filter structure is based on the Mysql proxy filter.

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
Updating per new file locations. Updates (unused) reloadable flags to default true.

Risk Level: n/a (tooling)
Testing: manual
Docs Changes: n/a
Release Notes: n/a

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Risk Level: n/a
Testing: n/a
Docs Changes: yes
Release Notes: no

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Remove entry from the "initial resource versions" map when the server informs us that the corresponding resource has gone away.

Risk Level: low
#4991

Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Maxime Bedard <maxime.bedard@shopify.com>
…_bug_tracker_impl.h QUICHE platform implementation (#6339)

Add quic_expect_bug_impl.h, (spdy|http2)_logging_impl.h, (spdy|http2)_bug_tracker_impl.h QUICHE platform implementation.

All of them depend on quic_logging_impl.h.

Risk Level: minimum, code not used yet.
Testing:
bazel test test/extensions/quic_listeners/quiche/platform:spdy_platform_test --test_output=all --define quiche=enabled
bazel test test/extensions/quic_listeners/quiche/platform:http2_platform_test --test_output=all --define quiche=enabled
bazel test test/extensions/quic_listeners/quiche/platform:quic_platform_test --test_output=all --define quiche=enabled
bazel test @com_googlesource_quiche//:spdy_platform_test --test_output=all --define quiche=enabled
bazel test @com_googlesource_quiche//:http2_platform_test --test_output=all --define quiche=enabled
bazel test @com_googlesource_quiche//:quic_platform_test --test_output=all --define quiche=enabled

Signed-off-by: Bin Wu <wub@google.com>
We want to limit the number of connection pools per cluster. Add it to
the circuit breaker thresholds so we can do it per priority.

Signed-off-by: Kyle Larose <kyle@agilicus.com>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Part of #5942

Signed-off-by: Matt Klein <mklein@lyft.com>
Part of #6361

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Michael Rebello <mrebello@lyft.com>
Signed-off-by: Derek Argueta <dereka@pinterest.com>
Signed-off-by: Dan Rosen <mergeconflict@google.com>
…ime. (#6369)

* Rework guarddog_impl.cc using timers rather than condvar timed waits.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
Add QuicFileUtilsImpl using Envoy::FileSystem.

Risk Level: low
Testing: Added tests in test/extensions/quic_listeners/quiche/platform/quic_platform_test.cc and tested with --define quiche=enabled
Part of #2557

Signed-off-by: Dan Zhang <danzh@google.com>
Fixing up a TODO: fitting all route config options into one function signature simply doesn't scale, so this refactors things so we don't have functions with infinite arguments.

Risk Level: n/a (test only)
Testing: integration test pass
Docs Changes: n/a
Release Notes: n/a
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
…cation for Any and hosts deprecation for load_assignment (#6368)

Update examples for Struct deprecation for Any

Risk Level: Low - generated configs only, no changes to code
Testing: bazel build //configs:example_configs, bazel test //test/...
Docs Changes: None required
Release Notes: None required

Fixes #6025
Replaces #6356
Related #6346

Signed-off-by: Michael Payne <michael@sooper.org>
Signed-off-by: Harvey Tuch <htuch@google.com>
Support google_default in channel credentials configuration. The documentation mentions this option and yet it's ignored.

Risk Level: Low, the option was seemingly useless/unused. If anybody relies on it doing nothing, they can just unset it.

Testing: Tried running my own envoy, seemed to pick up the credentials pointed to by the GOOGLE_APPLICATION_CREDENTIALS environment variable.

Signed-off-by: qfel <qfel.pl@gmail.com>
This fixes a bug where hosts that were moved between priorities would
not be included in the hosts_added vector, resulting in crashes if the
same host was moved multiple times when used with active health
checking: if a host was moved between priorities twice, it would first
get removed from the health checker, then on the second move the health
checker would crash as it would attempt to remove a host it didn't know
about.

We fix this by explicitly adding the existing host to the list of added
hosts iff the host was previously in a different priority.

Uncovering this bug led to the discovery of a bug in the batch updating
done during EDS: std::set_difference assumes that the provided ranges
are both *sorted*, which is not generally true during this update flow.
This meant that the filtering of hosts that were added/removed did not
work correctly, and would produce inconsistent results depending on the
ordering of the host pointers in the unordered_map.

We fix this by using a standard for loop instead of std::set_difference.
Not only is this more correct, it should also be faster for large sets
as it performs the filtering in O(n) instead of O(n^2).
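
The loop-based filtering described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: it uses plain std::string host identifiers instead of Envoy's host pointers, and the helper name hostsNotIn is hypothetical.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the elements of `candidates` that are absent
// from `existing`, preserving the order of `candidates`. Unlike
// std::set_difference, it places no sortedness requirement on either input.
std::vector<std::string> hostsNotIn(const std::vector<std::string>& candidates,
                                    const std::unordered_set<std::string>& existing) {
  std::vector<std::string> result;
  for (const auto& host : candidates) {
    // Average-case O(1) membership test, so the whole pass is O(n) on average.
    if (existing.count(host) == 0) {
      result.push_back(host);
    }
  }
  return result;
}
```

Because membership is checked against a hash set, the result does not depend on the iteration order of either container.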


Signed-off-by: Snow Pettersen <snowp@squareup.com>
The change breaks the existing Redis operation, for example redis-cli -p
[WHATEVER] GET 1 crashes Envoy.

This reverts commit 046e989.

Signed-off-by: Nicolas Flacco <nflacco@lyft.com>
This allows us to move the new runtime APIs over to string_view without taking a string-serialization performance hit.

See https://abseil.io/docs/cpp/guides/container for flat_hash_map as an unordered_map replacement with heterogeneous lookup for string_view.

Risk Level: Medium (swapping the underlying internals of runtime)
Testing: existing tests pass
Docs Changes: no
Release Notes: no
mpuncel and others added 28 commits April 17, 2019 15:14
Add integration tests around HTTP timeouts in the router filter including per try and global timeout.

Risk Level: Low
Testing: integration tests

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Add support for specifying _stale_after timeout as part of ClusterLoadAssignment

Risk Level: Low
Optional Feature that is triggered by the Management Server. Defaults to noop.
Testing: Unit test
Docs Changes: None
Release Notes: None

Fixes #6420

Signed-off-by: Vishal Powar <vishalpowar@google.com>
Signed-off-by: Matt Klein <mklein@lyft.com>
GitHub was complaining that 2.10 was problematic security wise; I don't
think it's an issue in our environment, but this should make the
warnings go away.

Signed-off-by: Harvey Tuch <htuch@google.com>
Created OpenRCA service proto file based on ORCA design

Risk Level: Low

Signed-off-by: Chengyuan Zhang <chengyuanzhang@google.com>
Default behavior remains unchanged: retries will use the runtime parameter
defaulted to 25ms as the base interval and 250ms as the maximum. Allows
routes to customize the base and maximum intervals.
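
As a rough illustration of how a base interval and a maximum interval interact, here is a minimal sketch that assumes a fully jittered exponential back-off, where the N-th retry waits a random time in [0, (2^N - 1) * base), capped at the maximum. The commit message does not spell out the formula, so treat the formula and the function names backoffUpperBoundMs and nextBackoffMs as assumptions rather than Envoy's implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>

// Exclusive upper bound, in milliseconds, of the back-off window for the
// N-th retry: (2^N - 1) * base_ms, capped at max_ms.
// Assumes base_ms fits in 32 bits so the multiplication cannot overflow.
uint64_t backoffUpperBoundMs(uint32_t retry_number, uint64_t base_ms, uint64_t max_ms) {
  // Clamp the shift so that 2^N stays representable for large retry counts.
  const uint32_t shift = std::min<uint32_t>(retry_number, 32);
  const uint64_t multiplier = (uint64_t{1} << shift) - 1;
  return std::min(multiplier * base_ms, max_ms);
}

// Draws a uniformly distributed delay in [0, bound) for the given retry.
uint64_t nextBackoffMs(uint32_t retry_number, uint64_t base_ms, uint64_t max_ms,
                       std::mt19937_64& rng) {
  const uint64_t bound = backoffUpperBoundMs(retry_number, base_ms, max_ms);
  if (bound == 0) {
    return 0; // Empty window: no delay.
  }
  std::uniform_int_distribution<uint64_t> dist(0, bound - 1);
  return dist(rng);
}
```

With the defaults mentioned above (25ms base, 250ms maximum), the windows under this assumed formula would be 25ms, 75ms and 175ms for the first three retries, and 250ms from the fourth retry onward.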

Risk Level: low (no change to default behavior)
Testing: unit tests
Doc Changes: included, plus updated description of back-off algorithm
Release Notes: added

Signed-off-by: Stephan Zuercher <zuercher@gmail.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Chris Paika <paika.christopher@gmail.com>
@htuch discovered a race condition in my libevent watcher implementation in the process of enabling TSAN for dependencies (#6610). Update libevent to pull in the fix (libevent/libevent#793).

Risk Level: low
Testing: bazel test //test/server:worker_impl_test -c dbg --config=clang-tsan --runs_per_test=1000 (with @htuch's patch applied).

Signed-off-by: Dan Rosen <mergeconflict@google.com>
This change alters the behavior of fault data limiting
by resetting the token bucket to a single token when data
initially starts streaming. This makes sure that data pacing
is as expected, while still allowing per-second bursting if
the data provider is also bursty.
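
The effect of resetting the bucket to a single token can be sketched with a simplified token bucket. This is not Envoy's token bucket class: the class name SimpleTokenBucket, its interface, and the explicit time parameters are assumptions made to keep the example deterministic.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

// Simplified token bucket: starts full, refills continuously at
// `fill_rate_per_sec`, and never holds more than `max_tokens`.
class SimpleTokenBucket {
public:
  using Clock = std::chrono::steady_clock;

  SimpleTokenBucket(uint64_t max_tokens, double fill_rate_per_sec, Clock::time_point now)
      : max_tokens_(static_cast<double>(max_tokens)), fill_rate_(fill_rate_per_sec),
        tokens_(static_cast<double>(max_tokens)), last_fill_(now) {}

  // Discards any accumulated burst: exactly `tokens` are available afterwards.
  void reset(uint64_t tokens, Clock::time_point now) {
    tokens_ = std::min(static_cast<double>(tokens), max_tokens_);
    last_fill_ = now;
  }

  // Refills for the elapsed time, then grants up to `wanted` whole tokens.
  uint64_t consume(uint64_t wanted, Clock::time_point now) {
    const std::chrono::duration<double> elapsed = now - last_fill_;
    tokens_ = std::min(max_tokens_, tokens_ + elapsed.count() * fill_rate_);
    last_fill_ = now;
    const uint64_t granted = std::min<uint64_t>(wanted, static_cast<uint64_t>(tokens_));
    tokens_ -= static_cast<double>(granted);
    return granted;
  }

private:
  const double max_tokens_;
  const double fill_rate_;
  double tokens_;
  Clock::time_point last_fill_;
};
```

In this sketch, calling reset(1, now) when streaming starts means the first consume() grants at most one token, while a caller that has been idle for a full second can still take a full burst.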

Signed-off-by: Matt Klein <mklein@lyft.com>
Signed-off-by: Nicolas Flacco <nflacco@lyft.com>
Signed-off-by: Matt Klein <mklein@lyft.com>
Description: add ppc64le badge that links to Jenkins build server
Risk Level: Low - Docs only
Testing: Viewed in browser and through GH markdown viewer
Docs Changes: N/A
Release Notes: support ppc64le CPU architecture
Fixes: #5196

Signed-off-by: Christopher M. Luciano <cmluciano@us.ibm.com>
)

Description: add formatting for the "response code details" string recently added to the StreamInfo (#6530)
Risk Level: low
Testing: unit tests
Docs Changes: updated
Release Notes: updated

Signed-off-by: Elisha Ziskind <eziskind@google.com>
This test waits for the upstream to see a reset which confirms that the
router filter did the right thing when the global timeout is hit.
However since this involves the network, we would occasionally see the
reset after the wait call. Since we were waiting for 0ms we'd get
flakes. 15s is hopefully high enough that the test will succeed
reliably.

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Signed-off-by: Snow Pettersen <snowp@squareup.com>
…otocol spec (#6545)

I realized that, with the unreliable queue implementation copied from SotW xDS, delta xDS could get into a state where Envoy thinks it has subscribed, but the server hasn't heard the subscription, with no way for either to realize the mistake. I fixed that by converting the queue setup to a cleaner "do I currently want to send a request?" with the request's (un)subscriptions only populated immediately before the request is actually sent into gRPC.

While doing that, I further realized there was a problem when a given resource was subscribed then unsubscribed (or reversed), all in between request sends. I made sure Envoy handles that sensibly, and added explicit requirements to the xDS protocol spec to ensure servers will also handle it sensibly.

Added unit tests for those fixes.

Risk Level: low
Testing: added unit tests for bugs uncovered

#4991

Signed-off-by: Fred Douglas <fredlas@google.com>
Signed-off-by: Derek Schaller <dschaller@lyft.com>
Signed-off-by: Bin Wu <wub@google.com>
Signed-off-by: Derek Schaller <dschaller@lyft.com>
This defers starting the per try timeout timer until onRequestComplete
to ensure that it is not started before the global timeout. This ensures
that the per try timeout will not take into account the time spent
reading the downstream, which should be the responsibility of the HCM-level
timeouts.

Signed-off-by: Snow Pettersen <snowp@squareup.com>
This adds support for modifying the grpc-timeout provided by the
downstream by some offset. This is useful to make sure that Envoy is
able to see timeouts before the gRPC client does, as the client will
cancel the request when the deadline has been exceeded which hides the
timeout from the outlier detector.
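
A minimal sketch of the idea follows, assuming the grpc-timeout wire format from the gRPC HTTP/2 protocol spec (up to 8 ASCII digits followed by one unit character: H, M, S, m, u or n). The function names parseGrpcTimeout and applyTimeoutOffset are hypothetical, and the choice to leave the timeout unchanged when the offset is not smaller than it is an assumption of this sketch, not something stated in the commit message.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>

// Parses a grpc-timeout header value such as "250m" into milliseconds.
// Returns false on malformed input; sub-millisecond units are truncated.
bool parseGrpcTimeout(const std::string& value, std::chrono::milliseconds& out) {
  if (value.size() < 2 || value.size() > 9) {
    return false; // Needs at least one digit plus a unit, at most 8 digits.
  }
  int64_t amount = 0;
  for (std::size_t i = 0; i + 1 < value.size(); ++i) {
    const char c = value[i];
    if (c < '0' || c > '9') {
      return false;
    }
    amount = amount * 10 + (c - '0');
  }
  switch (value.back()) {
  case 'H': out = std::chrono::milliseconds(amount * 3600000); return true;
  case 'M': out = std::chrono::milliseconds(amount * 60000); return true;
  case 'S': out = std::chrono::milliseconds(amount * 1000); return true;
  case 'm': out = std::chrono::milliseconds(amount); return true;
  case 'u': out = std::chrono::milliseconds(amount / 1000); return true;
  case 'n': out = std::chrono::milliseconds(amount / 1000000); return true;
  default: return false;
  }
}

// Shortens the downstream-provided timeout by `offset` so the proxy observes
// the timeout before the client cancels. Assumption: if the offset is not
// smaller than the timeout, the timeout is left unchanged.
std::chrono::milliseconds applyTimeoutOffset(std::chrono::milliseconds timeout,
                                             std::chrono::milliseconds offset) {
  return offset < timeout ? timeout - offset : timeout;
}
```

For example, a downstream value of "2S" with a 50ms offset would yield an effective timeout of 1950ms in this sketch.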

Signed-off-by: Snow Pettersen <snowp@squareup.com>
It is no longer needed since Api::Api is plumbed ubiquitously throughout Envoy's core.

The only user of the factory, QuicThreadImpl, has been modified to take the
Envoy::Thread::ThreadFactory via QuicThreadImpl::setThreadFactory().

Signed-off-by: Andres Guedez <aguedez@google.com>
Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
This PR moves the xds protocol from md to rst.

Risk Level: Low
Testing: N/A
Docs Changes: N/A
Release Notes: N/A

Fixes #6338

Signed-off-by: Rama Chavali <rama.rao@salesforce.com>
* Adds SharedStatNameStorageSet.

Signed-off-by: Joshua Marantz <jmarantz@google.com>
@larrywest larrywest merged commit 989ae8e into larrywest:master Apr 23, 2019
@KirstieJane

I think those instructions might be the most useful thing I’ve ever written 😂 Glad you found them! 💖
