Skip to content

http: fixing upstream stream limit bug#14820

Merged
alyssawilk merged 19 commits intoenvoyproxy:mainfrom
alyssawilk:stream_limit
Mar 9, 2021
Merged

http: fixing upstream stream limit bug#14820
alyssawilk merged 19 commits intoenvoyproxy:mainfrom
alyssawilk:stream_limit

Conversation

@alyssawilk
Copy link
Copy Markdown
Contributor

Fixing a bug where the Envoy upstream logic didn't respect stream limits capped by upstream.

Risk Level: Medium (it's complicated code)
Testing: new unit, integration tests
Docs Changes: n/a
Release Notes: inline
Runtime guard: envoy.reloadable_features.improved_stream_limit_handling
Fixes #13933

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Copy link
Copy Markdown
Contributor

@yanavlasov yanavlasov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial nits. Will review the rest later.

};

/**
* A class for sharing what HTTP/1 SETTINGS were received from the peer.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTP/2

//
// Note that if more HTTP/2 streams have been established than are allowed by
// a late-received SETTINGS frame, this MAY BE NEGATIVE.
int64_t connecting_stream_capacity_{};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a bit of gotcha here. The reason checkAnd... methods work correctly is because connecting_stream_capacity_ is 64 bit. If it was 32 bit the arithmetic would be unsigned and ASSERTs would fire. Not sure what to do here. Maybe leave a comment that it needs to be 64 bit. Up to you.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is easy to overflow capacity when talking to a server that sets the number of streams to UINT32_MAX and the proxy ends up creating multiple upstream connections for that cluster.

@yanavlasov yanavlasov self-assigned this Jan 26, 2021
/**
* A class for sharing what HTTP/1 SETTINGS were received from the peer.
*/
class ReceivedSettings {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not look like you are actually gaining anything by having this interface. In tests you can just construct ReceivedSettingImpl. Might be worthwhile to make it a concrete class without the need for the interface, since it only carries state and no behavior.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I'm not sure what you mean by this. The codec interface onSettings needs some way to describe what is passed. Are you suggesting we forward declare ReceivedSettingsImpl there? That doesn't seem consistent with general Envoy API policy.

// Returns the number of active streams on this connection.
virtual uint32_t numActiveStreams() const PURE;

virtual bool hadNegativeDeltaOnStreamClosed() { return false; }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe leave a comment that h2 pool overrides this and h1 and tcp clients can not have negative concurrent stream count.

@yanavlasov
Copy link
Copy Markdown
Contributor

Thanks, integration test explains this change really well.
/wait

Copy link
Copy Markdown
Contributor

@antoniovicente antoniovicente left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this important fix to handling of H2 SETTINGs on upstream connections and sorry for delay in review.


// Settings frame reducing capacity to one stream per connection results in -1 capacity.
NiceMock<MockReceivedSettings> settings;
settings.max_concurrent_streams_ = 1;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would consider:
MockReceivedSettings settings;
EXPECT_CALL(settings, max_concurrent_streams()).WillOnce(Return(1));

But use of max_concurrent_streams_ seems fine too. I'm not seeing tests that don't set max_concurrent_streams_ and exercise the case where the max_concurrent_streams setting is missing.


template <class T> void checkAndIncrement(T& value, uint32_t delta) {
ASSERT(std::numeric_limits<T>::max() - value > delta);
ASSERT(value + delta >= value);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These overflow checks may depend on undefined overflow behavior if T is a signed integer type. ASAN may complain if a test were to trigger the ASSERT. Consider instead:

ASSERT(std::numeric_limits<T>::max() - delta > value);

}
void decrConnectingStreamCapacity(uint32_t delta) {
checkAndDecrement<uint64_t>(connecting_stream_capacity_, delta);
checkAndDecrement<int64_t>(connecting_stream_capacity_, delta);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: You can omit the template argument on this call. The compiler will deduce the right template argument from the type of the first argument.

checkAndDecrement(connecting_stream_capacity_, delta);


class ReceivedSettingsImpl : public ReceivedSettings {
public:
ReceivedSettingsImpl(const nghttp2_settings& settings);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Consider marking this constructor explicit.

}
bool hasPendingStreams() const { return !pending_streams_.empty(); }

void decrClusterStreamCapacity(int32_t delta) { state_.decrConnectingStreamCapacity(delta); }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signed to unsigned conversion since decrConnectingStreamCapacity takes an uint32_t. Should this method take a uint32_t delta?

// Settings frame reducing capacity to one stream per connection results in -1 capacity.
NiceMock<MockReceivedSettings> settings;
settings.max_concurrent_streams_ = 1;
test_clients_[0].codec_client_->onSettings(settings);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verify that a second onSettings call with max_concurrent_streams_ == 1; does not decrease capacity a second time.
Coverage for the case where settings increases max_concurrent_streams_; either above the max concurrency configured at Envoy or the limit set by a previous SETTINGs frame on the same upstream connection.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
API shepherd assignee is @htuch
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #14820 was synchronize by alyssawilk.

see: more, trace.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm api

@yanavlasov
Copy link
Copy Markdown
Contributor

It needs main merge to unblock CI.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Copy link
Copy Markdown
Contributor Author

/wait

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Copy link
Copy Markdown
Contributor Author

I believe comments are addressed. PTAL when you get a chance!

@alyssawilk alyssawilk force-pushed the stream_limit branch 2 times, most recently from 7564935 to 4965ca4 Compare February 22, 2021 15:19
yanavlasov
yanavlasov previously approved these changes Feb 24, 2021
@alyssawilk
Copy link
Copy Markdown
Contributor Author

@htuch can I get a fresh API stamp?

Copy link
Copy Markdown
Contributor

@antoniovicente antoniovicente left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good, just some minor comments.


template <class T> void checkAndDecrement(T& value, uint32_t delta) {
ASSERT(delta <= value);
ASSERT(value - delta <= value);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for my comment is that signed integer overflow is undefined behavior according to the C++ standard. The ASSERT condition you have is only well defined for unsigned integer types.

connecting_stream_capacity_ has a signed integer type.

bool hadNegativeDeltaOnStreamClosed() override {
int ret = negative_capacity_ != 0;
if (negative_capacity_ > 0) {
negative_capacity_--;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither the comment in the base class method nor the function name implies that this method has side effects. Would it be possible to document this interface method?

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm api

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
antoniovicente
antoniovicente previously approved these changes Mar 4, 2021
Copy link
Copy Markdown
Contributor

@antoniovicente antoniovicente left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk alyssawilk merged commit 06d0780 into envoyproxy:main Mar 9, 2021
yanavlasov added a commit to yanavlasov/envoy that referenced this pull request Jul 19, 2021
Fixing a bug where the Envoy upstream logic didn't respect stream limits capped by upstream.
Backport of 06d0780 into v1.17

Signed-off-by: Yan Avlasov <yavlasov@google.com>
@alyssawilk alyssawilk deleted the stream_limit branch August 18, 2021 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Envoy ignores H/2 max concurrent streams advertised by peers

4 participants