
conn_pool: track streams across the pool#13684

Merged
alyssawilk merged 9 commits into envoyproxy:master from alyssawilk:next_peekahead
Nov 18, 2020

Conversation

@alyssawilk
Contributor

Tracking active, pending, and available capacity for each thread local cluster, for eventual use in cluster-wide prefetch

Risk Level: low
Testing: new unit tests
Docs Changes: n/a
Release Notes: n/a

@alyssawilk
Contributor Author

I had originally planned to land this with its use in source/common/upstream/cluster_manager_impl.cc, but that was a large and fragile enough change that I'd prefer to land them separately (unless you prefer otherwise).

I really don't like how fragile the stream tracking is, but there's a weak link between the codec's streams and the connection pool's streams, which have different lifetimes, and I'm not sure how to address that better than I have here.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Contributor

@antoniovicente left a comment


> I had originally planned to land this with the use in source/common/upstream/cluster_manager_impl.cc but it was a large and fragile enough change I think I'd prefer them to be separate (unless you prefer otherwise)
>
> I really don't like how fragile the stream tracking is, but there's a weak link between the codec's streams and the connection pool's streams with different lifetimes, which I'm not sure how to address better than I have here.

I did a review pass. The main thing that jumped out at me is that concurrent_stream_limit can be set to uint64_t max, which may cause some of the tracking counters to behave erratically.

ASSERT(connecting_capacity_ == 0);
}
void checkAndDecrement(uint32_t& value, uint32_t& delta) {
ASSERT(value - delta <= value);
Contributor


Would this be equivalent?
ASSERT(delta <= value);
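For context, the two forms are indeed interchangeable for unsigned types: when delta > value, the subtraction wraps around and produces a result larger than value. A minimal standalone sketch (illustrative names, not the Envoy code):

```cpp
#include <cassert>
#include <cstdint>

// For unsigned arithmetic, value - delta wraps around when delta > value,
// making the wrapped result larger than value. So the original assertion
// and the simpler one reject exactly the same inputs.
bool wrapCheck(uint32_t value, uint32_t delta) { return value - delta <= value; }
bool directCheck(uint32_t value, uint32_t delta) { return delta <= value; }
```

Either check rejects a decrement that would underflow; the direct form just states the precondition more plainly.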

ASSERT(active_streams_ == 0);
ASSERT(connecting_capacity_ == 0);
}
void checkAndDecrement(uint32_t& value, uint32_t& delta) {
Contributor


nit: delta does not need to be a non-const reference. Change to "uint32_t delta"

value -= delta;
}

void incrPendingStreams(uint32_t delta) { pending_streams_ += delta; }
Contributor


I wonder if a mismatch in inc/dec could result in some of these counters overflowing because of missing decrements. I think that the protection you have against this is the ASSERTs that require counters to be 0 at destruction time.

Contributor Author


and the decrement asserts, which caught a lot of issues earlier on.

Contributor


Can we ASSERT that we don't overflow on increment also?
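A hypothetical sketch of the overflow-checked increment the comment asks for, mirroring the PR's checkAndDecrement; this is not the merged code:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Overflow-checked increment: the check is rearranged so that the
// comparison itself cannot overflow before the addition happens.
template <class T> void checkAndIncrement(T& value, T delta) {
  assert(std::numeric_limits<T>::max() - value >= delta);
  value += delta;
}

// Small demonstration helper (illustrative only).
uint32_t incrementedFrom(uint32_t start, uint32_t delta) {
  checkAndIncrement(start, delta);
  return start;
}
```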

client->effectiveConcurrentStreamLimit());
ASSERT(client->real_host_description_);
// Increase the connecting capacity to reflect the streams this connection can serve.
state_.incrConnectingCapacity(client->effectiveConcurrentStreamLimit());
Contributor


Is there a limit on the configured concurrency per H2 connection? These uint32 counters could overflow if the configured concurrency limit is close to uint32 max and we end up creating multiple connections to backends for the cluster. Note that effectiveConcurrentStreamLimit can return uint64_t max when the concurrency limit is set to 0. See

concurrent_stream_limit_(translateZeroToUnlimited(concurrent_stream_limit)),
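A sketch of the zero-means-unlimited translation under discussion (a plausible reading of the quoted constructor line, not a copy of the Envoy source): a configured limit of 0 becomes uint64_t max, which is why narrower downstream counters can misbehave if they absorb the value directly.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Zero is the config sentinel for "unlimited"; translate it to the widest
// representable value so arithmetic downstream can treat it uniformly.
uint64_t translateZeroToUnlimited(uint64_t limit) {
  return limit != 0 ? limit : std::numeric_limits<uint64_t>::max();
}
```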

Contributor Author


Yeah, in practice these are set by config, and

google.protobuf.UInt32Value max_concurrent_streams = 2
[(validate.rules).uint32 = {lte: 2147483647 gte: 1}];

so they're functionally capped at uint32. Would you prefer I make all the base class types uint32_t, and the new types uint64 for overflow safety?

Contributor Author


ping?

Contributor

@antoniovicente Nov 4, 2020


Sorry, I didn't realize that you had replied to some comments until now. I should have tagged the PR with waiting:any instead of waiting.

What if someone does not set config and ends up with the default concurrent_stream_limit of 0?
I did some tracing through the H2 implementation and found that initializeAndValidateOptions does translate a config value of 0 to 2147483647 as documented. It seems to me that concurrent_stream_limit should never be 0 when it reaches this constructor.

Still, connecting_capacity_ is a uint32_t. If there are more than 2 endpoints in the cluster, I think this will overflow when using the H2 default of 2**31-1, since I think these stats are per cluster, not per host/endpoint.
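The arithmetic behind this concern, as a standalone sketch (function name is illustrative): with the H2 default limit of 2^31 - 1 per connection, three or more connections exceed what a uint32_t cluster-wide counter can hold, while widening to uint64_t before multiplying avoids the overflow.

```cpp
#include <cassert>
#include <cstdint>

// Sum per-connection stream capacity across connections, widening to
// uint64_t first so the multiplication cannot wrap.
uint64_t totalCapacity(uint32_t per_connection_limit, uint32_t connections) {
  return static_cast<uint64_t>(per_connection_limit) * connections;
}
```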

}

uint64_t currentUnusedCapacity() const {
return std::min(remaining_streams_, concurrent_stream_limit_ - numActiveStreams());
Contributor


Consider adding: ASSERT(numActiveStreams() <= concurrent_stream_limit_);

parent, parent.host_->cluster().maxRequestsPerConnection(),
1 // HTTP1 always has a concurrent-request-limit of 1 per connection.
) {
codec_client_->setCodecClientCallbacks(*this);
Contributor


Interesting, these callbacks were not set for H1 until now.

std::accumulate(per_priority_load.degraded_priority_load_.get().begin(),
per_priority_load.degraded_priority_load_.get().end(), 0));

total_healthy_hosts = 0;
Contributor


This function contains an early return on line 195; total_healthy_hosts_ is not initialized on the early-return path.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Contributor Author

I think I've addressed the overflow concerns. PTAL?

@alyssawilk
Contributor Author

ping :-)

antoniovicente previously approved these changes Nov 16, 2020
value += delta;
}

void incrPendingStreams(uint32_t delta) { checkAndIncrement<uint32_t>(pending_streams_, delta); }
Contributor


optional: you can omit the template type on these template function calls and rely on the compiler doing type deduction.
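An illustration of this nit in isolation (names mirror the quoted line, but the code is a standalone sketch, not the merged change): both calls below compile to the same instantiation, so the explicit template argument is redundant.

```cpp
#include <cstdint>

// Both arguments already pin T, so explicit <uint32_t> adds nothing.
template <class T> void checkAndIncrement(T& value, T delta) { value += delta; }

uint32_t demo() {
  uint32_t v = 0;
  checkAndIncrement<uint32_t>(v, 2); // explicit, as in the quoted line
  checkAndIncrement(v, 3u);          // deduced: T = uint32_t on common platforms
  return v;
}
```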

Member

@mattklein123 left a comment


At a high level this LGTM.

I do have one question though: are we duplicating information that we already have in metrics? There are two parts to this question:

  1. On one hand, we are trying to not drive functionality through stats because it leads to bugs when stats get excluded.
  2. On the other hand we might be duplicating accounting logic and then not have visibility into certain metrics.

To account for (2) we have discussed potentially allowing stats to be marked as not excludable if we know they are used in critical logic. So my question is did you consider using stats for this and decide not to because of potential exclusion bugs? cc @jmarantz

/wait-any

@jmarantz
Contributor

Having some stats be force-included independent of the disallow patterns makes sense to me. Is there an open bug for that?

@mattklein123
Member

> Having some stats be force-included independent of the disallow patterns makes sense to me. Is there an open bug for that?

I don't think it's tracked yet, but it does keep coming up and it seems like we might want that capability.

@alyssawilk
Contributor Author

There may be overlap with stats, but I don't think we can use stats here, because prefetch data needs to be per-worker: the connections are only available locally (until/if we have cross-thread pools).

@mattklein123
Member

> There may be overlap with stats, but I don't think we can use stats because prefetch data needs to be per-worker since the connections are only available locally (until/if we have cross-thread pools)

OK that makes sense. I will take another quick pass today.

Member

@mattklein123 left a comment


At a high level this looks good, with some naming nits. This code is all tricky, so IMO it would be better to split the accounting changes out from the other changes, which seem unrelated; then we can review the integration separately. Up to you.

/wait

uint32_t active_streams_{};
// Tracks the stream capacity if all connecting connections were connected but
// excluding streams which are in use.
uint64_t connecting_capacity_{};
Member


nit: can you name this connecting_stream_capacity_ or connection_stream_capacity_ and similar for the accessor methods? I was confused when I first started to look at the code below.

Comment on lines +109 to +110
// Tracks the stream capacity if all connecting connections were connected but
// excluding streams which are in use.
Member


nit: this is a bit hard to parse. Can you clarify a bit? This is basically: total theoretical capacity - pending - active?
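This reading of the comment, expressed as arithmetic, would be (a hypothetical sketch of the question, not a statement of what the merged code computes):

```cpp
#include <cassert>
#include <cstdint>

// One interpretation of the comment being discussed: capacity that would
// exist if every connecting connection finished its handshake, minus
// streams already claimed (pending) or in use (active).
uint64_t connectingStreamCapacity(uint64_t total_theoretical, uint64_t pending,
                                  uint64_t active) {
  return total_theoretical - pending - active;
}
```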


void ClusterManagerImpl::maybePrefetch(
ThreadLocalClusterManagerImpl::ClusterEntryPtr& cluster_entry,
ThreadLocalClusterManagerImpl::ClusterEntryPtr& cluster_entry, const ClusterConnectivityState&,
Member


It might be better to split this part of the change out from the accounting work but up to you.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Member

@mattklein123 left a comment


Thanks LGTM with one other potential question/comment.

/wait-any

Comment on lines +158 to +159
// The total count of healthy hosts across all priority levels.
uint32_t total_healthy_hosts_;
Member


I think it would be better to revert all of this also as unrelated. If not reverted, does this need to be zero initialized somewhere?
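If it stays, the zero initialization the reviewer asks about can be done in-class with a brace initializer; a minimal sketch (struct wrapper is illustrative):

```cpp
#include <cassert>
#include <cstdint>

// The default member initializer guarantees a defined value even on code
// paths (like the early return mentioned above) that never assign it.
struct HealthSketch {
  uint32_t total_healthy_hosts_{0};
};
```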

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
Member

@mattklein123 left a comment


Nice, thanks.

@alyssawilk alyssawilk merged commit 8d62990 into envoyproxy:master Nov 18, 2020
andreyprezotto pushed a commit to andreyprezotto/envoy that referenced this pull request Nov 24, 2020
qqustc pushed a commit to qqustc/envoy that referenced this pull request Nov 24, 2020
@alyssawilk alyssawilk deleted the next_peekahead branch June 10, 2021 13:43