
[chttp2] Alternative protection for too many streams in the system#34697

Merged
ctiller merged 4 commits into grpc:master from ctiller:stream-count
Oct 19, 2023

Conversation

@ctiller
Member

@ctiller ctiller commented Oct 15, 2023

Cancel streams if we have too much concurrency on a single channel to allow that server to recover.

There seems to be a convergence in the HTTP2 community about this being a reasonable thing to do, so I'd like to try it in some real scenarios.

If this pans out well, I'll likely drop the `red_max_concurrent_streams` and `rstpit` experiments in favor of this.

I'm also considering tying in resource quota so that under high memory pressure we just default to this path.
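The mechanism described above can be sketched roughly as follows. This is an illustrative, simplified version under assumed names (`kMaxAllocatedStreams`, `StreamGuard::TryAllocateStream`, `ReleaseStream` are hypothetical, not the actual gRPC identifiers): a per-transport atomic counter is checked before constructing a stream, and new streams past the limit are refused so the caller can cancel them.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical limit on live stream objects per transport (not the real
// gRPC constant; the actual limit would come from channel configuration).
constexpr size_t kMaxAllocatedStreams = 100;

class StreamGuard {
 public:
  // Try to account for one more stream on this transport. Returns false
  // if the transport is already at its limit, in which case the caller
  // should cancel the new stream rather than allocate it.
  static bool TryAllocateStream(std::atomic<size_t>& streams_allocated) {
    size_t current = streams_allocated.load(std::memory_order_relaxed);
    while (current < kMaxAllocatedStreams) {
      // CAS loop: reserve a slot only if the count has not raced past
      // the limit since we loaded it.
      if (streams_allocated.compare_exchange_weak(
              current, current + 1, std::memory_order_acq_rel)) {
        return true;  // slot reserved
      }
      // `current` was refreshed by compare_exchange_weak; retry.
    }
    return false;  // too many streams: signal the caller to cancel
  }

  // Release the slot when the stream object is destroyed.
  static void ReleaseStream(std::atomic<size_t>& streams_allocated) {
    streams_allocated.fetch_sub(1, std::memory_order_acq_rel);
  }
};
```

The CAS loop (rather than a plain `fetch_add` followed by a check) avoids ever letting the counter exceed the limit, which keeps the accounting exact even when many streams race to allocate at once.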

@ctiller ctiller requested review from yashykt and removed request for gnossen, jtattermusch and veblush October 15, 2023 15:59
@ctiller ctiller added the release notes: yes label Oct 15, 2023
/// timeout
grpc_core::Duration keepalive_timeout;
/// number of stream objects currently allocated by this transport
std::atomic<size_t> streams_allocated{0};
Member

does this need to be atomic?

Member Author

init_stream (which is the sole caller of the grpc_chttp2_stream constructor) is called outside the combiner.

To keep this code simple I don't want to reason about which path we're taking to get to the constructor (since it's weirdly different between client & server, etc.) - doing so would only lead to bugs (we don't want a security problem because we accidentally decremented too many times).

Other places we use similar atomics (transport level):

  • every transport op
  • every time a stream is created or destroyed
  • every time a stream is removed
  • every ping
  • every read
  • every write

Also at the stream level:

  • every time a stream becomes writable
  • every stream op (at least 2x)

This is not including any of the refcounting that happens external to the transport.

The atomic is not a concern.
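The safety argument above (pair every increment with exactly one decrement, regardless of which path constructed the stream) is naturally expressed as RAII. A minimal sketch, assuming a hypothetical `CountedStream` wrapper rather than the actual `grpc_chttp2_stream` code:

```cpp
#include <atomic>
#include <cstddef>

// Illustrative only: tie the increment/decrement of a streams-allocated
// counter to the stream object's constructor and destructor, so the
// counter can never be decremented more times than it was incremented,
// no matter which client/server code path created the stream.
class CountedStream {
 public:
  explicit CountedStream(std::atomic<size_t>& counter) : counter_(counter) {
    counter_.fetch_add(1, std::memory_order_relaxed);
  }
  ~CountedStream() { counter_.fetch_sub(1, std::memory_order_relaxed); }

  // Copying would break the one-increment-per-object invariant.
  CountedStream(const CountedStream&) = delete;
  CountedStream& operator=(const CountedStream&) = delete;

 private:
  std::atomic<size_t>& counter_;
};
```

Because construction happens outside the combiner, the counter must be atomic; relaxed ordering suffices here since the count is only an approximate load signal, not a synchronization point.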

Member

Can you point me to these atomics? I don't spot them immediately.

I still think we shouldn't be using atomics unnecessarily, but given that the transport is going through overhaul, I'm ok with this for now.

@ctiller ctiller merged commit e81d181 into grpc:master Oct 19, 2023
@copybara-service copybara-service bot added the imported label Oct 19, 2023
ctiller added a commit to ctiller/grpc that referenced this pull request Oct 20, 2023

Labels

bloat/none, imported, lang/core, per-call-memory/neutral, per-channel-memory/neutral, release notes: yes
