Following on from this question: https://stackoverflow.com/questions/63286981/gcp-pubsub-why-does-publishing-200k-messages-rapidly-result-in-2-5-million-mess
If I've understood the issue correctly: when we throw lots of messages at the client/SDK rapidly, it correctly splits them into batches, but it does not limit the number of simultaneous gRPC publish requests that are made.
These requests are (as far as I can tell) going over a single HTTP/2 connection, which in Node.js is limited to 100 concurrent streams by default:
> `peerMaxConcurrentStreams`: Sets the maximum number of concurrent streams for the remote peer as if a SETTINGS frame had been received. Will be overridden if the remote peer sets its own value for `maxConcurrentStreams`. Default: `100`.
ref: https://nodejs.org/api/http2.html
As far as I can tell, in the case of Pub/Sub we are either:
- taking this default of 100,
- reducing the value to 1/2 (based on `rg -uu max_concurrent_streams`), or
- accepting a value set by the server side that is unknown to me.
I believe that by issuing so many simultaneous requests, they get gummed up, exceed the deadline (or hit some other failure condition), and are automatically retried per the retry behaviour specified here: https://github.com/googleapis/nodejs-pubsub/blob/master/src/v1/publisher_client_config.json#L13-L21
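For context, gax-style client configs express that retry behaviour roughly like this (the field names are the standard gax ones; the values below are illustrative, not copied from the linked file). A publish whose RPC exceeds the timeout is retried with exponential backoff until the total timeout is exhausted, which would multiply the request count exactly as described:

```json
{
  "retry_params": {
    "default": {
      "initial_retry_delay_millis": 100,
      "retry_delay_multiplier": 1.3,
      "max_retry_delay_millis": 60000,
      "initial_rpc_timeout_millis": 60000,
      "rpc_timeout_multiplier": 1.0,
      "max_rpc_timeout_millis": 60000,
      "total_timeout_millis": 600000
    }
  }
}
```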
I'm not very familiar with the codebase, especially the google-gax / grpc-js parts, so I'm hoping someone with deeper knowledge can confirm this theory.
I think the problem can be solved by doing something along the lines of this:
https://github.com/googleapis/nodejs-pubsub/compare/master...mnahkies:mn/limit-num-of-inflight-publish-requests?expand=1
I haven't tested that solution yet, but I plan to, and if it fits with the project direction I can refine it, write unit tests, etc.