Description
Describe the incorrect behavior you saw
Doing lots of small writes to a TLS transport uses far more CPU and bandwidth than sending the same data in larger writes. The impact is much more significant with TLS than with plain TCP: TCP usually combines small writes thanks to Nagle's algorithm, so it would not normally show anything like a 23× bandwidth increase.
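A rough back-of-the-envelope sketch of where the bandwidth overhead could come from, assuming each small write becomes its own TLS record and assuming TLS 1.3 with an AEAD cipher that uses a 16-byte tag (the upstream issue does not necessarily use this configuration; the constants and function names below are illustrative, not taken from pyOpenSSL or Twisted):

```python
# Assumed TLS 1.3 record layout with an AEAD cipher (e.g. AES-GCM).
RECORD_HEADER = 5        # outer content type (1) + legacy version (2) + length (2)
INNER_CONTENT_TYPE = 1   # TLS 1.3 appends the real content type to the plaintext
AEAD_TAG = 16            # authentication tag appended to the ciphertext


def record_size(payload_len: int) -> int:
    """Bytes on the wire for one record carrying payload_len bytes (no padding)."""
    return RECORD_HEADER + payload_len + INNER_CONTENT_TYPE + AEAD_TAG


def total_wire_bytes(write_sizes: list[int]) -> int:
    """Total bytes if every write is sent as its own TLS record."""
    return sum(record_size(n) for n in write_sizes)


# 1000 bytes sent one byte at a time vs. in a single write
# (1000 bytes fits well within the 16384-byte record plaintext limit).
chatty = total_wire_bytes([1] * 1000)
batched = total_wire_bytes([1000])
```

Under these assumptions a one-byte write costs 23 bytes at the TLS layer, which is at least consistent with the reported figure; the upstream issue remains the authoritative source for the measured numbers.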
Describe how to cause this behavior
See upstream pyOpenSSL issue for reproducer and numbers (10× CPU and 23× bandwidth): pyca/pyopenssl#1250
I discovered this in the wild in Foolscap, which writes many things one byte at a time... with code basically copy/pasted from Twisted: https://github.com/twisted/twisted/blob/trunk/src/twisted/spread/banana.py#L31 and later code.
Describe the correct behavior you'd like to see
Fixing this in pyOpenSSL is likely difficult; there is the upstream bug but it's not clear to me an extra buffering layer there makes sense. In contrast, twisted/protocols/tls.py already has a buffering layer, albeit for other use cases (wantread/wantwrite errors from OpenSSL).
The solution I'm imagining is having the TLS protocol register as a pull producer with the underlying TCP transport and aggregate small writes until it gets a pull message.
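A minimal sketch of the aggregation idea, written in plain Python rather than against Twisted's real classes. `AggregatingWriter` and its `flush()` method are hypothetical names; `flush()` stands in for the pull signal (in Twisted, a pull producer's `resumeProducing()` being called by the transport), and `downstream_write` stands in for whatever hands plaintext to the TLS connection:

```python
from typing import Callable, List


class AggregatingWriter:
    """Hypothetical helper: collect small writes, emit them as one chunk."""

    def __init__(self, downstream_write: Callable[[bytes], None]) -> None:
        self._downstream_write = downstream_write
        self._pending: List[bytes] = []

    def write(self, data: bytes) -> None:
        # Only buffer here; nothing is encrypted or sent yet.
        if data:
            self._pending.append(data)

    def flush(self) -> None:
        # Stand-in for the "pull" message: the lower layer is ready for
        # more data, so hand over everything buffered so far in one call.
        if not self._pending:
            return
        chunk = b"".join(self._pending)
        self._pending.clear()
        self._downstream_write(chunk)


# Usage: five one-byte writes reach the downstream as a single chunk.
chunks: List[bytes] = []
writer = AggregatingWriter(chunks.append)
for byte in b"hello":
    writer.write(bytes([byte]))
writer.flush()
```

With this shape, the number of downstream writes (and so, presumably, TLS records) is driven by how often the lower layer asks for data rather than by how chatty the application protocol is. How this would interact with the existing buffering in twisted/protocols/tls.py is not addressed by the sketch.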
Another alternative is to just document this behavior, and fix upstream protocol implementations to be less chatty.
Testing environment
- Twisted checkout as of Sep 12, 2023
- Ubuntu 22.04