Skip to content

Reduce simple HTTP/2 post app allocation by ~40%#32557

Merged
stephentoub merged 12 commits intodotnet:masterfrom
stephentoub:httpalloc2
Feb 20, 2020
Merged

Reduce simple HTTP/2 post app allocation by ~40%#32557
stephentoub merged 12 commits intodotnet:masterfrom
stephentoub:httpalloc2

Conversation

@stephentoub
Copy link
Member

Comparing master (which includes #32406) to this PR, running this simple app:

using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        var client = new HttpMessageInvoker(new SocketsHttpHandler() { UseCookies = false, AllowAutoRedirect = false });
        var request = new HttpRequestMessage(HttpMethod.Post, "https://httpbin.org/post") { Version = HttpVersion.Version20, Content = new ByteArrayContent(new byte[100_000]) };

        for (int i = 0; i < 1000; i++)
        {
            using (HttpResponseMessage r = await client.SendAsync(request, default))
            using (Stream s = await r.Content.ReadAsStreamAsync())
            {
                await s.CopyToAsync(Stream.Null);
            }
        }
    }
}

reduces number of allocations by 39% and total size of allocations by 37%.

image

A good chunk of this is specific to HTTP/2, but one of the changes is in SslStream.ReadAsync and thus accrues to other use of SslStream, HttpClient and otherwise.

cc: @scalablecory, @davidsh, @wfurt, @JamesNK

We don't need to allocate a linked token source in SendRequestBodyAsync if the caller's token is the default
We're carrying around an extra 24-bytes for a `ReadOnlyMemory<byte>`, when we could instead just use the argument.
If `value` happens to be empty, this will avoid an allocation.  But what actually led me to do this was just tightening up the code.
The current structure is that ReadAsync makes two calls to FillBufferAsync, one to ensure the frame header is read and another to ensure any additional payload is read.  This has two issues:
1. It ensures that in addition to allocating a state machine for FillBufferAsync (or, rather, a helper it uses) when it needs to yield, it'll also end up allocating for ReadAsync.
2. It complicates error handling, which needs to differentiate whether the first read can't get any bytes or whether a subsequent read can't, which necessitates storing state like how many bytes we initially had buffered so we can compare to that to see if we need to throw.

We can instead:
- Make FillBufferAsync into a simple "read until we get the requested number of bytes" loop and throw if it fails to do so.
- Do the initial read in ReadAsync, thereby allowing us to special-case the first read for both error handling and to minimize the chances that the helper call needs to yield.

This eliminates a bunch of FillBufferAsync state machines and also decreases the size of the state machines when they are needed.
This has a variety of benefits:
- We no longer need to allocate a `Queue<Waiter>` and its underlying `Waiter[]`.
- We no longer need to allocate a `TaskCompletionSource<int>` and its `Task<int>`, instead creating a single `IValueTaskSource<T>` implementation.
- For non-cancelable waiters, we can specialize to not need to carry around a meaningless CancellationToken field.
- For cancelable waiters (the common case), we can avoid an entire async method and its state machine by just storing the relevant state onto the waiter itself.
It's not that much more code to just manually inline EnsureIncomingBytesAsync into the three places it's used, and doing so has multiple benefits, both for size and for error messages.
@stephentoub stephentoub requested a review from a team February 19, 2020 20:28
@davidsh davidsh added this to the 5.0 milestone Feb 19, 2020
stephentoub and others added 2 commits February 19, 2020 20:13
…andler/Http2Connection.cs

Co-Authored-By: Cory Nelson <phrosty@gmail.com>
@stephentoub stephentoub merged commit e9853d4 into dotnet:master Feb 20, 2020
@stephentoub stephentoub deleted the httpalloc2 branch February 20, 2020 14:28
@stephentoub
Copy link
Member Author

@JamesNK, I'm curious to see if this and my previous PRs move the needle at all on your benchmarks. If they do, I can put in some more time to go further. If they don't, it's an indication that additional effort won't be worthwhile. Can you help me with that? What would you need?

@ghost ghost locked as resolved and limited conversation to collaborators Dec 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants