Optimise push in S3 driver #4066
Conversation
Force-pushed from bf3d505 to e49e3a1
corhere left a comment
I feel like I'm missing something. What's the point of double-buffering if the flush operation is blocking? Is a small final chunk really that big of a deal that it's worth doubling memory usage to avoid it?
Why can't the write loop be as simple as this? Copy to buffer until full, flush buffer, repeat until there is nothing left to copy.
var n int
defer func() { w.size += int64(n) }()
reader := bytes.NewReader(p)
for reader.Len() > 0 {
	m, _ := w.buf.ReadFrom(reader)
	n += int(m)
	if w.buf.Available() == 0 {
		if err := w.flushPart(); err != nil {
			return n, err
		}
	}
}
return n, nil
registry/storage/driver/s3-aws/s3.go (Outdated)
	closed    bool
	committed bool
	cancelled bool
	ctx       context.Context
Is passing a context into the S3 API methods necessary for optimizing push? Storing a context in a struct is discouraged as it obfuscates the lifetime of the operation that the context is being applied to. I am not totally against doing so (io.WriteCloser falls within the exception criteria) but I would much prefer to have that discussion in its own PR.
Yes, I am aware of this indeed, and yes, it's sucky; the reason the context is a writer field is that the writer makes a bunch of S3 API calls that require the context to be passed to them (unless we opt not to use the *WithContext() SDK methods, at the expense of not being able to propagate cancellations from the driver calls).
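For context, here is a minimal sketch of the pattern being debated, assuming aws-sdk-go v1's *WithContext call style; the struct fields and the abortUpload method are illustrative stand-ins, not the PR's actual code:

```go
package main

import (
	"context"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

// writer captures the context it was constructed with and reuses it for every
// SDK call it makes, which ties the context's lifetime to the writer rather
// than to an individual call.
type writer struct {
	ctx    context.Context
	client *s3.S3
	bucket string
	key    string
}

func newWriter(ctx context.Context, client *s3.S3, bucket, key string) *writer {
	return &writer{ctx: ctx, client: client, bucket: bucket, key: key}
}

func (w *writer) abortUpload(uploadID string) error {
	// The *WithContext variant propagates cancellation of the stored context
	// into the S3 request.
	_, err := w.client.AbortMultipartUploadWithContext(w.ctx, &s3.AbortMultipartUploadInput{
		Bucket:   aws.String(w.bucket),
		Key:      aws.String(w.key),
		UploadId: aws.String(uploadID),
	})
	return err
}

func main() {
	sess := session.Must(session.NewSession())
	w := newWriter(context.Background(), s3.New(sess), "example-bucket", "example-key")
	_ = w.abortUpload("example-upload-id")
}
```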
Would it be feasible to split the context plumbing improvements out to a follow-up PR? I have a couple things to discuss which would otherwise bog down the review of the push optimization work in this PR. (That reminds me, I really need to finish writing that blog post about contexts...)
I don't mind, happy to rip it out; as for the AWS SDK calls that accept this context as a param, do you prefer dropping the *WithContext() calls or temporarily passing context.TODO() to them with some comment?
I'd prefer reverting the calls so they don't show up as changed lines in the PR diff. It's easier to review that way.
Done in 0463ade
Curious about how you wanna go about this. I'm worried you'll suggest some wild context wrappers 🙃
I'm worried you'll suggest some wild context wrappers 🙃
registry/storage/driver/s3-aws/s3.go (Outdated)
// NOTE(milosgajdos): writer maintains its own context
// passed to its constructor from storage driver.
func (w *writer) Cancel(_ context.Context) error {
This is exactly the kind of thing I'm talking about regarding context lifetimes.
Yes, I commented above. Wish we could somehow link related comments other than like this: #4066 (comment)
	parts:   parts,
	size:    size,
	ready:   d.NewBuffer(),
	pending: d.NewBuffer(),
Have you considered lazily allocating the pending buffer on demand? That'd cut down on memory consumption when writing files small enough to fit into a single chunk.
☝️ I see this was a "non-blocking" comment, but curious if you gave this some consideration.
I said in one of my comments that it's something I need to have a think about but would move it to the follow-up. It's a sound suggestion worth considering, indeed.
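For illustration, a minimal sketch of the lazy-allocation idea under discussion; the buffer, driver, and pendingBuf names are simplified stand-ins for the PR's types, not its actual code:

```go
package main

const defaultChunkSize = 10 * 1024 * 1024 // illustrative chunk size

// buffer and driver are minimal stand-ins; only the shape needed to show
// lazy allocation is included here.
type buffer struct{ data []byte }

type driver struct{ chunkSize int }

func (d *driver) NewBuffer() *buffer { return &buffer{data: make([]byte, 0, d.chunkSize)} }

type writer struct {
	driver  *driver
	ready   *buffer
	pending *buffer // left nil until a write actually spills past ready
}

// pendingBuf is a hypothetical helper: it allocates the pending buffer on
// first use, so files that fit into a single chunk never pay for a second
// chunk-sized allocation.
func (w *writer) pendingBuf() *buffer {
	if w.pending == nil {
		w.pending = w.driver.NewBuffer()
	}
	return w.pending
}

func main() {
	d := &driver{chunkSize: defaultChunkSize}
	w := &writer{driver: d, ready: d.NewBuffer()}
	_ = w.pendingBuf() // the second buffer is only allocated here
}
```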
Force-pushed from 2f7ce80 to 1fa0272
I've updated the …
This is a billion-dollar question that we may want to address in the future 😄 I have done a bit of … There is one more comment I need to think about a bit: lazy allocation of the pending buffer.
Force-pushed from acc7e9f to 29c98b3
corhere left a comment
Another mystery: why is the storage driver not simply implemented in terms of https://pkg.go.dev/github.com/aws/aws-sdk-go@v1.45.16/service/s3/s3manager#Uploader?
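For reference, a minimal sketch of what using s3manager.Uploader looks like; the bucket, key, and file names below are hypothetical, and the part size and concurrency values are just examples:

```go
package main

import (
	"log"
	"os"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
	sess := session.Must(session.NewSession())

	// The uploader buffers the reader into PartSize chunks and runs the
	// multipart upload (including part retries) internally.
	uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
		u.PartSize = 10 * 1024 * 1024 // e.g. match the driver's chunk size
		u.Concurrency = 2
	})

	f, err := os.Open("layer.tar.gz") // hypothetical payload
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	_, err = uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String("example-bucket"),
		Key:    aws.String("docker/registry/v2/blobs/example"),
		Body:   f,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```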
corhere left a comment
LGTM! I'll approve once you've squashed the PR. Don't worry about retaining my authorship credit on the suggestions.
I left a couple of optional suggestions for further improving the code. Emphasis on optional.
This commit cleans up and attempts to optimise the performance of image push in the S3 driver. There are two main changes:
* we refactor the S3 driver Writer: instead of using separate byte slices for the ready and pending parts, which data was constantly appended to, causing unnecessary allocations, we use optimised byte buffers and make sure they are written to efficiently
* we introduce a memory pool that is used for allocating the byte buffers introduced above
These changes should alleviate high memory pressure on the push path to S3.
Co-authored-by: Cory Snider <corhere@gmail.com>
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
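As a rough illustration of the memory-pool idea described in the commit message, here is a minimal sync.Pool-backed sketch; the names and the chunk size are illustrative assumptions, not the PR's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

const chunkSize = 10 * 1024 * 1024 // illustrative; stands in for the driver's chunk size

// buffer wraps a reusable byte slice; pooling the wrapper (a pointer) avoids
// re-allocating chunk-sized backing arrays for every part that is flushed.
type buffer struct{ data []byte }

var bufPool = sync.Pool{
	New: func() any { return &buffer{data: make([]byte, 0, chunkSize)} },
}

func getBuffer() *buffer {
	b := bufPool.Get().(*buffer)
	b.data = b.data[:0] // reset length, keep capacity
	return b
}

func putBuffer(b *buffer) { bufPool.Put(b) }

func main() {
	b := getBuffer()
	b.data = append(b.data, "part data"...)
	fmt.Println(len(b.data), cap(b.data))
	putBuffer(b)
}
```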
Force-pushed from 1f93200 to b888b14
Alright, squashed, PTAL @corhere @thaJeztah @Jamstah
corhere left a comment
LGTM! I see a couple of minor formatting nits, but I am going to pretend I didn't notice them.
thaJeztah left a comment
Gave this a first glance, and looks great; I left some questions / comments (none really blocking, afaics)
	err = nil
}
return offset, err
If err is not nil (and not an io.EOF), should offset still be returned, or should it return 0, err?
If so, then perhaps something like this would be slightly more idiomatic:
if err != nil && err != io.EOF {
	return 0, err
}
return offset, nil
for len(b.data) < cap(b.data) && err == nil {
	var n int
	n, err = r.Read(b.data[len(b.data):cap(b.data)])
	offset += int64(n)
	b.data = b.data[:len(b.data)+n]
Just a question: is it worth benchmarking whether memoizing len(b.data) and cap(b.data) here would be beneficial (i.e., using an intermediate variable to store them, as we're calling them multiple times)? Maybe not; just curious.
Yeah, I'm not sure it's worth it. Maybe? 🤷♂️ len and cap just return an unexported field from the underlying struct so I'd expect negligible gain here 🤔
No; len(b.data) and cap(b.data) are just accessing fields on the slice header which costs the same as any other struct field read. Compiler explorer reveals that the compiler is able to elide bounds-checks for the expression b.data[len(b.data):cap(b.data)] which it cannot do when those values are copied to local variables. Copying the slice itself to a local variable sounds promising as a way to avoid dereferencing pointers inside the loop, but as compiler explorer shows, the generated AMD64 code immediately spills the slice header fields to the stack.
tl;dr introducing intermediate variables will actually deoptimize the function. Compilers are pretty smart these days.
Yes, sorry, by "unexported fields" I meant "fields", and yeah, "header" is the right terminology. Thanks!
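For anyone who wants to reproduce that comparison in compiler explorer, here is a minimal pair along those lines; the buffer type and the function names are made up for illustration, and the comments restate the observations from the discussion above:

```go
package bufdemo

import "io"

type buffer struct{ data []byte }

// readDirect slices with len/cap inline; per the discussion above, the
// compiler can prove len(b.data) <= cap(b.data) and elide the bounds check
// on the slice expression.
func readDirect(b *buffer, r io.Reader) (int, error) {
	return r.Read(b.data[len(b.data):cap(b.data)])
}

// readMemoized copies len/cap into locals first; per the discussion above,
// the compiler no longer sees that invariant and emits a bounds check for
// b.data[l:c].
func readMemoized(b *buffer, r io.Reader) (int, error) {
	l, c := len(b.data), cap(b.data)
	return r.Read(b.data[l:c])
}
```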
w.pendingPart = nil

buf := bytes.NewBuffer(w.ready.data)
if w.driver.MultipartCombineSmallPart && (w.pending.Len() > 0 && w.pending.Len() < int(w.driver.ChunkSize)) {
nit: looks like the extra braces are not needed here
if w.driver.MultipartCombineSmallPart && w.pending.Len() > 0 && w.pending.Len() < int(w.driver.ChunkSize) {
They are not needed, no, but I felt like it'd be nice if these were semantically grouped -- it would have been more obvious if I'd split them onto two separate lines. I'm happy to change it back, I just feel there are too many &&s in a single condition, causing cognitive overload 🤷♂️
p = nil
// try filling up the pending parts buffer
offset, err = w.pending.ReadFrom(reader)
n += int(offset)
As we're calling this in a for loop, I'm wondering if it would be beneficial to change n to an int64 and only cast it to an int when returning.
I considered this, indeed. I stuck with this at the time. There are a few returns here so I struggle to see the win 🤔 I'm willing to be persuaded though
Yeah, I was curious if benchmarking would show a difference; I just did a quick check, and there's no consistent difference, so not worth changing.
offset, err = w.pending.ReadFrom(reader)
n += int(offset)
if err != nil {
	return n, err
I'd have to dig deeper into the whole flow, but is it intentional to return both n and an err here? (The code higher up at line 1418 does a return 0, err on errors.)
I think this is an interesting shout. writer is supposed to implement https://pkg.go.dev/io#Writer
It returns the number of bytes written from p (0 <= n <= len(p)) and any error encountered that caused the write to stop early. Write must return a non-nil error if it returns n < len(p)
Now, the code on line 1418 does NOT write any bytes from p, which is the byte slice passed into Write. I think it's fine to return 0, err there and n, err here. WDYT @corhere
Returning the number of bytes written from p is a requirement of the io.Writer interface contract. Zero bytes have been written from p if control flow reaches line 1418.
Thanks! That's the part I wanted to dig deeper into, because I suspected that information had to be preserved (even in the error case). I guess the interface is just slightly confusing to require both error and value, but it makes sense in context.
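To spell out the contract being referenced, here is a tiny, self-contained writer (not the PR's code; the capWriter type is invented for illustration) showing why a partial write must return both the byte count and a non-nil error:

```go
package main

import (
	"errors"
	"fmt"
)

// capWriter accepts at most limit bytes in total. When a call can only
// partially succeed, it returns how many bytes of p it actually consumed
// together with a non-nil error, as io.Writer requires.
type capWriter struct {
	written int
	limit   int
}

var errFull = errors.New("capWriter: full")

func (w *capWriter) Write(p []byte) (int, error) {
	remaining := w.limit - w.written
	if remaining <= 0 {
		return 0, errFull // nothing from p was written
	}
	if len(p) <= remaining {
		w.written += len(p)
		return len(p), nil
	}
	w.written += remaining
	return remaining, errFull // n < len(p), so the error must be non-nil
}

func main() {
	w := &capWriter{limit: 5}
	n, err := w.Write([]byte("hello world"))
	fmt.Println(n, err) // 5 capWriter: full
}
```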
reader := bytes.NewReader(p)

for reader.Len() > 0 {
I guess reader := bytes.NewReader(p) could be inlined here, but that's really nitpicking.
Yeah, we could probably be more clever in this code, but let's keep it as is unless you insist 😄
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
PTAL @thaJeztah. All the important points have been addressed.
thaJeztah left a comment
LGTM
This PR attempts to optimise the performance of image push in the S3 driver. There are 2 main changes:
* we refactor the S3 driver writer: instead of using simple byte slices for the ready and pending parts, which data was constantly appended to, causing unnecessary allocations, we use optimised byte buffers and make sure these are used efficiently
* we introduce a memory pool that is used for allocating the byte buffers introduced above

These changes should alleviate high memory pressure on the push path to S3, especially when large images (layers) are pushed to the S3 store.
Related PR: