
Conversation

@milosgajdos (Member) commented Sep 22, 2023

This PR attempts to optimise the performance of image push in the S3 driver. There are two main changes:

  • we refactor the S3 driver writer: instead of using plain byte slices for the ready and pending parts, which constantly had data appended to them and caused unnecessary allocations, we use optimised byte buffers and make sure they are used efficiently
  • we introduce a memory pool that is used for allocating the byte buffers introduced above

These changes should alleviate high memory pressure on the push path to S3, especially when large images (layers) are pushed to the S3 store.
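
Below is a minimal sketch of the kind of pooled, chunk-sized buffer described above; the names (buffer, pool, getBuffer, putBuffer) and the chunk size are illustrative and do not match the driver's actual identifiers or defaults:

package main

import (
    "fmt"
    "sync"
)

// Illustrative sketch only; chunkSize is not the driver's actual default.
const chunkSize = 10 << 20

// buffer wraps a fixed-capacity byte slice used to stage one multipart part.
type buffer struct {
    data []byte
}

// pool recycles chunk-sized buffers across writers so each Write does not
// grow (and therefore re-allocate) a fresh slice while a part is assembled.
var pool = sync.Pool{
    New: func() any { return &buffer{data: make([]byte, 0, chunkSize)} },
}

func getBuffer() *buffer {
    b := pool.Get().(*buffer)
    b.data = b.data[:0] // reuse the backing array, drop any stale contents
    return b
}

func putBuffer(b *buffer) { pool.Put(b) }

func main() {
    b := getBuffer()
    b.data = append(b.data, []byte("part payload")...)
    fmt.Println(len(b.data), cap(b.data))
    putBuffer(b)
}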

Related PR:

@milosgajdos (Member Author) commented Sep 22, 2023

@corhere would you mind having a look at this WIP? You were on the review for #3798; this PR builds on it and, more importantly, hopefully improves it.

@corhere (Collaborator) left a comment

I feel like I'm missing something. What's the point of double-buffering if the flush operation is blocking? Is a small final chunk really that big of a deal that it's worth doubling memory usage to avoid it?

Why can't the write loop be as simple as this? Copy to buffer until full, flush buffer, repeat until there is nothing left to copy.

var n int
defer func() { w.size += int64(n) }()
reader := bytes.NewReader(p)
for reader.Len() > 0 {
    m, _ := w.buf.ReadFrom(reader) // ReadFrom reports an int64 count
    n += int(m)
    if w.buf.Available() == 0 {
        if err := w.flushPart(); err != nil {
            return n, err
        }
    }
}
return n, nil

closed bool
committed bool
cancelled bool
ctx context.Context
Collaborator

Is passing a context into the S3 API methods necessary for optimizing push? Storing a context in a struct is discouraged as it obfuscates the lifetime of the operation that the context is being applied to. I am not totally against doing so (io.WriteCloser falls within the exception criteria) but I would much prefer to have that discussion in its own PR.
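
For illustration only (the types, fields, and method names below are made up, not the driver's actual code), the pattern being discussed looks roughly like this:

package main

import "context"

// The pattern under discussion stores the request context on the writer at
// construction time, which hides the lifetime of every S3 call that later
// reuses it.
type writer struct {
    ctx context.Context // discouraged in general: lifetime is not visible at call sites
}

func (w *writer) flushPart() error {
    // the real driver would pass w.ctx to the S3 SDK here
    _ = w.ctx
    return nil
}

// The conventional alternative threads a context through each call instead,
// e.g. flushPart(ctx context.Context), so cancellation is scoped per operation.

func main() {
    w := &writer{ctx: context.Background()}
    _ = w.flushPart()
}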

Member Author

Yes, I am aware of this, and yes, it's sucky; the reason the context is a writer field is that the writer makes a bunch of S3 API calls that require a context (unless we opt not to use the *WithContext() SDK methods, at the expense of not being able to propagate cancellations from the driver calls).

Collaborator

Would it be feasible to split the context plumbing improvements out to a follow-up PR? I have a couple things to discuss which would otherwise bog down the review of the push optimization work in this PR. (That reminds me, I really need to finish writing that blog post about contexts...)

Member Author

I don't mind, happy to rip it out; as for the AWS SDK calls that accept this context as a param, do you prefer dropping the *WithContext() calls or temporarily passing context.TODO() to them with a comment?

Collaborator

I'd prefer reverting the calls so they don't show up as changed lines in the PR diff. It's easier to review that way.

Member Author

Done in 0463ade

Curious about how you wanna go about this. I'm worried you'll suggest some wild context wrappers 🙃

Collaborator

I'm worried you'll suggest some wild context wrappers 🙃

Don't worry, I won't.

Comment on lines 1498 to 1500
// NOTE(milosgajdos): writer maintains its own context
// passed to its constructor from storage driver.
func (w *writer) Cancel(_ context.Context) error {
Collaborator

This is exactly the kind of thing I'm talking about regarding context lifetimes.

@milosgajdos (Member Author) Sep 24, 2023

Yes, I commented above. Wish we could somehow link related comments other than like this: #4066 (comment)

parts: parts,
size: size,
ready: d.NewBuffer(),
pending: d.NewBuffer(),
Collaborator

Have you considered lazily allocating the pending buffer on demand? That'd cut down on memory consumption when writing files small enough to fit into a single chunk.
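
A rough sketch of what lazy allocation could look like; the field and constructor names below are illustrative and do not match the driver:

package main

// Illustrative sketch only.
const chunkSize = 10 << 20

type buffer struct{ data []byte }

func newBuffer() *buffer { return &buffer{data: make([]byte, 0, chunkSize)} }

type writer struct {
    ready   *buffer
    pending *buffer // left nil until a write actually spills past `ready`
}

// pendingBuf allocates the pending buffer on first use, so uploads small
// enough to fit into a single chunk never pay for a second chunk-sized buffer.
func (w *writer) pendingBuf() *buffer {
    if w.pending == nil {
        w.pending = newBuffer()
    }
    return w.pending
}

func main() {
    w := &writer{ready: newBuffer()}
    _ = w.pendingBuf() // allocated lazily, only when first needed
}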

Member

☝️ I see this was a "non-blocking" comment, but curious if you gave this some consideration.

Member Author

I said in one of my comments that it's something I need to have a think about but would move it to the follow-up. It's a sound suggestion worth considering, indeed.

@milosgajdos (Member Author) commented Sep 24, 2023

I've updated the writer implementation based on all the awesome feedback, @corhere. Ngl, I felt mildly embarrassed to have misread some of your comments 🤦‍♂️ Thanks for the commit suggestions that nudged me to re-read them 😄

I feel like I'm missing something. What's the point of double-buffering if the flush operation is blocking? Is a small final chunk really that big of a deal that it's worth doubling memory usage to avoid it?

This is a billion-dollar question that we may want to address in the future 😄 I have done a bit of git archaeology attempting to find some reasoning in the PR comments/git commits but couldn't find anything.

There is one more comment I need to think about a bit: lazy allocation of the pending buffer; maybe we can address that one in the follow-up, too.

@milosgajdos milosgajdos requested a review from corhere September 24, 2023 22:00
@milosgajdos milosgajdos marked this pull request as ready for review September 25, 2023 21:15
@corhere (Collaborator) left a comment

Another mystery: why is the storage driver not simply implemented in terms of https://pkg.go.dev/github.com/aws/aws-sdk-go@v1.45.16/service/s3/s3manager#Uploader?
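
For context, the upload-manager approach being referred to looks roughly like this; the bucket, key, and file names are placeholders, and this is a sketch of the SDK's API rather than a proposed change to the driver:

package main

import (
    "os"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
    sess := session.Must(session.NewSession())

    // s3manager.Uploader handles buffering, part sizing, and concurrent
    // multipart uploads internally.
    uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
        u.PartSize = 10 << 20 // illustrative part size
        u.Concurrency = 4
    })

    f, err := os.Open("layer.tar.gz") // placeholder payload
    if err != nil {
        panic(err)
    }
    defer f.Close()

    _, err = uploader.Upload(&s3manager.UploadInput{
        Bucket: aws.String("example-bucket"), // placeholder
        Key:    aws.String("example/key"),    // placeholder
        Body:   f,
    })
    if err != nil {
        panic(err)
    }
}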


@milosgajdos milosgajdos requested a review from corhere September 27, 2023 16:40
@corhere (Collaborator) left a comment

LGTM! I'll approve once you've squashed the PR. Don't worry about retaining my authorship credit on the suggestions.

I left a couple of optional suggestions for further improving the code. Emphasis on optional.


This commit cleans up and attempts to optimise the performance of image push in the S3 driver.
There are 2 main changes:
* we refactor the S3 driver Writer: instead of using separate byte
  slices for ready and pending parts, which constantly had data appended
  to them and caused unnecessary allocations, we use optimised byte
  buffers; we make sure these are used efficiently when written to.
* we introduce a memory pool that is used for allocating the byte
  buffers introduced above

These changes should alleviate high memory pressure on the push path to S3.

Co-authored-by: Cory Snider <corhere@gmail.com>
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
@milosgajdos (Member Author)

Alright, squashed, PTAL @corhere @thaJeztah @Jamstah

@milosgajdos milosgajdos requested a review from corhere September 27, 2023 20:38
@corhere (Collaborator) left a comment

LGTM! I see a couple of minor formatting nits, but I am going to pretend I didn't notice them.

@thaJeztah (Member) left a comment

Gave this a first glance, and looks great; I left some questions / comments (none really blocking, afaics)

Comment on lines +1293 to +1295
err = nil
}
return offset, err
Member

If err is not nil (and not an io.EOF), should offset still be returned, or should it return 0, err?

If so, then perhaps something like this would be slightly more idiomatic:

if err != nil && err != io.EOF {
    return 0, err
}
return offset, nil


Comment on lines +1284 to +1288
for len(b.data) < cap(b.data) && err == nil {
var n int
n, err = r.Read(b.data[len(b.data):cap(b.data)])
offset += int64(n)
b.data = b.data[:len(b.data)+n]
Member

Just a question: is it worth benchmarking whether memoizing len(b.data) and cap(b.data) here would be beneficial (i.e., using intermediate variables to store those, as we're calling them multiple times)? Maybe not; just curious.

Member Author

Yeah, I'm not sure it's worth it. Maybe? 🤷‍♂️ len and cap just return an unexported field from the underlying struct so I'd expect negligible gain here 🤔

Collaborator

No; len(b.data) and cap(b.data) are just accessing fields on the slice header which costs the same as any other struct field read. Compiler explorer reveals that the compiler is able to elide bounds-checks for the expression b.data[len(b.data):cap(b.data)] which it cannot do when those values are copied to local variables. Copying the slice itself to a local variable sounds promising as a way to avoid dereferencing pointers inside the loop, but as compiler explorer shows, the generated AMD64 code immediately spills the slice header fields to the stack.

tl;dr introducing intermediate variables will actually deoptimize the function. Compilers are pretty smart these days.
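
For illustration, a self-contained version of the two forms being compared; the "memoized" variant is hypothetical and is not code from this PR:

package main

import (
    "bytes"
    "io"
)

type buffer struct{ data []byte }

// fill uses the slice expression from the PR: because len and cap are read
// inline, the compiler can prove the bounds are valid and elide the check.
func (b *buffer) fill(r io.Reader) {
    n, _ := r.Read(b.data[len(b.data):cap(b.data)])
    b.data = b.data[:len(b.data)+n]
}

// fillMemoized is the hypothetical variant with intermediate variables; per
// the review comment above, once l and c are plain locals the compiler no
// longer elided the bounds check in compiler explorer.
func (b *buffer) fillMemoized(r io.Reader) {
    l, c := len(b.data), cap(b.data)
    n, _ := r.Read(b.data[l:c])
    b.data = b.data[:l+n]
}

func main() {
    b := &buffer{data: make([]byte, 0, 8)}
    b.fill(bytes.NewReader([]byte("abcd")))
    b.fillMemoized(bytes.NewReader([]byte("efgh")))
}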

Member Author

Yes, sorry, by "unexported fields" I meant "fields"; and yeah, "header" is the right terminology. Thanks.


w.pendingPart = nil

buf := bytes.NewBuffer(w.ready.data)
if w.driver.MultipartCombineSmallPart && (w.pending.Len() > 0 && w.pending.Len() < int(w.driver.ChunkSize)) {
Member

nit: looks like the extra parentheses are not needed here

if w.driver.MultipartCombineSmallPart && w.pending.Len() > 0 && w.pending.Len() < int(w.driver.ChunkSize) {

Member Author

They are not needed, no, but I felt like it'd be nice if these were semantically grouped; it would have been more obvious if I split them onto two separate lines. I'm happy to change it back, I just feel there are too many &&s in a single condition, causing cognitive overload 🤷‍♂️

p = nil
// try filling up the pending parts buffer
offset, err = w.pending.ReadFrom(reader)
n += int(offset)
Member

As we're calling this in a for loop, I'm wondering if it would be beneficial to change n to an int64 and only cast to an int when returning.

@milosgajdos (Member Author) Sep 28, 2023

I considered this, indeed. I stuck with this at the time. There are a few returns here so I struggle to see the win 🤔 I'm willing to be persuaded though

Member

Yeah, I was curious if benchmarking would show a difference; I just did a quick check, and there's no consistent difference, so not worth changing.

offset, err = w.pending.ReadFrom(reader)
n += int(offset)
if err != nil {
return n, err
Member

I'd have to dig deeper into the whole flow, but is it intentional to return both n and an err here? (The code higher up at line 1418 does a return 0, err on errors.)

@milosgajdos (Member Author) Sep 28, 2023

I think this is an interesting shout. writer is supposed to implement https://pkg.go.dev/io#Writer

It returns the number of bytes written from p (0 <= n <= len(p)) and any error encountered that caused the write to stop early. Write must return a non-nil error if it returns n < len(p)

Now, the code on line 1418 does NOT write anything from p, which is the byte slice passed into Write. I think it's fine to return 0, err there and n, err here. WDYT @corhere

Collaborator

Returning the number of bytes written from p is a requirement of the io.Writer interface contract. Zero bytes have been written from p if control flow reaches line 1418.
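
A minimal sketch of that contract with illustrative types (not the driver's actual Write): before any byte of p has been staged the method returns 0 with the error, and after a partial write it returns the count actually consumed:

package main

import (
    "errors"
    "io"
)

type writer struct {
    closed bool
    buf    []byte
}

// Write illustrates the io.Writer contract discussed above: n must always be
// the number of bytes consumed from p, so an early failure before any byte of
// p is staged returns (0, err), while a partial write returns (n, err).
func (w *writer) Write(p []byte) (int, error) {
    if w.closed {
        return 0, errors.New("already closed") // nothing from p written yet
    }
    n := copy(w.buf, p) // stand-in for staging bytes into the part buffer
    if n < len(p) {
        return n, io.ErrShortWrite
    }
    return n, nil
}

func main() {
    w := &writer{buf: make([]byte, 4)}
    _, _ = w.Write([]byte("hello, world"))
}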

Member

Thanks! That's the part I wanted to dig deeper into, because I suspected that information had to be preserved (even in the error case). I guess the interface is just slightly confusing to require both error and value, but it makes sense in context.

Comment on lines +1449 to +1451
reader := bytes.NewReader(p)

for reader.Len() > 0 {
Member

I guess reader := bytes.NewReader(p) could be inlined here, but that's really nitpicking.

Member Author

Yeah, we could probably be more clever in this code, but let's keep it as is unless you insist 😄

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
@milosgajdos (Member Author)

PTAL @thaJeztah. All the important points have been addressed.

@thaJeztah (Member) left a comment

LGTM


@milosgajdos milosgajdos merged commit 735c161 into distribution:main Sep 29, 2023