Fix read retries in compressed blobs. #273
Conversation
Force-pushed from 5f84729 to 03972bb.
In the process, I've removed the retrier based on a wrong digest, since this pretty much should have been our flake reason... Wdyt?
	// have to go somewhere or they'll block execution.
	io.Copy(ioutil.Discard, r)
}
w.Close()
Where is this Close happening now?
Btw, I forget: did we add the Windows tests to presubmit? I'll patch this commit in and test it on Windows just in case. Also, please run a test with features=race, just in case.
Under if d.Size == sz { in the new file... if err := wt.Close().
And I thought the CI was doing features=race already, sighs. Caught a bug, thanks for reminding me!
And I don't think we have Windows CI.
		t.Errorf("client.ReadBlob(ctx, digest) gave diff (-want, +got):\n%s", diff)
	}
})
for _, comp := range []bool{false, true} {
How much longer did it make the test? If it added a few seconds, then suggestion: make compression retry test a separate Test function from this one that includes the sleeping, because these two are really orthogonal. Ditto for files.
The non-race tests still take < 10s (less than 1s of noticeable increase).
I'm not sure they're orthogonal - I'd argue they should be the same. As much as possible, comp vs noncomp should behave the same way.
Sure, if it doesn't make it noticeably longer, this is fine!
if limit > 0 && limit < sz {
	sz = limit
}
if err = w.Reset(); err != nil {
Wait, if you moved this Reset outside of the closure, that tells me you no longer need it to be resettable. And you're removing the functionality of retrying on failed digest verification, too.
So, suggestion: base this PR on a revert of my #271. Should be much simpler, imo.
I mention that I'm getting rid of the retries in a comment. I think they were useful only because of this. I was wondering wdyt?
Maybe it'd be much simpler to rebase this on a revert, but I've already gone down this path :p.
We'll have to clean this up in the fixit next week, then. There's a lot of stuff here now that should no longer be there and will just be confusing. We no longer need the whole ResettableWriter struct and interface, or the refactoring I did that made everything more complicated...
But I'm okay with leaving it as a cleanup task for the next CL.
The current configuration will try to use the offset of the _compressed_ blob as the new offset for retrying reads. However, the compressed RE API wants the offset to refer to the _uncompressed_ blob. Since the compressed offset will usually be a lower number than the uncompressed offset, we generally end up with a bigger blob than expected. Since these Read interruptions are mostly transient (e.g. context deadlines exceeded), our higher-level retries have been mostly hiding the issue. Also took the chance to beef up some comments around the tricky threaded compression code. I'll _eventually_ add retry tests for compressed blob writes too :)