Fix testReadBlobWithReadTimeouts retries count #139999
Conversation
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
final int meaningfulProgressSize = (int) (bufferSize.getBytes() / 100L);
final byte[] bytesPerRetry = randomByteArrayOfLength(meaningfulProgressSize / maxRetries);
I don't understand the computation here. We need each attempt to deliver strictly fewer than bufferSize.getBytes() / 100L bytes to avoid it being considered "meaningful" but why also divide by maxRetries? Does the GCS SDK retry attempt to resume the download internally up to maxRetries times? If so, that's different from the S3 SDK behaviour, and I'd prefer we made that explicit.
Also if maxRetries == 1 then sometimes this'll send exactly bufferSize.getBytes() / 100L which is large enough to be meaningful. It needs to be strictly less than this size.
Also, rather than hard-coding the meaningful size to be bufferSize.getBytes() / 100L let's delegate the calculation to the repository somehow. Note that GCS repositories don't allow any control over the buffer size, it's always GoogleCloudStorageBlobStore.SDK_DEFAULT_CHUNK_SIZE i.e. 16MiB regardless of our choice of bufferSize above.
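For reference, the "meaningful progress" threshold under discussion can be sketched as follows. This is a minimal sketch, not the actual Elasticsearch source: the method name `isMeaningful` is hypothetical, and the 16 MiB figure is the GCS SDK default chunk size mentioned above. It illustrates the reviewer's point that exactly 1% of the buffer size already counts as meaningful, so the test must send strictly fewer bytes.

```java
public class MeaningfulProgress {
    // Hypothetical sketch of the rule discussed above: an attempt makes
    // "meaningful" progress when it delivers at least 1% of the buffer size.
    static boolean isMeaningful(long bytesRead, long bufferSizeBytes) {
        return bytesRead >= bufferSizeBytes / 100L;
    }

    public static void main(String[] args) {
        long bufferSize = 16L * 1024 * 1024; // GCS SDK default chunk size, 16 MiB
        long threshold = bufferSize / 100L;
        System.out.println(isMeaningful(threshold, bufferSize));     // exactly 1%: meaningful
        System.out.println(isMeaningful(threshold - 1, bufferSize)); // strictly below 1%: not meaningful
    }
}
```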
I don't understand the computation here. We need each attempt to deliver strictly fewer than bufferSize.getBytes() / 100L bytes to avoid it being considered "meaningful" but why also divide by maxRetries?
I misunderstood currentStreamProgress, thanks for pointing it out. Now I see that we compare the offset of the RetryingInputStream with the current attempt's offset. I missed the part where we update the current offset during stream reads.
private long currentStreamProgress() {
    if (currentStream == null) {
        return 0L;
    }
    return Math.subtractExact(Math.addExact(start, offset), currentStream.getFirstOffset());
}

Does the GCS SDK retry attempt to resume the download internally up to maxRetries times?
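The progress computation quoted above can be illustrated with hypothetical figures. This is a sketch with made-up numbers, not the actual class: progress is the absolute read position (start + offset) minus the offset at which the current attempt began.

```java
public class StreamProgress {
    // Sketch of the arithmetic in currentStreamProgress() above, with
    // hypothetical figures; the exact-arithmetic helpers throw on overflow.
    static long progress(long start, long offset, long firstOffset) {
        return Math.subtractExact(Math.addExact(start, offset), firstOffset);
    }

    public static void main(String[] args) {
        // Blob read from byte 0; the current attempt resumed at byte 1000
        // and we have read up to absolute position 1500 so far.
        System.out.println(progress(0L, 1500L, 1000L));
    }
}
```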
It does not. GCS should be the same as S3 right now.
Also if maxRetries == 1 then sometimes this'll send exactly bufferSize.getBytes() / 100L which is large enough to be meaningful. It needs to be strictly less than this size.
Not really: exchange -> contentPartSizes.add((long) sendIncompleteContent(exchange, bytes)) does not send the whole content, it's always at least 1 byte short.
Also, rather than hard-coding the meaningful size to be bufferSize.getBytes() / 100L let's delegate the calculation to the repository somehow.
Done. Exposed the meaningful progress size and added retry counting in the test. d3857cd
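The "at least 1 byte short" behaviour mentioned above can be sketched as picking a send length strictly below the full payload. This is a hypothetical stand-in for the fixture's sendIncompleteContent, not the actual test code:

```java
import java.util.Random;

public class IncompleteContent {
    // Hypothetical sketch of the fixture behaviour discussed above:
    // choose how many bytes to send, always strictly fewer than the
    // full payload, so every attempt is at least 1 byte short.
    static int incompleteLength(int fullLength, Random random) {
        return random.nextInt(fullLength); // uniform in [0, fullLength - 1]
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        boolean alwaysShort = true;
        for (int i = 0; i < 10_000; i++) {
            alwaysShort &= incompleteLength(1024, random) < 1024;
        }
        System.out.println(alwaysShort);
    }
}
```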
int meaningfulProgressRetries = Math.toIntExact(
    contentPartSizes.stream().filter(partSize -> partSize >= meaningfulProgressSize.get()).count()
);
assertThat(exception.getSuppressed().length, getMaxRetriesMatcher(maxRetries + meaningfulProgressRetries));
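The counting logic in the diff above can be exercised in isolation. This is a self-contained sketch with hypothetical part sizes and threshold, not the test itself: attempts that delivered at least the meaningful-progress size each earn an extra retry on top of maxRetries.

```java
import java.util.List;

public class RetryCount {
    public static void main(String[] args) {
        // Hypothetical bytes delivered per failed attempt, recorded by the fixture
        List<Long> contentPartSizes = List.of(200_000L, 10L, 170_000L, 5L);
        long meaningfulProgressSize = 167_772L; // ~1% of a 16 MiB buffer
        int maxRetries = 3;

        // Same counting as the diff above: attempts at or above the threshold
        // made meaningful progress and do not consume the retry budget.
        int meaningfulProgressRetries = Math.toIntExact(
            contentPartSizes.stream().filter(partSize -> partSize >= meaningfulProgressSize).count()
        );
        System.out.println(maxRetries + meaningfulProgressRetries);
    }
}
```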
I think S3BlobContainerRetriesTest works around this problem by overriding getMaxRetriesMatcher.
See
What you've done here seems like an improvement because it is more specific, perhaps we can remove that override and in-line getMaxRetriesMatcher as part of (or subsequent to) this change?
Cool, I missed that override. Will remove this from S3 tests and make it similar to GCP.
Done 8304536. getMaxRetriesMatcher is still in use in another place, otherwise I could have removed it.
@nicktindall, @DaveCTurner, ready for another review
httpServer.createContext(
    downloadStorageEndpoint(blobContainer, "read_blob_incomplete"),
    exchange -> retryContentSizes.add((long) sendIncompleteContent(exchange, bytes))
I wonder whether this might still flake: for example, if the server thinks it sent a response larger than the meaningful progress size, then due to buffering etc. in the client infrastructure (they are all different, I think) we might end up seeing less than the meaningful progress size on the client side.
Assuming it doesn't, this LGTM
I ran several successful tests with the response size exactly equal to the meaningful-progress size. I would expect the client not to sit on buffered data without handing it to the application, though it may prefetch more if available. In that case, whatever the fixture flushes to the socket should be immediately visible to the client.
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation.
Fix #139995

RetryingInputStream allows more retries than the maxRetry count configured for the blob store: if the stream is able to read a meaningful amount of bytes (1% of the read buffer size), it gets a free retry that does not count towards maxRetry. This fixes the test by counting the extra retries from RetryingInputStream.
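The free-retry accounting described above can be sketched as a small simulation. This is a hedged illustration with hypothetical per-attempt figures, not the RetryingInputStream implementation: attempts that made meaningful progress leave the retry budget untouched, so the total number of observed attempts is maxRetries plus the number of meaningful-progress attempts.

```java
public class FreeRetrySketch {
    public static void main(String[] args) {
        int maxRetries = 2;
        long meaningfulProgressSize = 16L * 1024 * 1024 / 100; // 1% of 16 MiB
        // Hypothetical bytes delivered by each failed attempt
        long[] bytesPerAttempt = { meaningfulProgressSize, 3, meaningfulProgressSize, 7, 1 };

        int chargedRetries = 0; // retries that count against maxRetries
        int attempts = 0;       // retries the test actually observes
        for (long bytes : bytesPerAttempt) {
            if (chargedRetries >= maxRetries) break; // retry budget exhausted
            attempts++;
            if (bytes < meaningfulProgressSize) {
                chargedRetries++; // no meaningful progress: consume the budget
            }
            // meaningful progress: free retry, budget untouched
        }
        System.out.println(attempts); // maxRetries + meaningful-progress retries
    }
}
```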