
Fix testReadBlobWithReadTimeouts retries count #139999

Merged
mhl-b merged 12 commits into elastic:main from mhl-b:testReadBlobWithReadTimeouts-fix
Jan 15, 2026


Conversation

@mhl-b
Contributor

@mhl-b mhl-b commented Dec 25, 2025

Fix #139995

RetryingInputStream allows more retries than the maxRetry count configured for the blob store. If the stream manages to read a meaningful amount of bytes (1% of the read buffer size), it is given a free retry that does not count towards maxRetry.

Fix the test by counting these extra retries from RetryingInputStream.
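A minimal sketch of the retry accounting described above, under the assumption stated in the PR description (an attempt that read at least 1% of the buffer gets a "free" retry). Names and the 16 MiB buffer size are illustrative, not the actual RetryingInputStream API:

```java
// Sketch: a failed attempt that made "meaningful progress" (>= 1% of the read
// buffer) does not consume one of the maxRetries attempts. Illustrative only.
public class RetryAccountingSketch {
    static final long BUFFER_SIZE = 16 * 1024 * 1024;           // 16 MiB read buffer
    static final long MEANINGFUL_PROGRESS = BUFFER_SIZE / 100;  // 1% of the buffer

    /** Decides whether another retry is allowed after a failed attempt. */
    static boolean shouldRetry(long bytesReadThisAttempt, int countedFailures, int maxRetries) {
        if (bytesReadThisAttempt >= MEANINGFUL_PROGRESS) {
            return true; // free retry: meaningful progress, not counted
        }
        return countedFailures < maxRetries;
    }

    public static void main(String[] args) {
        // Stalled attempt with the retry budget exhausted: give up.
        System.out.println(shouldRetry(10, 3, 3));
        // Attempt that read >= 1% of the buffer: retry regardless of the counter.
        System.out.println(shouldRetry(MEANINGFUL_PROGRESS, 3, 3));
    }
}
```

This is why the total number of observed retries can exceed maxRetries, which is what the test fix accounts for.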

@mhl-b mhl-b added >test Issues or PRs that are addressing/adding tests :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed Coordination (obsolete) Meta label for Distributed Coordination team. Obsolete. Please do not use. labels Dec 25, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@mhl-b mhl-b requested a review from DaveCTurner December 29, 2025 21:46
Comment on lines +269 to +270
final int meaningfulProgressSize = (int) (bufferSize.getBytes() / 100L);
final byte[] bytesPerRetry = randomByteArrayOfLength(meaningfulProgressSize / maxRetries);
Member


I don't understand the computation here. We need each attempt to deliver strictly fewer than bufferSize.getBytes() / 100L bytes to avoid it being considered "meaningful" but why also divide by maxRetries? Does the GCS SDK retry attempt to resume the download internally up to maxRetries times? If so, that's different from the S3 SDK behaviour, and I'd prefer we made that explicit.

Also if maxRetries == 1 then sometimes this'll send exactly bufferSize.getBytes() / 100L which is large enough to be meaningful. It needs to be strictly less than this size.

Also, rather than hard-coding the meaningful size to be bufferSize.getBytes() / 100L let's delegate the calculation to the repository somehow. Note that GCS repositories don't allow any control over the buffer size, it's always GoogleCloudStorageBlobStore.SDK_DEFAULT_CHUNK_SIZE i.e. 16MiB regardless of our choice of bufferSize above.

Contributor Author

@mhl-b mhl-b Jan 2, 2026


> I don't understand the computation here. We need each attempt to deliver strictly fewer than bufferSize.getBytes() / 100L bytes to avoid it being considered "meaningful" but why also divide by maxRetries?

I misunderstood currentStreamProgress, thanks for pointing that out. Now I see that we compare the offset of the RetryingInputStream with the current attempt's starting offset. I missed the part where we update the current offset during stream reads.

    private long currentStreamProgress() {
        if (currentStream == null) {
            return 0L;
        }
        return Math.subtractExact(Math.addExact(start, offset), currentStream.getFirstOffset());
    }
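A worked example of that arithmetic, with illustrative values (the field names mirror the snippet above, but this is a standalone sketch, not the real class):

```java
// Sketch of the currentStreamProgress() arithmetic quoted above: progress of
// the current (possibly resumed) attempt is the reader's absolute position
// (start + offset) minus where the current attempt began (firstOffset).
public class StreamProgressSketch {
    static long currentStreamProgress(long start, long offset, long firstOffset) {
        return Math.subtractExact(Math.addExact(start, offset), firstOffset);
    }

    public static void main(String[] args) {
        // The overall read began at byte 100, the reader has consumed 50 bytes
        // since then, and the current resumed attempt started at byte 120:
        long progress = currentStreamProgress(100, 50, 120);
        System.out.println(progress); // bytes delivered by the current attempt
    }
}
```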

> Does the GCS SDK retry attempt to resume the download internally up to maxRetries times?

It does not. GCS should be the same as S3 right now.

> Also if maxRetries == 1 then sometimes this'll send exactly bufferSize.getBytes() / 100L which is large enough to be meaningful. It needs to be strictly less than this size.

Not really: exchange -> contentPartSizes.add((long) sendIncompleteContent(exchange, bytes)) does not send the whole content, it is at least 1 byte short.

> Also, rather than hard-coding the meaningful size to be bufferSize.getBytes() / 100L let's delegate the calculation to the repository somehow.

Done. Exposed the meaningful progress size and added retry counting in the test: d3857cd

@mhl-b mhl-b requested a review from DaveCTurner January 2, 2026 20:46
int meaningfulProgressRetries = Math.toIntExact(
contentPartSizes.stream().filter(partSize -> partSize >= meaningfulProgressSize.get()).count()
);
assertThat(exception.getSuppressed().length, getMaxRetriesMatcher(maxRetries + meaningfulProgressRetries));
Contributor

@nicktindall nicktindall Jan 5, 2026


I think S3BlobContainerRetriesTest works around this problem by overriding getMaxRetriesMatcher

See

protected Matcher<Integer> getMaxRetriesMatcher(int maxRetries) {
// some attempts make meaningful progress and do not count towards the max retry limit
return allOf(greaterThanOrEqualTo(maxRetries), lessThanOrEqualTo(S3RetryingInputStream.MAX_SUPPRESSED_EXCEPTIONS));
}

What you've done here seems like an improvement because it is more specific, perhaps we can remove that override and in-line getMaxRetriesMatcher as part of (or subsequent to) this change?
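The exact-count approach under discussion could look roughly like this sketch (names like expectedSuppressed are illustrative; only contentPartSizes, meaningfulProgressSize, and maxRetries come from the test above):

```java
import java.util.List;

// Sketch of the exact-count idea: instead of accepting any value in the range
// [maxRetries, MAX_SUPPRESSED_EXCEPTIONS], count the attempts that made
// meaningful progress and expect exactly maxRetries plus that number.
public class ExactRetryCountSketch {
    static int expectedSuppressed(List<Long> contentPartSizes, long meaningfulProgressSize, int maxRetries) {
        long freeRetries = contentPartSizes.stream()
            .filter(partSize -> partSize >= meaningfulProgressSize)
            .count();
        return Math.toIntExact(maxRetries + freeRetries);
    }

    public static void main(String[] args) {
        // Two of four attempts delivered at least the meaningful-progress size,
        // so with maxRetries == 3 the test would expect 3 + 2 suppressed exceptions.
        System.out.println(expectedSuppressed(List.of(200L, 5L, 300L, 1L), 100L, 3));
    }
}
```

The tighter assertion makes the test fail loudly if the retry accounting ever changes, rather than silently passing anywhere inside a wide range.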

Contributor Author


Cool, I missed that override. Will remove this from S3 tests and make it similar to GCP.

Contributor Author

@mhl-b mhl-b Jan 9, 2026


Done: 8304536. getMaxRetriesMatcher is still in use in one other place, otherwise I could have removed it.

@mhl-b mhl-b requested a review from nicktindall January 9, 2026 20:11
@mhl-b
Contributor Author

mhl-b commented Jan 9, 2026

@nicktindall, @DaveCTurner, ready for another review

  httpServer.createContext(
      downloadStorageEndpoint(blobContainer, "read_blob_incomplete"),
-     exchange -> sendIncompleteContent(exchange, bytes)
+     exchange -> retryContentSizes.add((long) sendIncompleteContent(exchange, bytes))
Contributor


I wonder whether this might still flake, for example if the server thinks it sent a response > meaningful progress size, but due to buffering etc in the client infrastructure (they are all different I think), we might end up seeing < meaningful progress size in the client.

Assuming it doesn't, this LGTM

Contributor Author


I ran several successful tests with a response size of exactly the meaningful byte count. I would not expect the client to sit on buffered data without handing it to the application, though it can prefetch more if available. In this case, whatever the fixture flushes to the socket should be immediately visible to the client.

Contributor

@nicktindall nicktindall left a comment


LGTM with comment

Member

@DaveCTurner DaveCTurner left a comment


LGTM2

@mhl-b mhl-b merged commit 9e9d277 into elastic:main Jan 15, 2026
35 checks passed
spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Jan 21, 2026
@mhl-b
Contributor Author

mhl-b commented Feb 2, 2026

💚 All backports created successfully

Status Branch Result
9.3

Questions?

Please refer to the Backport tool documentation


Labels

:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs Team:Distributed Coordination (obsolete) Meta label for Distributed Coordination team. Obsolete. Please do not use. >test Issues or PRs that are addressing/adding tests v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] GoogleCloudStorageBlobContainerRetriesTests testReadBlobWithReadTimeouts failing

4 participants