Surface server Retry-After for rate-limit send failures#2016
Merged
Conversation
libsignal-service's RateLimitException exposes retryAfterMilliseconds for HTTP 413 responses, but signal-cli only forwarded retry-after for ProofRequired (428) failures. Clients had no signal for when it was safe to retry plain rate-limited sends, so every failed retry potentially extended the server-side window. SendMessageResult now carries an optional rateLimitRetryAfterSeconds, populated from the upstream Optional<Long>. JsonSendMessageResult exposes it for RATE_LIMIT_FAILURE type. Text output includes the window when known. Aggregate RateLimitErrorException now carries the real nextAttemptTimestamp (was hardcoded to 0). Closes AsamK#1996. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…millis Codex adversarial review flagged two issues in the phase 1 retry-after plumbing: * Aggregate retry-after ignored proof-required failures. Because isRateLimitFailure is true for proof-required cases but rateLimitRetryAfterSeconds was only populated from plain 413s, an all-proof-required batch (or a mixed batch where the proof-required delay was longer) could flow into outputResult() and produce a RateLimitException(0), telling callers to retry immediately. * Millisecond Retry-After values were truncated by integer division, so 1..999ms became 0 and non-second-aligned values lost up to 999ms. A retry suggested from the floored value can land before the server's real deadline and re-trigger the limit. SendMessageResult.from(...) now populates rateLimitRetryAfterSeconds from either the proof-required seconds or the plain rate-limit ms (converted via ceiling division), giving maxRateLimitRetryAfterSeconds a single source of truth. JsonSendMessageResult.from(...) reads the unified field. New millisToCeilingSeconds helper plus boundary test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a non-canonical 8-arg SendMessageResult constructor that delegates to the canonical form with null retry-after. This keeps source compatibility for any downstream code that constructs the record directly (tests, mocks) without changing the canonical shape. Records permit additional constructors alongside the canonical one. Document the retryAfterSeconds meaning change in the CHANGELOG. The field was previously populated only for proof-required failures; it is now populated whenever the server sends a Retry-After header. The canonical proof-required discriminator is still token != null. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
c255551 to
0b1261c
Compare
Owner
|
Thanks! |
15 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
SendMessageResultnow carries arateLimitRetryAfterSecondscomponent populated from libsignal'sRateLimitException.getRetryAfterMilliseconds()(for plain 413s) orProofRequiredException.getRetryAfterSeconds()(for 428 challenges), giving clients a single place to read the server's advertised retry window.JsonSendMessageResult.retryAfterSecondsis now populated for everyRATE_LIMIT_FAILUREwhere the server sentRetry-After, not only for proof-required challenges. The canonical proof-required discriminator remainstoken != null."retry after N seconds"to rate-limit failure lines when known.RateLimitErrorExceptionthrown when all recipients rate-limit now carries the realnextAttemptTimestamp = now + retryAfter * 1000instead of the hardcoded0placeholder.Motivation
Closes #1996.
Before this change, clients had no way to know when it was safe to retry a rate-limited send. Bridges, bots, and other daemons would either retry too eagerly (extending the server-side window) or wait an arbitrary amount. The data was already available upstream in libsignal — signal-cli just wasn't forwarding it.
Design notes
rateLimitRetryAfterSecondscomponent, so downstream code doesn't need to branch on failure subtype. A newSendMessageResults.maxRateLimitRetryAfterSeconds()helper returns the longest wait across a batch.SendMessageResult.millisToCeilingSeconds(long)is a small package-private helper with its own boundary tests.Test plan
./gradlew buildgreenJsonSendMessageResultTestcovers: retry-after surfacing on plain 413, null retry-after when server omits Retry-After, successful sends leaving the field null, aggregate picking the longest window across a mix of populated/null entriesSendMessageResultTestcovers: ms→s ceiling boundaries (0, 1, 500, 999, 1000, 1001, 1500, 60000), and the 8-arg legacy constructor still compiles and produces a null retry-aftertype: SUCCESSwith no retryAfterSeconds, as expected)The JSON contract extension (populating
retryAfterSecondsfor plain 413s where it was previously null) is documented inCHANGELOG.mdunder Unreleased.