Skip to content

feat: handle 429 properly on infer svc#7377

Merged
dancixx merged 5 commits intodevfrom
feat/infer-svc-429
Oct 13, 2025
Merged

feat: handle 429 properly on infer svc#7377
dancixx merged 5 commits intodevfrom
feat/infer-svc-429

Conversation

@dancixx
Copy link
Contributor

@dancixx dancixx commented Oct 9, 2025

Add proper 429 error handling in inference service.

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
  3. Have you checked your code using cargo clippy --all --all-features command?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully ran tests with your changes locally?

coderabbitai[bot]

This comment was marked as resolved.

@dancixx dancixx requested a review from generall October 10, 2025 10:48
coderabbitai[bot]

This comment was marked as resolved.

@timvisee timvisee self-requested a review October 10, 2025 12:02
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed6752f and 67eff5b.

📒 Files selected for processing (1)
  • src/common/inference/service.rs (6 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • src/common/inference/service.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • src/common/inference/service.rs
🧬 Code graph analysis (1)
src/common/inference/service.rs (2)
src/actix/helpers.rs (1)
  • headers (207-239)
lib/storage/src/content_manager/errors.rs (17)
  • from (184-242)
  • from (246-248)
  • from (252-254)
  • from (258-264)
  • from (268-273)
  • from (277-282)
  • from (286-291)
  • from (295-300)
  • from (304-309)
  • from (313-318)
  • from (322-327)
  • from (331-336)
  • from (340-345)
  • from (349-354)
  • from (358-363)
  • from (367-372)
  • rate_limit_exceeded (103-111)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: integration-tests
  • GitHub Check: e2e-tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: test-consistency
  • GitHub Check: lint
🔇 Additional comments (3)
src/common/inference/service.rs (3)

192-216: Good: Retry-After threaded from headers to error handling

Capturing status + retry_after and passing it to handle_inference_response is clean and correct.


304-309: LGTM: 429 mapped to RateLimitExceeded with retry_after

This aligns with StorageError::rate_limit_exceeded and lets the Actix layer emit Retry-After correctly. Nice.


269-273: Signature change verified
All calls to handle_inference_response include the new retry_after argument; no outdated usages remain.

use std::time::Duration;
use std::time::{Duration, SystemTime};

use actix_web::http::header::HttpDate;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Remove actix_web HttpDate import; use httpdate parser instead

HttpDate is unnecessary here and increases coupling to actix_web. Also, the current parsing below won't compile with HttpDate. Use the httpdate crate’s parser directly.

-use actix_web::http::header::HttpDate;
+// Use httpdate::parse_http_date in parse_retry_after
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
use actix_web::http::header::HttpDate;
// Use httpdate::parse_http_date in parse_retry_after
🤖 Prompt for AI Agents
In src/common/inference/service.rs around line 6, remove the unused
actix_web::http::header::HttpDate import and replace it with the httpdate crate
usage; specifically delete that use line, add an import from the httpdate crate
(e.g. the parser function), and update the parsing call sites to call httpdate's
parser (parse_http_date) which returns a SystemTime/Result instead of using
actix_web's HttpDate type. Also ensure Cargo.toml includes the httpdate
dependency if not present.

Comment on lines +247 to +267
fn parse_retry_after(headers: &reqwest::header::HeaderMap) -> Option<Duration> {
headers
.get(reqwest::header::RETRY_AFTER)
.and_then(|value| value.to_str().ok())
.and_then(|value| {
// Check if the value is a valid duration in seconds
if let Ok(seconds) = value.parse::<u64>() {
return Some(Duration::from_secs(seconds));
}

// Check if the value is a valid Date
if let Ok(http_date) = value.parse::<HttpDate>() {
let ts = SystemTime::from(http_date);
if let Ok(diff) = ts.duration_since(SystemTime::now()) {
return Some(diff);
}
}

None
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix Retry-After HTTP-date parsing (compile error + wrong API)

  • HttpDate likely doesn’t implement FromStr; value.parse::() fails.
  • SystemTime::from(HttpDate) is not implemented (the reverse is).
  • Use httpdate::parse_http_date(value) which returns SystemTime and avoids actix_web dependency here.

Apply:

 fn parse_retry_after(headers: &reqwest::header::HeaderMap) -> Option<Duration> {
     headers
         .get(reqwest::header::RETRY_AFTER)
         .and_then(|value| value.to_str().ok())
         .and_then(|value| {
-                // Check if the value is a valid duration in seconds
-                if let Ok(seconds) = value.parse::<u64>() {
-                    return Some(Duration::from_secs(seconds));
-                }
-
-                // Check if the value is a valid Date
-                if let Ok(http_date) = value.parse::<HttpDate>() {
-                    let ts = SystemTime::from(http_date);
-                    if let Ok(diff) = ts.duration_since(SystemTime::now()) {
-                        return Some(diff);
-                    }
-                }
-
-                None
+                // delay-seconds
+                if let Ok(seconds) = value.parse::<u64>() {
+                    return Some(Duration::from_secs(seconds));
+                }
+                // HTTP-date
+                if let Ok(ts) = httpdate::parse_http_date(value) {
+                    if let Ok(diff) = ts.duration_since(SystemTime::now()) {
+                        return Some(diff);
+                    }
+                }
+                None
         })
 }

Optional: if the HTTP-date is in the past, consider clamping to Some(Duration::from_secs(0)) instead of None so downstream can still set Retry-After: 0.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fn parse_retry_after(headers: &reqwest::header::HeaderMap) -> Option<Duration> {
headers
.get(reqwest::header::RETRY_AFTER)
.and_then(|value| value.to_str().ok())
.and_then(|value| {
// Check if the value is a valid duration in seconds
if let Ok(seconds) = value.parse::<u64>() {
return Some(Duration::from_secs(seconds));
}
// Check if the value is a valid Date
if let Ok(http_date) = value.parse::<HttpDate>() {
let ts = SystemTime::from(http_date);
if let Ok(diff) = ts.duration_since(SystemTime::now()) {
return Some(diff);
}
}
None
})
}
fn parse_retry_after(headers: &reqwest::header::HeaderMap) -> Option<Duration> {
headers
.get(reqwest::header::RETRY_AFTER)
.and_then(|value| {
// delay-seconds
if let Ok(seconds) = value.parse::<u64>() {
return Some(Duration::from_secs(seconds));
}
// HTTP-date
if let Ok(ts) = httpdate::parse_http_date(value) {
if let Ok(diff) = ts.duration_since(SystemTime::now()) {
return Some(diff);
}
}
None
})
}
🤖 Prompt for AI Agents
In src/common/inference/service.rs around lines 247 to 267, the current
Retry-After parsing uses value.parse::<HttpDate>() and
SystemTime::from(HttpDate) which do not compile; replace that branch to call
httpdate::parse_http_date(value) to get a SystemTime, compute duration =
parsed_ts.duration_since(SystemTime::now()).map(|d| d).unwrap_or_else(|_|
Duration::from_secs(0)) (i.e. clamp past dates to zero) and return
Some(duration); keep the existing numeric-seconds branch and ensure you add the
httpdate crate import/feature if not present.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/common/inference/service.rs (1)

6-6: Fix Retry-After HTTP-date parsing: drop actix_web::HttpDate; use httpdate and clamp past dates to 0 (compile fix).

The current code won’t compile: HttpDate likely doesn’t implement FromStr and SystemTime::from(HttpDate) isn’t available. Also, the .ok().map(...) path drops past dates instead of clamping to zero.

Apply:

- use actix_web::http::header::HttpDate;
+ // Parse HTTP-date without coupling to actix-web
+ use httpdate::parse_http_date;
 fn parse_retry_after(headers: &reqwest::header::HeaderMap) -> Option<Duration> {
     headers
         .get(reqwest::header::RETRY_AFTER)
         .and_then(|value| value.to_str().ok())
         .and_then(|value| {
             // Check if the value is a valid duration in seconds
             if let Ok(seconds) = value.parse::<u64>() {
                 return Some(Duration::from_secs(seconds));
             }
 
-            // Check if the value is a valid Date
-            if let Ok(http_date) = value.parse::<HttpDate>() {
-                let ts = SystemTime::from(http_date);
-                return ts
-                    .duration_since(SystemTime::now())
-                    .ok()
-                    .map(|d| d.max(Duration::ZERO));
-            }
+            // HTTP-date form; clamp past dates to zero
+            if let Ok(ts) = parse_http_date(value) {
+                let delay = match ts.duration_since(SystemTime::now()) {
+                    Ok(d) => d,
+                    Err(_) => Duration::ZERO,
+                };
+                return Some(delay);
+            }
 
             None
         })
 }

Add dependency (if not present):

# Cargo.toml
[dependencies]
httpdate = "1"

Also applies to: 247-268

🧹 Nitpick comments (2)
src/common/inference/service.rs (2)

311-316: Optional: propagate Retry-After on 503/5xx as well.

Servers often include Retry-After with 503 (Service Unavailable). Consider surfacing it to clients (either by adding retry_after to ServiceError or mapping 503+Retry-After to a variant that carries it) so Actix can emit the header.

Would you like me to draft a minimal change to plumb retry_after through ServiceError and set the header in src/actix/helpers.rs for 503?


247-268: Add targeted tests for Retry-After parsing (seconds and HTTP-date).

To prevent regressions, add unit tests covering both forms and past HTTP-date (clamped to 0). I can open a test PR if helpful.

Example tests to add in this file’s test module:

#[test]
fn test_parse_retry_after_seconds() {
    let mut headers = reqwest::header::HeaderMap::new();
    headers.insert(reqwest::header::RETRY_AFTER, "7".parse().unwrap());
    let d = InferenceService::parse_retry_after(&headers);
    assert_eq!(d, Some(Duration::from_secs(7)));
}

#[test]
fn test_parse_retry_after_http_date_future() {
    let mut headers = reqwest::header::HeaderMap::new();
    let ts = std::time::SystemTime::now() + Duration::from_secs(3);
    let v = httpdate::fmt_http_date(ts);
    headers.insert(reqwest::header::RETRY_AFTER, v.parse().unwrap());
    let d = InferenceService::parse_retry_after(&headers).unwrap();
    assert!(d.as_secs() <= 3 && d.as_secs() >= 1);
}

#[test]
fn test_parse_retry_after_http_date_past_clamped() {
    let mut headers = reqwest::header::HeaderMap::new();
    let ts = std::time::SystemTime::now() - Duration::from_secs(10);
    let v = httpdate::fmt_http_date(ts);
    headers.insert(reqwest::header::RETRY_AFTER, v.parse().unwrap());
    let d = InferenceService::parse_retry_after(&headers);
    assert_eq!(d, Some(Duration::ZERO));
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 67eff5b and 3c14f5c.

📒 Files selected for processing (1)
  • src/common/inference/service.rs (6 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • src/common/inference/service.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • src/common/inference/service.rs
🧬 Code graph analysis (1)
src/common/inference/service.rs (2)
src/actix/helpers.rs (1)
  • headers (207-239)
lib/storage/src/content_manager/errors.rs (17)
  • from (184-242)
  • from (246-248)
  • from (252-254)
  • from (258-264)
  • from (268-273)
  • from (277-282)
  • from (286-291)
  • from (295-300)
  • from (304-309)
  • from (313-318)
  • from (322-327)
  • from (331-336)
  • from (340-345)
  • from (349-354)
  • from (358-363)
  • from (367-372)
  • rate_limit_exceeded (103-111)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: lint
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: e2e-tests
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: integration-tests-consensus
🔇 Additional comments (3)
src/common/inference/service.rs (3)

192-217: Remote call flow and Retry-After extraction look solid.

Headers are read before body consumption; retry-after is correctly threaded into handling. No concerns.


270-274: Signature change to pass retry_after is correct.

Clean API; keeps all context needed for downstream mapping.


305-310: Good: 429 mapped to RateLimitExceeded with retry_after.

This aligns with the new StorageError constructor and allows Actix to emit Retry-After.

@dancixx dancixx merged commit 3c6f26e into dev Oct 13, 2025
15 checks passed
@dancixx dancixx deleted the feat/infer-svc-429 branch October 13, 2025 09:04
@qdrant qdrant deleted a comment from coderabbitai bot Oct 15, 2025
timvisee pushed a commit that referenced this pull request Nov 14, 2025
* feat: handle 429 properly

* feat: change to rate limit error

* feat: handle http_date

* chore: remove unsued imports

* chore: make it more readable
@timvisee timvisee mentioned this pull request Nov 14, 2025
@coderabbitai coderabbitai bot mentioned this pull request Nov 25, 2025
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants