Skip to content

fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client.#1445

Merged
LinZhihao-723 merged 10 commits into
y-scope:mainfrom
LinZhihao-723:s3-client-fix
Oct 20, 2025
Merged

fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client.#1445
LinZhihao-723 merged 10 commits into
y-scope:mainfrom
LinZhihao-723:s3-client-fix

Conversation

@LinZhihao-723

@LinZhihao-723 LinZhihao-723 commented Oct 19, 2025

Copy link
Copy Markdown
Member

Description

In the Log Ingestor prototyping, we later realized that creating a new client without a default configuration would cause connection loss very frequently when accessing the AWS-hosted S3 service. This is because that a client configured without the default config doesn't set HTTP behavior, timeouts, and credentials resolution.

This PR fixes this problem by introducing AWS SDK's "latest" behavior version as the default config.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Tested in the prototype repo to ensure clients with the default config are stable to use.
  • Ensure all workflows pass.

Summary by CodeRabbit

  • Refactor
    • Updated AWS SDK integration and dependencies to use current defaults.
    • S3 client creation converted to an asynchronous flow for better startup handling.
    • Improved S3 endpoint compatibility by enforcing path-style addressing.
    • Credentials handling updated for clearer provider identification and more reliable configuration.

@coderabbitai

coderabbitai Bot commented Oct 19, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Adds aws-config and updates the S3 client factory to be asynchronous: it loads AWS defaults with BehaviorVersion::latest(), builds the S3 config (forcing path-style addressing), and returns an S3 Client. Cargo dependency and SDK config usage are adjusted.

Changes

Cohort / File(s) Summary
AWS configuration and S3 client async migration
components/clp-rust-utils/Cargo.toml, components/clp-rust-utils/src/s3/client.rs
Adds aws-config = "1.8.8"; removes explicit S3 feature behavior-version-latest from aws-sdk-s3 dependency; converts create_new_client to async and uses aws_config::defaults(BehaviorVersion::latest()).load().await; builds S3 Config via Builder::from(&base_config) with force_path_style(true); credentials provider name changed to "clp-credential-provider"; preserves endpoint, region, and SecretString handling.

Sequence Diagram(s)

sequenceDiagram
  participant Caller
  participant S3Factory as create_new_client()
  participant AWSConfig as aws_config::defaults().load().await
  participant S3SDK as aws_sdk_s3::Client

  Caller->>S3Factory: call create_new_client(endpoint, region, creds)
  S3Factory->>AWSConfig: load base_config (BehaviorVersion::latest())
  AWSConfig-->>S3Factory: base_config
  S3Factory->>S3SDK: build Config (Builder::from(&base_config), force_path_style=true)
  S3SDK-->>S3Factory: Client
  S3Factory-->>Caller: return Client
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client" accurately reflects the primary objective of the changeset. The title is specific and clear, directly describing the main modification: updating the S3 client creation to use AWS SDK's default configuration with the latest behavior version. The title aligns well with the actual changes made in Cargo.toml and src/s3/client.rs, where aws-config dependency is introduced and create_new_client is updated to use aws_config::defaults(BehaviorVersion::latest()). This title would clearly communicate the core intent of the PR to someone reviewing commit history.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 317bad8 and 48fef4e.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • components/clp-rust-utils/Cargo.toml (1 hunks)
  • components/clp-rust-utils/src/s3/client.rs (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: package-image
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: rust-checks (ubuntu-22.04)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: rust-checks (macos-15)
🔇 Additional comments (6)
components/clp-rust-utils/src/s3/client.rs (5)

1-1: Import necessary for the new implementation.

The addition of BehaviorVersion is required for using BehaviorVersion::latest() in the config loading.


10-14: Excellent documentation of configuration choices.

The added notes clearly document the use of the latest behavior version and path-style addressing, which helps future maintainers understand the S3 client configuration.


19-19: Breaking change to async signature is necessary and acceptable.

The function signature change to async is required for loading the AWS default configuration. Based on previous review discussions, there are no current callers of this function, making this breaking change safe to introduce.


30-30: Improved provider name is more descriptive.

The change from "clp-user" to "clp-credential-provider" is a good improvement that makes the provider name more descriptive and follows better naming conventions.


33-34: Core implementation correctly addresses the PR objective.

The use of aws_config::defaults(BehaviorVersion::latest()).load().await followed by Builder::from(&base_config) ensures that the S3 client is configured with proper HTTP behavior, timeouts, and credential resolution defaults. This addresses the connection loss issue described in the PR objectives while still allowing explicit credentials and path-style addressing to be configured.

components/clp-rust-utils/Cargo.toml (1)

7-8: Consider updating aws-sdk-s3 to the latest stable version.

Official AWS documentation recommends aws-sdk-s3 version 1.108.0, but the Cargo.toml specifies 1.106.0. While the versions are compatible, updating to 1.108.0 would align with AWS's latest stable release. The change from behavior-version-latest feature to explicit BehaviorVersion::latest() in code is correct per the AWS documentation on behavior versions.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@LinZhihao-723 LinZhihao-723 marked this pull request as ready for review October 19, 2025 20:38
@LinZhihao-723 LinZhihao-723 requested a review from a team as a code owner October 19, 2025 20:38

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e1c717 and 317bad8.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • components/clp-rust-utils/Cargo.toml (1 hunks)
  • components/clp-rust-utils/src/s3/client.rs (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: package-image
  • GitHub Check: rust-checks (ubuntu-24.04)
  • GitHub Check: rust-checks (macos-15)
  • GitHub Check: rust-checks (ubuntu-22.04)
🔇 Additional comments (4)
components/clp-rust-utils/src/s3/client.rs (3)

1-1: LGTM!

The import of BehaviorVersion is necessary for configuring the AWS SDK with the latest behavior version.


10-16: LGTM!

The documentation clearly describes the client's configuration behavior, including the use of the latest AWS SDK behavior version and path-style addressing enforcement.


19-24: No existing callers found in the codebase to verify.

A comprehensive search across all Rust files reveals that create_new_client is defined and re-exported but has no actual invocation sites within the current codebase. The log-ingestor appears to be a stub project. Verify whether this async conversion is part of a broader set of changes being added in this PR or whether callers exist in external code (such as the mentioned prototype repository). If callers are added separately, ensure they use .await syntax appropriately.

components/clp-rust-utils/Cargo.toml (1)

7-7: Version 1.8.8 is current, secure, and compatible.

aws-config 1.8.8 is the latest version on crates.io, with no published CVE or advisory specifically naming this version as vulnerable. aws-config 1.8.8 is compatible with aws-sdk-s3 1.106.0, as both are part of the AWS Rust SDK v1 family. The dependency addition is appropriate.

Comment thread components/clp-rust-utils/src/s3/client.rs

@junhaoliao junhaoliao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the current pr title, it says using ... version ... as config -

if i understand correctly, the timeout / retry configuration are now set because we have constructed a default config using aws_config::defaults()? if so, would it be more clear to say:

fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client.

/// A newly created S3 client.
#[must_use]
pub fn create_new_client(
pub async fn create_new_client(

@junhaoliao junhaoliao Oct 20, 2025

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed as async which is an internal breaking change -

i confirmed that there's no usage of the function at the moment and therefore there's no caller to be updated, so the change looks correct

seeing Client::from_conf() might panic, i'm not sure if we should wrap things and let this return a Result though

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked into the panic section for from_config: https://docs.rs/aws-sdk-s3/latest/aws_sdk_s3/struct.Client.html#panics
It looks like these panics happen only when the given configuration is invalid. The failure should be deterministic as it doesn't involve any network operations or credential validations. Using the default config + latest behavior version should be safe. On the other hand, if this API is supposed to return a resolvable error, it should be designed to return a Result already like other AWS APIs. So I think it should be ok to not return a Result and just let it panic. Let me know if you still want this to return a Result.

@junhaoliao junhaoliao Oct 20, 2025

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, i think it's okay to keep the return type of create_new_client not Result.

On the other hand, if this API is supposed to return a resolvable error, it should be designed to return a Result already like other AWS APIs.

Just to confirm, if there are issues in the connection parameters like malformed credentials or region code, they eventually would be reported in Results by the other Rust SDK APIs that query the AWS APIs, right? Like we can create a Client with malformed connection parameters but there would be issues only when we list buckets with the client?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, those errors are not handled by the Client object creation.

Comment thread components/clp-rust-utils/src/s3/client.rs
Comment thread components/clp-rust-utils/src/s3/client.rs Outdated
Comment thread components/clp-rust-utils/Cargo.toml Outdated
@LinZhihao-723 LinZhihao-723 changed the title fix(clp-rust-utils): Configure S3 client using the latest AWS SDK behavior version as the default config. fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client. Oct 20, 2025
@LinZhihao-723 LinZhihao-723 merged commit 8d241c1 into y-scope:main Oct 20, 2025
22 checks passed
LinZhihao-723 added a commit to LinZhihao-723/clp that referenced this pull request Oct 22, 2025
* feat(webui): Add drawer to display guided query and errors. (y-scope#1421)

* docs: Add Slack community invite badge to home page README. (y-scope#1418)

* refactor(clp-package): Simplify StrEnum and Path serialization via Annotated serializers. (y-scope#1417)

Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* build(clp-package): Adopt uv + hatchling as the build and packaging backend for Python components (resolves y-scope#1396); Upgrade dependencies for Python components. (y-scope#1405)

* chore(docker): Add `--link` flags to COPY/ADD operations for improved build performance (fixes y-scope#1408). (y-scope#1411)

* fix(ci): Correctly update and restore cache of `lint:check-cpp-lint-static-full`'s generated files (fixes y-scope#1419): (y-scope#1430)

- Save cache entries using unique key per entry.
- Restore latest entries using key prefix.
- Avoid using outputs from optionally-run `restore` task.

* fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client. (y-scope#1445)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* refactor(clp-package): Remove unused `python-dotenv` dependency and related imports (fixes y-scope#1443). (y-scope#1444)

* fix(webui): Submit queries that failed ANTLR validation to Presto.  (y-scope#1450)

* feat(clp-s): Explicitly reject unstructured log inputs during compression. (y-scope#1434)

* feat(webui): Show query speed in native search status. (y-scope#1429)

* fix(job-orchestration): Make `tag_ids` a required `list[int]` for compatibility with Spider compressor. (y-scope#1453)

* feat(clp-mcp-server): Add log viewer links to query results for displaying in LLM output. (y-scope#1454)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* feat(ci): Add tasks for checking and updating Rust lock file (`Cargo.lock`); Add check to GH workflow. (y-scope#1448)

* feat(webui): Trigger submit action when pressing Enter on Monaco single line editor. (y-scope#1459)

* Add filters.

* Update cargo lock.

---------

Co-authored-by: davemarco <83603688+davemarco@users.noreply.github.com>
Co-authored-by: Abigail Matthews <abigail.v.matthews@gmail.com>
Co-authored-by: sitaowang1998 <sitaowang1998@outlook.com>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com>
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
Co-authored-by: Huangshi Tian <All-less@users.noreply.github.com>
LinZhihao-723 added a commit to LinZhihao-723/clp that referenced this pull request Oct 22, 2025
* feat(webui): Add drawer to display guided query and errors. (y-scope#1421)

* docs: Add Slack community invite badge to home page README. (y-scope#1418)

* refactor(clp-package): Simplify StrEnum and Path serialization via Annotated serializers. (y-scope#1417)

Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* build(clp-package): Adopt uv + hatchling as the build and packaging backend for Python components (resolves y-scope#1396); Upgrade dependencies for Python components. (y-scope#1405)

* chore(docker): Add `--link` flags to COPY/ADD operations for improved build performance (fixes y-scope#1408). (y-scope#1411)

* fix(ci): Correctly update and restore cache of `lint:check-cpp-lint-static-full`'s generated files (fixes y-scope#1419): (y-scope#1430)

- Save cache entries using unique key per entry.
- Restore latest entries using key prefix.
- Avoid using outputs from optionally-run `restore` task.

* fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client. (y-scope#1445)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* refactor(clp-package): Remove unused `python-dotenv` dependency and related imports (fixes y-scope#1443). (y-scope#1444)

* fix(webui): Submit queries that failed ANTLR validation to Presto.  (y-scope#1450)

* feat(clp-s): Explicitly reject unstructured log inputs during compression. (y-scope#1434)

* feat(webui): Show query speed in native search status. (y-scope#1429)

* fix(job-orchestration): Make `tag_ids` a required `list[int]` for compatibility with Spider compressor. (y-scope#1453)

* feat(clp-mcp-server): Add log viewer links to query results for displaying in LLM output. (y-scope#1454)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* feat(ci): Add tasks for checking and updating Rust lock file (`Cargo.lock`); Add check to GH workflow. (y-scope#1448)

* feat(webui): Trigger submit action when pressing Enter on Monaco single line editor. (y-scope#1459)

* Add filters.

* Update cargo lock.

---------

Co-authored-by: davemarco <83603688+davemarco@users.noreply.github.com>
Co-authored-by: Abigail Matthews <abigail.v.matthews@gmail.com>
Co-authored-by: sitaowang1998 <sitaowang1998@outlook.com>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com>
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
Co-authored-by: Huangshi Tian <All-less@users.noreply.github.com>
LinZhihao-723 added a commit to LinZhihao-723/clp that referenced this pull request Oct 22, 2025
* feat(webui): Add drawer to display guided query and errors. (y-scope#1421)

* docs: Add Slack community invite badge to home page README. (y-scope#1418)

* refactor(clp-package): Simplify StrEnum and Path serialization via Annotated serializers. (y-scope#1417)

Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* build(clp-package): Adopt uv + hatchling as the build and packaging backend for Python components (resolves y-scope#1396); Upgrade dependencies for Python components. (y-scope#1405)

* chore(docker): Add `--link` flags to COPY/ADD operations for improved build performance (fixes y-scope#1408). (y-scope#1411)

* fix(ci): Correctly update and restore cache of `lint:check-cpp-lint-static-full`'s generated files (fixes y-scope#1419): (y-scope#1430)

- Save cache entries using unique key per entry.
- Restore latest entries using key prefix.
- Avoid using outputs from optionally-run `restore` task.

* fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client. (y-scope#1445)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* refactor(clp-package): Remove unused `python-dotenv` dependency and related imports (fixes y-scope#1443). (y-scope#1444)

* fix(webui): Submit queries that failed ANTLR validation to Presto.  (y-scope#1450)

* feat(clp-s): Explicitly reject unstructured log inputs during compression. (y-scope#1434)

* feat(webui): Show query speed in native search status. (y-scope#1429)

* fix(job-orchestration): Make `tag_ids` a required `list[int]` for compatibility with Spider compressor. (y-scope#1453)

* feat(clp-mcp-server): Add log viewer links to query results for displaying in LLM output. (y-scope#1454)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* feat(ci): Add tasks for checking and updating Rust lock file (`Cargo.lock`); Add check to GH workflow. (y-scope#1448)

* feat(webui): Trigger submit action when pressing Enter on Monaco single line editor. (y-scope#1459)

* Add filters.

* Update cargo lock.

* Stupid fix

---------

Co-authored-by: davemarco <83603688+davemarco@users.noreply.github.com>
Co-authored-by: Abigail Matthews <abigail.v.matthews@gmail.com>
Co-authored-by: sitaowang1998 <sitaowang1998@outlook.com>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com>
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
Co-authored-by: Huangshi Tian <All-less@users.noreply.github.com>
junhaoliao added a commit to junhaoliao/clp that referenced this pull request May 17, 2026
…havior version for S3 client. (y-scope#1445)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants