feat(api-server): Document APIs using OpenAPI and serve on /openapi.json endpoint.#1663
Conversation
WalkthroughCentralizes Axum route definitions into a new Changes
Sequence Diagram(s)sequenceDiagram
participant Browser as API Client
participant Router as Axum Router
participant OpenAPI as OpenAPI Layer
participant Handler as Route Handler
participant CLP as CLP Client
Browser->>Router: GET /openapi.json
Router->>OpenAPI: Request ApiDoc JSON
OpenAPI-->>Browser: Return OpenAPI JSON
Browser->>Router: POST /query (QueryConfig)
Router->>Handler: invoke query handler
Handler->>CLP: submit query with config
CLP-->>Handler: return QueryResultsUri
Handler-->>Browser: 200 OK with results URI
Browser->>Router: GET /query_results/{id}
Router->>Handler: invoke streaming handler
Handler->>CLP: request stream
loop per result line
CLP-->>Handler: single-line JSON
Handler-->>Browser: SSE event (JSON)
end
Note over Router: CORS applied to all routes
sequenceDiagram
participant App as Application (bin)
participant Factory as routes::from_client()
participant Router as Axum Router
participant State as Router State
App->>Factory: from_client(client)
Factory->>Router: build OpenApiRouter
Factory->>Router: register handlers (/health, /query, /query_results)
Factory->>Router: add /openapi.json
Factory->>Router: apply CORS middleware
Factory->>State: attach CLP client as state
Factory-->>App: return configured Router (Result)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
components/api-server/src/bin/api_server.rs (2)
193-200: **Multi-line messages will cause500 Internal Server Error.**Log messages containing newlines will cause a500 Internal Server Error. "When the EventSource receives multiple consecutive lines that begin with data:, it concatenates them, inserting a newline character between each one." The SSE specification supports multi-line data by sending multipledata:lines.Consider handling multi-line messages properly instead of rejecting them:
let event_stream = results_stream.map(|res| { let message = res?; let trimmed_message = message.trim(); - if trimmed_message.lines().count() != 1 { - return Err(HandlerError::InternalServer); - } - Ok(Event::default().data(trimmed_message)) + // Axum's Event::data handles multi-line content by sending multiple data: lines + Ok(Event::default().data(trimmed_message)) });Alternatively, if single-line is a deliberate constraint, return a more descriptive error or escape newlines within JSON.
217-221: Error context is discarded.The blanket
From<T>implementation converts all errors toInternalServerwithout preserving the original error. This makes debugging production issues difficult. Consider logging the original error before conversion.impl<T: IntoHandlerError> From<T> for HandlerError { - fn from(_: T) -> Self { + fn from(err: T) -> Self { + tracing::error!("Handler error: {:?}", std::any::type_name::<T>()); Self::InternalServer } }For more detailed logging, you could use a trait bound like
std::fmt::Debug:impl<T: IntoHandlerError + std::fmt::Debug> From<T> for HandlerError { fn from(err: T) -> Self { tracing::error!("Handler error: {:?}", err); Self::InternalServer } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
components/api-server/Cargo.toml(1 hunks)components/api-server/README.md(1 hunks)components/api-server/src/bin/api_server.rs(3 hunks)components/api-server/src/client.rs(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.
Applied to files:
components/api-server/src/client.rscomponents/api-server/src/bin/api_server.rs
🪛 markdownlint-cli2 (0.18.1)
components/api-server/README.md
12-12: Blank line inside blockquote
(MD028, no-blanks-blockquote)
15-15: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
56-56: Blank line inside blockquote
(MD028, no-blanks-blockquote)
91-91: Blank line inside blockquote
(MD028, no-blanks-blockquote)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: package-image
- GitHub Check: rust-checks
- GitHub Check: lint-check (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (3)
components/api-server/src/client.rs (1)
16-21: LGTM!The
ToSchemaderive is correctly added to enable OpenAPI schema generation forQueryConfig, which is used as the request body for the/queryendpoint.components/api-server/README.md (1)
1-3: Documentation provides good coverage of the API.The README effectively documents the endpoints and schemas. For the Swagger UI link, consider noting that users need to enter the full URL including protocol.
components/api-server/src/bin/api_server.rs (1)
103-109: OpenAPI router setup looks correct.The setup registers
/health,/query, and/query_resultsendpoints with OpenAPI metadata. The root path/also serves the health check, which is a common pattern for load balancer probes.
| tower-http = { version = "0.6.7", features = ["cors"] } | ||
| tracing = "0.1.41" | ||
| tracing-appender = "0.2.3" | ||
| tracing-subscriber = { version = "0.3.20", features = ["json", "env-filter", "fmt", "std"] } | ||
| utoipa = { version = "5.4.0", features = ["axum_extras"] } | ||
| utoipa-axum = "0.2.0" |
There was a problem hiding this comment.
🧩 Analysis chain
utoipa-axum crate latest version 2024 2025
<web_search_result>
As of Nov 25, 2025 the latest published version of the utoipa-axum crate is 0.2.0. [1][2]
Sources:
- docs.rs crate page for utoipa-axum (shows v0.2.0). [1]
- deps.rs / crate metadata for utoipa-axum v0.2.0. [2]
</web_search_result>
utoipa crate version 5.4.0 latest 2024 2025
<web_search_result>
Yes — utoipa 5.4.0 was released June 16, 2025. Prior releases: 5.3.0 on Dec 19, 2024 (and 5.3.1 on Jan 5, 2025). Sources: [1][2][3].
Sources
- Release list (lib.rs) — shows 5.4.0 (Jun 16, 2025) and 5.3.0 (Dec 19, 2024). [1]
- docs.rs page for utoipa 5.4.0. [2]
- utoipa-gen changelog showing 5.4.0 entry (Jun 16, 2025). [3]
</web_search_result>
tower-http crate version 0.6.7 latest 2024 2025
<web_search_result>
As of today (2025-11-25) the latest published tower-http release is 0.6.6 (released Jun 3, 2025). I couldn't find a 0.6.7 release. [1][2]
Sources:
- docs.rs/tower-http (shows 0.6.6 builds). [1]
- lib.rs (crate versions listing, latest 0.6.6). [2]
</web_search_result>
tower-http version 0.6.7 does not exist — the latest published version is 0.6.6 (released June 3, 2025).
This will cause a build failure. Update line 32 to use "0.6.6" instead.
The other versions are correct:
- utoipa 5.4.0 is the latest (released June 16, 2025)
- utoipa-axum 0.2.0 is the latest (released November 25, 2025)
🤖 Prompt for AI Agents
In components/api-server/Cargo.toml around lines 32 to 37, the dependency entry
for tower-http specifies version "0.6.7" which does not exist and will break the
build; change that version to "0.6.6" (the latest published 0.6.x), leaving the
other dependency lines unchanged so the file uses tower-http = { version =
"0.6.6", features = ["cors"] } instead of "0.6.7".
| ``` | ||
| "API server is running" | ||
| ``` |
There was a problem hiding this comment.
Add language specifier to fenced code block.
The code block is missing a language specifier. Since this is a plain text response example, use text or plain to satisfy linting and improve rendering consistency.
-```
+```text
"API server is running"
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.18.1)</summary>
15-15: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
In components/api-server/README.md around lines 15 to 17, the fenced code block
lacks a language specifier; update the block to include a plain text language
tag (e.g., "text" or "plain") so it reads as a fenced block with a language
specifier — replace the existing wrapper withtext (or ```plain) and keep
the inner content unchanged.
</details>
<!-- This is an auto-generated comment by CodeRabbit -->
| "/openapi.json", | ||
| get(|| async { (StatusCode::OK, api_json) }), | ||
| ) | ||
| .layer(CorsLayer::new().allow_origin(Any).allow_headers(Any)); |
There was a problem hiding this comment.
CORS configuration is very permissive.
allow_origin(Any).allow_headers(Any) allows requests from any origin with any headers. While this enables Swagger UI and similar tools to work, it also permits cross-origin requests from any website. Consider:
- Adding
.allow_methods([Method::GET, Method::POST])to explicitly allow only the required methods. - Documenting that this permissive configuration is intentional for development/tooling purposes.
- For production deployments, consider making the allowed origins configurable.
- .layer(CorsLayer::new().allow_origin(Any).allow_headers(Any));
+ .layer(
+ CorsLayer::new()
+ .allow_origin(Any)
+ .allow_headers(Any)
+ .allow_methods([axum::http::Method::GET, axum::http::Method::POST]),
+ );You'll need to add the import:
use axum::http::Method;🤖 Prompt for AI Agents
In components/api-server/src/bin/api_server.rs around line 116, the CORS layer
is overly permissive using allow_origin(Any).allow_headers(Any); update it to
restrict methods by adding .allow_methods([Method::GET, Method::POST]) (or
whichever methods are required), make allowed origins configurable via
environment/config (replace Any with a configurable list in production), and add
a comment documenting that the current permissive settings are intentional for
development/Swagger only; also add the import use axum::http::Method; at the top
of the file.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
components/api-server/src/bin/api_server.rs (2)
175-203: Consider logging malformed SSE messages before returningInternalServerIn
query_results, whentrimmed_message.lines().count() != 1you immediately returnHandlerError::InternalServer(Line 199). That protects clients from malformed/multi-line events, but without logging it will be hard to diagnose why the stream failed.Consider adding a
tracing::error!(includingsearch_job_idand perhaps the raw message length) before returning the error so you have observability when upstream produces bad data.
212-222: SimplifyIntoHandlerError/Frompattern for clarityUsing an empty
IntoHandlerErrormarker trait plus a blanketimpl<T: IntoHandlerError> From<T> for HandlerErrorworks, but it hides exactly which error types are converted and can make future additions less explicit.You could simplify this by dropping the marker trait and writing concrete impls instead:
-enum HandlerError { - #[error("Internal server error")] - InternalServer, -} - -trait IntoHandlerError {} - -impl IntoHandlerError for axum::Error {} -impl IntoHandlerError for clp_client::ClientError {} - -impl<T: IntoHandlerError> From<T> for HandlerError { - fn from(_: T) -> Self { - Self::InternalServer - } -} +enum HandlerError { + #[error("Internal server error")] + InternalServer, +} + +impl From<axum::Error> for HandlerError { + fn from(_: axum::Error) -> Self { + HandlerError::InternalServer + } +} + +impl From<clp_client::ClientError> for HandlerError { + fn from(_: clp_client::ClientError) -> Self { + HandlerError::InternalServer + } +}This keeps behaviour the same while making the mapping between concrete errors and
HandlerErroreasier to follow.
♻️ Duplicate comments (1)
components/api-server/src/bin/api_server.rs (1)
104-121: Tighten CORS and serve/openapi.jsonwith proper JSON content-typeTwo points here:
/openapi.jsoncurrently returns a rawString, which Axum will serve astext/plain. For OpenAPI consumers it’s better to return a JSON body (application/json) viaJson.CorsLayer::new().allow_origin(Any)enables all origins for every route, which is quite permissive for production. It’s fine for local Swagger usage, but consider tightening it (at least to explicit methods, and ideally to configurable origins).A possible adjustment:
+use axum::http::Method; @@ - let (router, api) = OpenApiRouter::with_openapi(ApiDoc::openapi()) + let (router, api) = OpenApiRouter::with_openapi(ApiDoc::openapi()) .route("/", get(health)) .routes(routes!(health)) .routes(routes!(query)) .routes(routes!(query_results)) .with_state(client) .split_for_parts(); - let api_json = api.to_json()?; - let router = router - .route( - "/openapi.json", - get(|| async { (StatusCode::OK, api_json) }), - ) - .layer(CorsLayer::new().allow_origin(Any)); + let api = api.clone(); + let router = router + .route( + "/openapi.json", + // Serve OpenAPI JSON with the correct content type. + get(move || { + let api = api.clone(); + async move { Json(api) } + }), + ) + .layer( + CorsLayer::new() + .allow_origin(Any) // consider making this configurable for production + .allow_methods([Method::GET, Method::POST]), + );This keeps current behaviour for tooling while improving correctness and reducing some CORS surface area; for production you may still want to drive the allowed origins from config.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/api-server/src/bin/api_server.rs(3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.
Applied to files:
components/api-server/src/bin/api_server.rs
🔇 Additional comments (2)
components/api-server/src/bin/api_server.rs (2)
11-21: New routing/OpenAPI imports look correctThe added imports for
routing::get,CorsLayer,Any,OpenApi, andOpenApiRouter/routesline up with their usage below; nothing problematic here.
143-173: Based on my investigation, I cannot access the repository to verify the actual imports incomponents/api-server/src/bin/api_server.rsdue to cloning failures in the sandbox environment. However, the web search confirms that the review comment's concern is technically valid: thejson!macro fromserde_jsonmust be explicitly imported or qualified in scope to use it within utoipa attribute macros.The documentation shows that
use serde_json::json;is the standard required import when usingjson!(...)in utoipa path annotations.
Verify
json!macro visibility inqueryOpenAPI annotationThe annotation at lines 157–162 uses
example = json!({"query_results_uri":"/query_results/1"}), which requiresserde_json::jsonto be in scope. Confirm that one of the following exists in the file:
use serde_json::json;at module level, or#[macro_use] extern crate serde_json;at crate level, or- Qualification as
serde_json::json!(...)If none of these are present, add:
+use serde_json::json;
| #[derive(utoipa::OpenApi)] | ||
| #[openapi(info( | ||
| title = "API Server", | ||
| description = "API Server for CLP", | ||
| contact(name = "YScope") | ||
| ))] | ||
| struct ApiDoc; |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Enrich OpenAPI metadata (version, servers, etc.)
ApiDoc currently sets only title/description/contact. For external consumers it’s often useful to also specify version and servers (and optionally license, terms_of_service) in the #[openapi(info(...))] block so clients can better understand the API surface and deployment environment. This is optional but improves docs quality.
🤖 Prompt for AI Agents
In components/api-server/src/bin/api_server.rs around lines 126 to 132, the
#[openapi(info(...))] block for ApiDoc only includes title/description/contact;
update that info block to also provide a version string and a servers array (and
optionally license and terms_of_service) so the generated OpenAPI metadata
includes API version and deployment server URLs—add entries like version =
"x.y.z" and servers([OpenApiServer::new("https://api.example.com")]) or
equivalent attributes supported by utoipa, keeping the existing fields intact.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
components/api-server/src/bin/api_server.rs (3)
199-206: Log errors before returning HandlerError for better observability.When a multi-line message is encountered (Line 203), the code returns
HandlerError::InternalServerwithout logging the problematic message or any context. This makes debugging production issues difficult.Apply this diff to add logging:
let event_stream = results_stream.map(|res| { let message = res?; let trimmed_message = message.trim(); if trimmed_message.lines().count() != 1 { + tracing::error!("Received multi-line message in SSE stream: {}", trimmed_message); return Err(HandlerError::InternalServer); } Ok(Event::default().data(trimmed_message)) });
223-227: Generic error conversion loses all diagnostic context.The
From<T>implementation (lines 223-227) converts all errors intoHandlerError::InternalServerwithout preserving any error details. This severely impacts observability and makes debugging production issues nearly impossible.Apply this diff to add logging before error conversion:
impl<T: IntoHandlerError> From<T> for HandlerError { - fn from(_: T) -> Self { + fn from(error: T) -> Self { + tracing::error!("Handler error: {:?}", error); Self::InternalServer } }Note: This requires
T: IntoHandlerError + std::fmt::Debug. You may need to addDebugbounds to the trait implementations or use separateFromimplementations for each error type.
217-221: Consider removing the marker trait for simpler error handling.The
IntoHandlerErrortrait serves only as a marker with no methods. While it provides type safety for error conversions, it adds complexity. Consider using individualFromimplementations instead:impl From<axum::Error> for HandlerError { fn from(error: axum::Error) -> Self { tracing::error!("Axum error: {:?}", error); Self::InternalServer } } impl From<clp_client::ClientError> for HandlerError { fn from(error: clp_client::ClientError) -> Self { tracing::error!("Client error: {:?}", error); Self::InternalServer } }This approach is more explicit and makes it easier to add error-specific handling or logging in the future.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/api-server/src/bin/api_server.rs(3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.
Applied to files:
components/api-server/src/bin/api_server.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: package-image
- GitHub Check: lint-check (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: rust-checks
🔇 Additional comments (1)
components/api-server/src/bin/api_server.rs (1)
1-21: Imports look good.The new imports for CORS, OpenAPI, and utoipa-axum are appropriate for the functionality being added.
junhaoliao
left a comment
There was a problem hiding this comment.
this is great
instead of serving the openapi stats on the API server (i.e., the API server needs to be up and running so developers can read the specs), can we make it accessible in the docs site?
We can:
- create a binary to call the to_yaml() and output to a file during docs build? We can refer to #1654 for a similar setup.
- Then we serve the specs with the swagger UI:
- We can either use https://github.com/swagger-api/swagger-codegen to generate a static site during docs build and place that somewhere in the dist assets
- OR use
sphinxcontrib-openapiwhich should better integrate with the docs site
Then any developers / users who visit our docs site can read the API specs effortlessly
Let me know what you think
There was a problem hiding this comment.
is this / can this be auto-generated?
LinZhihao-723
left a comment
There was a problem hiding this comment.
Partially reviewed. Will get back to u tmr.
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
README.md.| "/openapi.json", | ||
| get(|| async { (StatusCode::OK, api_json) }), | ||
| ) | ||
| .layer(CorsLayer::new().allow_origin(Any)); |
There was a problem hiding this comment.
Just curious: Why do we need this?
There was a problem hiding this comment.
Without this, the user cannot put openapi.json url to swagger ui in the browser, because the json url is from a different host.
| .context(format!("Cannot listen to {addr}"))?; | ||
|
|
||
| let client = clp_client::Client::connect(&config, &credentials) | ||
| let client = api_server::client::Client::connect(&config, &credentials) |
There was a problem hiding this comment.
since we name the library crate to api_server, shall we rename client module to clp_client for clarity?
There was a problem hiding this comment.
hmm I still feel we shouldn't mention clp in any place of the code. Let's rename this after we have another client
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
components/api-server/Cargo.toml (1)
32-32: tower-http version 0.6.7 does not exist.This was already flagged in a previous review. The latest published version is 0.6.6.
components/api-server/src/routes.rs (1)
156-172: Factory function placement and CORS configuration.Per past review feedback, consider moving
from_clientto the top of the file since it's the only public symbol.The CORS configuration with
allow_origin(Any)was flagged in past reviews as permissive. While necessary for Swagger UI access from different origins, consider:
- Adding
.allow_methods([Method::GET, Method::POST])to restrict HTTP methods- Documenting that this is intentional for tooling purposes
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (5)
components/api-server/Cargo.toml(2 hunks)components/api-server/src/bin/api_server.rs(1 hunks)components/api-server/src/client.rs(1 hunks)components/api-server/src/lib.rs(1 hunks)components/api-server/src/routes.rs(1 hunks)
🧰 Additional context used
🧠 Learnings (9)
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.
Applied to files:
components/api-server/Cargo.tomlcomponents/api-server/src/routes.rscomponents/api-server/src/bin/api_server.rs
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).
Applied to files:
components/api-server/Cargo.toml
📚 Learning: 2025-11-10T05:19:56.600Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1575
File: components/clp-py-utils/clp_py_utils/clp_config.py:602-607
Timestamp: 2025-11-10T05:19:56.600Z
Learning: In the y-scope/clp repository, the `ApiServer` class in `components/clp-py-utils/clp_py_utils/clp_config.py` does not need a `transform_for_container()` method because no other containerized service depends on the API server - it's only accessed from the host, so no docker-network communication is expected.
Applied to files:
components/api-server/Cargo.tomlcomponents/api-server/src/bin/api_server.rs
📚 Learning: 2025-08-25T00:45:05.464Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1242
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:39-41
Timestamp: 2025-08-25T00:45:05.464Z
Learning: Task v3.44.1 causes issues in the clp-ffi-js project (issue #110), so CLP should avoid upgrading to v3.44.1 and stick with v3.44.0 for now.
Applied to files:
components/api-server/Cargo.toml
📚 Learning: 2025-10-20T22:20:45.733Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 1448
File: components/clp-rust-utils/Cargo.toml:4-4
Timestamp: 2025-10-20T22:20:45.733Z
Learning: Rust edition 2024 is a valid edition value in Cargo.toml files. It was released with Rust 1.85 on February 20, 2025.
Applied to files:
components/api-server/Cargo.toml
📚 Learning: 2025-07-05T03:38:16.779Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1069
File: docs/src/user-guide/quick-start/clp-json.md:135-152
Timestamp: 2025-07-05T03:38:16.779Z
Learning: The CLP project team has decided to refrain from using include directives in their documentation at present, preferring to maintain duplicated content rather than using shared includes or partials for de-duplication.
Applied to files:
components/api-server/src/bin/api_server.rs
📚 Learning: 2025-09-22T15:46:34.873Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1311
File: integration-tests/tests/test_identity_transformation.py:93-97
Timestamp: 2025-09-22T15:46:34.873Z
Learning: In the CLP project, multi-line formatting is preferred over single-line ternary operators when the single line would exceed the project's line length limits, prioritizing code readability and adherence to coding standards.
Applied to files:
components/api-server/src/bin/api_server.rs
📚 Learning: 2025-09-08T21:41:32.059Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 1291
File: docs/src/user-docs/quick-start/text-v-json.md:84-88
Timestamp: 2025-09-08T21:41:32.059Z
Learning: In the y-scope/clp project documentation, maintain informal contractions like "can't" and "don't" rather than formal alternatives like "cannot" and "do not" - the maintainer prefers a conversational tone.
Applied to files:
components/api-server/src/bin/api_server.rs
📚 Learning: 2025-09-03T07:29:35.223Z
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1107
File: components/core/src/clp_s/CMakeLists.txt:106-116
Timestamp: 2025-09-03T07:29:35.223Z
Learning: In the CLP project, do not suggest adding CONFIG or MODULE specifications to find_package calls. The maintainers prefer using find_package without these mode specifications.
Applied to files:
components/api-server/src/bin/api_server.rs
🧬 Code graph analysis (2)
components/api-server/src/routes.rs (1)
components/api-server/src/client.rs (1)
from(54-65)
components/api-server/src/bin/api_server.rs (2)
components/api-server/src/client.rs (1)
connect(90-104)components/api-server/src/routes.rs (1)
from_client(156-172)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: package-image
- GitHub Check: rust-checks
- GitHub Check: lint-check (macos-15)
🔇 Additional comments (11)
components/api-server/Cargo.toml (2)
7-7: Library rename looks good.Renaming the library from
clp_clienttoapi_serveraligns with the module restructuring and better reflects the crate's purpose.
36-37: OpenAPI dependencies look correct.The utoipa and utoipa-axum versions are appropriate for the OpenAPI integration.
components/api-server/src/lib.rs (1)
1-3: Module restructuring is clean.Making
clientpublic directly (instead of re-exporting withpub use client::*) and adding the newroutesmodule improves the API surface organization. Note that removing the wildcard re-export is a breaking change for any external consumers that relied onapi_server::Clientorapi_server::QueryConfigdirectly—they now need to useapi_server::client::Clientetc.components/api-server/src/client.rs (2)
16-16: ToSchema import supports OpenAPI integration.Adding the
ToSchemaderive enables automatic OpenAPI schema generation forQueryConfig.
21-51: QueryConfig documentation and structure look good.The field-level documentation is comprehensive, and the structure addresses the feedback from previous reviews. The
#[serde(deny_unknown_fields)]attribute provides good input validation.components/api-server/src/bin/api_server.rs (1)
84-88: Clean separation of concerns.The binary now properly delegates routing to the
routesmodule, significantly simplifying this file. This addresses the past review feedback about the file being "too overloaded."components/api-server/src/routes.rs (5)
21-32: OpenAPI metadata is minimal but functional.The
ApiDocstruct provides basic metadata. As noted in past reviews, consider enriching withversionandserversfor better documentation, but this is optional.
34-41: Health endpoint is straightforward.Simple and effective health check implementation.
43-77: Query endpoint looks good.The OpenAPI annotations are comprehensive with examples, and the handler properly delegates to the client.
108-116: Empty message edge case may cause unexpected error.If
message.trim()results in an empty string,lines().count()returns 0, which fails the!= 1check and returnsInternalServererror. This might be the intended behaviour for malformed data, but consider whether empty results should be silently skipped instead:let event_stream = results_stream.map(|res| { let message = res?; let trimmed_message = message.trim(); + if trimmed_message.is_empty() { + return Ok(Event::default().data("")); + } if trimmed_message.lines().count() != 1 { return Err(HandlerError::InternalServer); } Ok(Event::default().data(trimmed_message)) });Verify if empty messages are expected from the results stream and how they should be handled.
157-159: Health endpoint is mounted at both/and/health.Line 158 mounts
healthat/, and line 159 usesroutes!(health)which registers it at/healthper the#[utoipa::path]annotation. This is intentional but worth documenting—clients can use either path.
LinZhihao-723
left a comment
There was a problem hiding this comment.
For the PR title, how about:
feat(api-server): Document APIs using OpenAPI and serve on `/openapi.json` endpoint.
/openapi.json endpoint.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
components/api-server/src/routes.rs (1)
21-48: Move public function to the top of the file.As noted in previous review, since
from_clientis the only public symbol in this module, it should be positioned at the top for better discoverability and consistency with Rust conventions.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/api-server/src/routes.rs(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
components/api-server/src/routes.rs (1)
components/api-server/src/client.rs (1)
from(54-65)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: package-image
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: lint-check (macos-15)
- GitHub Check: rust-checks
| if trimmed_message.lines().count() != 1 { | ||
| return Err(HandlerError::InternalServer); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, find and examine the routes.rs file
find . -name "routes.rs" -type f | head -20Repository: y-scope/clp
Length of output: 166
🏁 Script executed:
# Analyzing the optimization suggestion without needing repository files
cat << 'EOF'
# Verification of the optimization suggestion in the review comment
# Original code:
# if trimmed_message.lines().count() != 1 {
# return Err(HandlerError::InternalServer);
# }
# Suggested code:
# if trimmed_message.is_empty() || trimmed_message.contains('\n') {
# return Err(HandlerError::InternalServer);
# }
# Analysis:
# 1. lines().count() vs contains('\n'):
# - lines().count() iterates entire string and counts line breaks + 1
# - contains('\n') stops at first occurrence
# - For single-line messages: lines().count() == 1, contains('\n') == false (both pass)
# - For multi-line messages: lines().count() > 1, contains('\n') == true (both reject)
# - For empty string: lines().count() == 0, contains('\n') == false
#
# 2. Issue with suggested optimization:
# - Original: trimmed_message.lines().count() != 1 (rejects if NOT exactly 1 line)
# - Suggested: trimmed_message.is_empty() || trimmed_message.contains('\n')
# - Empty string case:
# * Original: lines().count() on "" returns 0, so 0 != 1 is true -> rejects
# * Suggested: is_empty() is true -> rejects
# * Both reject, so equivalent here
# - Single line case:
# * Original: "hello".lines().count() == 1, so 1 != 1 is false -> accepts
# * Suggested: is_empty() is false, contains('\n') is false, so false || false = false -> accepts
# * Both accept, so equivalent here
# - Multi-line case:
# * Original: "a\nb".lines().count() == 2, so 2 != 1 is true -> rejects
# * Suggested: is_empty() is false, contains('\n') is true, so false || true = true -> rejects
# * Both reject, so equivalent here
#
# CONCLUSION: The optimization is semantically correct!
# Performance claim is also valid:
# - lines().count() scans entire string
# - contains('\n') stops at first newline
# - For large single-line messages, the optimization is faster
echo "Technical analysis complete"
EOFRepository: y-scope/clp
Length of output: 166
🌐 Web query:
Rust lines().count() performance vs contains() for checking single line
💡 Result:
Short answer: don't use lines().count() to check for a single line — it's slower and allocates more. Read the file (or stream) and search directly.
Why:
- std::io::Lines yields owned Strings (allocations) per line when reading via BufRead::lines(), which adds significant overhead compared with scanning bytes. (reddit.com)
- A single-pass substring/byte search (e.g., str::contains or a memchr/memmem-based search) is linear and much faster in practice than splitting into owned lines. Benchmarks and high‑performance tools use byte-oriented scans (memchr/memmem/optimized libraries) for line/substring work. (github.com)
Practical options (fast → faster):
- If file fits memory: read_to_string and use either:
- s.lines().any(|l| l == target) — avoids allocating new Strings because std::str::lines yields &str slices; or
- s.contains(&format!("{}\n", target)) or a memmem search for exact-line-with-newline (fast but be careful about file end without newline). (reddit.com)
- If streaming large files: read large byte buffers and search with memchr/memmem (or use an optimized crate/tool like ripwc/higher‑performance line counters) to avoid per-line allocations. (reddit.com)
If you want, tell me your exact code or file-size constraints and I’ll give a concrete snippet (BufReader chunked scan, memchr/memmem, or read_to_string + any/contains).
Citations:
- 1: https://www.reddit.com/r/rust/comments/wyk1l0?utm_source=openai
- 2: https://github.com/ashvardanian/memchr_vs_stringzilla?utm_source=openai
- 3: https://www.reddit.com/r/rust/comments/wyk1l0?utm_source=openai
- 4: https://www.reddit.com/r/rust/comments/1kpt3yc?utm_source=openai
Based on my analysis, I can now provide the rewritten review comment:
Optimize single-line validation using contains() instead of lines().count().
The current check uses lines().count() which scans the entire message and allocates owned Strings—unnecessary overhead for this validation. Use contains('\n') instead, which stops at the first newline and avoids allocation.
The suggested optimization is semantically equivalent: it rejects empty strings and multi-line messages the same way while being significantly faster.
- if trimmed_message.lines().count() != 1 {
- return Err(HandlerError::InternalServer);
- }
+ if trimmed_message.is_empty() || trimmed_message.contains('\n') {
+ return Err(HandlerError::InternalServer);
+ }Verify that the backend cannot legitimately return multi-line results (e.g., pretty-printed JSON). If it can, this validation logic may need adjustment or the constraint should be enforced at the source.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if trimmed_message.lines().count() != 1 { | |
| return Err(HandlerError::InternalServer); | |
| } | |
| if trimmed_message.is_empty() || trimmed_message.contains('\n') { | |
| return Err(HandlerError::InternalServer); | |
| } |
🤖 Prompt for AI Agents
In components/api-server/src/routes.rs around lines 140 to 142, the current
single-line validation uses trimmed_message.lines().count(), which iterates and
allocates; replace it with a simpler check that rejects empty messages or any
message containing a newline: use trimmed_message.is_empty() ||
trimmed_message.contains('\n') to return Err(HandlerError::InternalServer). Also
verify whether the backend can legitimately return multi-line output (e.g.,
pretty JSON) and adjust the validation or enforce the constraint upstream if
needed.
| /// Generic errors for request handlers. | ||
| #[derive(Error, Debug)] | ||
| enum HandlerError { | ||
| #[error("Internal server error")] | ||
| InternalServer, | ||
| } | ||
|
|
||
| trait IntoHandlerError {} | ||
|
|
||
| impl IntoHandlerError for axum::Error {} | ||
|
|
||
| impl IntoHandlerError for ClientError {} | ||
|
|
||
| impl<T: IntoHandlerError> From<T> for HandlerError { | ||
| fn from(_: T) -> Self { | ||
| Self::InternalServer | ||
| } | ||
| } | ||
|
|
||
| /// Converts [`HandlerError`] into an HTTP response. | ||
| impl IntoResponse for HandlerError { | ||
| fn into_response(self) -> axum::response::Response { | ||
| StatusCode::INTERNAL_SERVER_ERROR.into_response() | ||
| } | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider preserving error context for better observability.
All errors are currently mapped to a generic InternalServer (500) response, which loses valuable context for debugging and monitoring. While the marker trait pattern works (as noted in previous reviews), consider whether some errors should be differentiated (e.g., 4xx client errors vs 5xx server errors) or at least logged with their original context before conversion.
For example, you could log errors before converting:
impl<T: IntoHandlerError + std::fmt::Display> From<T> for HandlerError {
fn from(err: T) -> Self {
tracing::error!("Handler error: {}", err);
Self::InternalServer
}
}Or add variants to distinguish error types:
enum HandlerError {
#[error("Internal server error")]
InternalServer,
#[error("Client error: {0}")]
Client(String),
}
Description
This PR adds an endpoint to server open api docs.
Move routes to lib.
Checklist
breaking change.
Validation performed
Put http://localhost:3001/openapi.json on https://petstore.swagger.io/. The endpoints are shown correctly.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.