Skip to content

feat(log-ingestor): Add HTTP server skeleton:#1715

Merged
hoophalab merged 5 commits into
y-scope:mainfrom
hoophalab:skeleton
Dec 2, 2025
Merged

feat(log-ingestor): Add HTTP server skeleton:#1715
hoophalab merged 5 commits into
y-scope:mainfrom
hoophalab:skeleton

Conversation

@hoophalab

@hoophalab hoophalab commented Dec 2, 2025

Copy link
Copy Markdown
Contributor
  • Load CLP package config and credentials.
  • Configure server-wise logger.
  • Add health request handling for health check service.

Description

This PR adds the HTTP server skeleton and boilerplate for loading configs in log ingestor

This PR was originally #1675 and it has been reviewed by @LinZhihao-723 in that PR.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  1. run log-ingestor with
CLP_LOGS_DIR=. CLP_DB_USER=clp-user CLP_DB_PASS=pass ./build/rust/release/log-ingestor --config clp-config.yaml --host localhost --port 8080
  1. GET http://localhost:8080/ returns Log ingestor is running

Summary by CodeRabbit

  • New Features

    • Added HTTP server with health endpoints.
    • Implemented structured JSON logging with rolling file output.
    • Added command-line argument parsing.
    • Added graceful shutdown handling for termination signals.
  • Chores

    • Added and updated runtime and logging dependencies, including enhanced async runtime features for signal handling.

✏️ Tip: You can customize this high-level summary in your review settings.

@hoophalab hoophalab requested a review from a team as a code owner December 2, 2025 19:13
@coderabbitai

coderabbitai Bot commented Dec 2, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Replaces a placeholder binary with an async Axum service: adds CLI parsing, YAML config loading, environment-backed DB credentials, structured JSON rolling-file logging, signal-based graceful shutdown, and health-check HTTP endpoints.

Changes

Cohort / File(s) Summary
Dependency additions & updates
components/log-ingestor/Cargo.toml
Added axum (0.8.6, feature: json), clap (4.5.51, feature: derive), tracing (0.1.41), tracing-appender (0.2.2), tracing-subscriber (0.3.20, features: json, env-filter, fmt, std); extended tokio features to include signal.
Application main
components/log-ingestor/src/main.rs
Rewrote main.rs to an async entrypoint: CLI args parsing, YAML config + env credential loading, tracing subscriber setup with rolling hourly JSON file appender, SIGTERM/Ctrl-C shutdown handling, TCP listener binding, Axum router with / and /health handlers, and graceful shutdown flow.

Sequence Diagram

sequenceDiagram
    autonumber
    actor User
    participant App as log-ingestor
    participant Config as Config Loader
    participant Logger as Tracing Subscriber
    participant Server as Axum HTTP Server
    participant Signal as Signal Handler

    User->>App: start (CLI args)
    App->>Config: read_config_and_credentials (YAML + env)
    Config-->>App: config & creds

    App->>Logger: set_up_logging (rolling JSON files)
    Logger-->>App: subscriber initialized

    App->>Server: build Router (health endpoints)
    App->>Server: bind TCP listener
    Server-->>App: listening on host:port

    User->>Server: GET /health
    Server-->>User: 200 OK

    Signal->>App: SIGTERM / Ctrl-C
    App->>Server: initiate graceful shutdown
    Server-->>App: shutdown complete
    App->>Logger: log shutdown
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Pay special attention to:
    • components/log-ingestor/src/main.rs — signal handling, graceful shutdown correctness, and error contexts
    • components/log-ingestor/Cargo.toml — tokio feature change and tracing/tracing-appender configuration compatibility
    • Logging initialization — rolling appender configuration and non-blocking JSON formatting setup

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(log-ingestor): Add HTTP server skeleton' clearly and specifically describes the main change: adding an HTTP server foundation to the log-ingestor component.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 773483e and 9c362d8.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (2)
  • components/log-ingestor/Cargo.toml (1 hunks)
  • components/log-ingestor/src/main.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/log-ingestor/src/main.rs
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

Applied to files:

  • components/log-ingestor/Cargo.toml
🔇 Additional comments (7)
components/log-ingestor/Cargo.toml (2)

6-20: Verify that Cargo.lock is tracked in git and the clp-rust-checks workflow validates it without mutations.

Per repository policy for deterministic Rust builds, ensure Cargo.lock remains in sync with Cargo.toml and is verified by the clp-rust-checks workflow using non-mutating validation (e.g., cargo metadata --locked or cargo check --locked).


8-20: Dependency choices are appropriate for an HTTP server with structured logging.

The added dependencies align well with the implementation:

  • axum 0.8.6 with json feature: standard Rust web framework
  • clap 4.5.51 with derive feature: idiomatic CLI argument parsing
  • tokio 1.48.0: supports all requested features (macros, rt-multi-thread, signal, time) without compatibility issues
  • tracing ecosystem: standard structured logging with JSON and filter support
components/log-ingestor/src/main.rs (5)

8-28: LGTM!

Good use of secrecy::SecretString for password handling and proper error context. The function correctly loads configuration and constructs credentials from environment variables.


53-64: LGTM!

Clean argument definition using clap. Required arguments without defaults are appropriate for a server that needs explicit configuration.


66-75: Unix-only signal handling.

tokio::signal::unix::signal is not available on Windows. If cross-platform support is required in the future, consider conditional compilation or using a cross-platform signal handling approach.

Is cross-platform (Windows) support a requirement for this component? If so, consider using #[cfg(unix)] conditionals or alternative approaches.


77-98: LGTM!

Well-structured main function with proper error handling, graceful shutdown, and logging. The underscore-prefixed variables appropriately indicate the skeleton nature of this implementation.


100-102: LGTM!

Simple and efficient health check endpoint.

Comment thread components/log-ingestor/src/main.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
components/log-ingestor/src/main.rs (1)

81-82: Initialise logging before loading config/credentials

Right now, config and credential loading happens before set_up_logging, so any failures there won’t be captured by the structured logger, only by stderr. Consider initialising logging first (or using a simple early logger and then upgrading once config is loaded) so start‑up issues are consistently logged.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9c362d8 and e428257.

📒 Files selected for processing (1)
  • components/log-ingestor/src/main.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/log-ingestor/src/main.rs
🧬 Code graph analysis (1)
components/log-ingestor/src/main.rs (1)
components/clp-rust-utils/src/serde/yaml.rs (1)
  • from_path (18-21)
🔇 Additional comments (3)
components/log-ingestor/src/main.rs (3)

21-41: Config and credential loading is cohesive and fail-fast

Bundling YAML config loading with env-based DB credentials in a single helper keeps main lean and produces good contextual error messages. Using explicit CLP_DB_USER/CLP_DB_PASS requirements is clear and fail-fast; no changes needed here at this stage.


89-96: Server bootstrap and graceful shutdown wiring looks good

Router setup with "/" and "/health" sharing the same handler, along with axum::serve plus .with_graceful_shutdown(shutdown_signal()), gives a clean and minimal health service skeleton. Logging the bound address before serving is also helpful for ops.


43-64: Logging setup is solid; consider handling missing/invalid log directory more gently

The non-blocking file logging plus JSON formatting and env-based filtering is well put together, and returning the guard from set_up_logging correctly keeps the writer alive. One possible improvement is to ensure CLP_LOGS_DIR exists (e.g., create_dir_all) or to surface a more explicit hint if the directory path is invalid, to aid ops debugging.

⛔ Skipped due to learnings
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1152
File: components/clp-package-utils/clp_package_utils/general.py:0-0
Timestamp: 2025-08-19T14:41:28.901Z
Learning: In the CLP codebase, prefer explicit failures over automatic directory creation in utility functions like dump_config. The user junhaoliao prefers to let file operations fail when parent directories don't exist, as this helps catch implementation errors during development rather than masking setup issues with automatic directory creation.
Learnt from: haiqi96
Repo: y-scope/clp PR: 1100
File: integration-tests/tests/test_identity_transformation.py:87-95
Timestamp: 2025-08-08T21:15:10.905Z
Learning: In the CLP project's integration tests (Python code), variable names should use "logs" (plural) rather than "log" (singular) when referring to test logs or log-related entities, as this aligns with the naming conventions used throughout the codebase.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1152
File: components/clp-package-utils/clp_package_utils/scripts/start_clp.py:613-613
Timestamp: 2025-08-08T06:59:42.436Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/start_clp.py, generic_start_scheduler sets CLP_LOGGING_LEVEL using clp_config.query_scheduler.logging_level for both schedulers; compression scheduler should use its own logging level. Tracking via an issue created from PR #1152 discussion.

Comment thread components/log-ingestor/src/main.rs
Comment thread components/log-ingestor/src/main.rs
Comment thread components/log-ingestor/src/main.rs
Comment thread components/log-ingestor/src/main.rs

@LinZhihao-723 LinZhihao-723 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved on behave of #1675.
For the PR title, how about:

feat(log-ingestor): Add HTTP server skeleton:
* Load CLP package config and credentials.
* Configure server-wise logger.
* Add `health` request handling for health check service.

@hoophalab hoophalab changed the title feat(log-ingestor): Add HTTP server skeleton and config setup. feat(log-ingestor): Add HTTP server skeleton: Dec 2, 2025
@hoophalab hoophalab merged commit 4e86b87 into y-scope:main Dec 2, 2025
22 of 23 checks passed
junhaoliao pushed a commit to junhaoliao/clp that referenced this pull request May 17, 2026
* Load CLP package config and credentials.
* Configure server-wise logger.
* Add `health` request handling for health check service.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants