Skip to content

fix(clp-rust-utils): Align database name config with updated schema.#1828

Merged
LinZhihao-723 merged 11 commits into
y-scope:mainfrom
Eden-D-Zhang:rust-config-db-name
Jan 7, 2026
Merged

fix(clp-rust-utils): Align database name config with updated schema.#1828
LinZhihao-723 merged 11 commits into
y-scope:mainfrom
Eden-D-Zhang:rust-config-db-name

Conversation

@Eden-D-Zhang

@Eden-D-Zhang Eden-D-Zhang commented Dec 19, 2025

Copy link
Copy Markdown
Contributor

Description

This PR updates the Database config struct in clp-rust-utils to match the updated names field in its Python counterpart. The ClpDbNameType enum was also added.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Tested using API server by:
    • Started CLP package
    • Compressed a log file
    • Made a query through the API server
    • Retrieved query results

Summary by CodeRabbit

  • Refactor
    • Database configuration now supports multiple named database mappings (separate CLP and Spider entries) for greater flexibility.
    • Services updated to use the new CLP-specific database selection, ensuring consistent database usage across components.

✏️ Tip: You can customize this high-level summary in your review settings.

@Eden-D-Zhang Eden-D-Zhang requested a review from a team as a code owner December 19, 2025 22:00
@coderabbitai

coderabbitai Bot commented Dec 19, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

Replaces Database.name: String with a new ClpDbNames struct (fields clp and spider) with a Default impl; updates pool creation to use the CLP DB name and renames the pool factory to create_clp_db_mysql_pool. No other control-flow changes.

Changes

Cohort / File(s) Summary
Database configuration
components/clp-rust-utils/src/clp_config/package/config.rs
Added public ClpDbNames (clp, spider) with #[serde(deny_unknown_fields)] and Default (clp = "clp-db", spider = "spider-db"). Replaced Database.name: String with Database.names: ClpDbNames. Updated Database::default() to use ClpDbNames::default().
MySQL pool creation
components/clp-rust-utils/src/database/mysql.rs
Renamed create_mysql_poolcreate_clp_db_mysql_pool. Doc comment updated to reference CLP DB. Pool now selects config.names.clp instead of config.name.
API server client
components/api-server/src/client.rs
Swapped import/use of create_mysql_pool for create_clp_db_mysql_pool and updated call site and related error/doc text; no other logic changes.
Log ingestor
components/log-ingestor/src/ingestion_job_manager.rs
Replaced call to create_mysql_pool with create_clp_db_mysql_pool, awaiting the result and updating comments/errors to the new name; surrounding signatures unchanged.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Title check ✅ Passed The title accurately reflects the main change: updating the Database config structure to use a 'names' field instead of 'name' to align with the schema.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
components/clp-rust-utils/src/clp_config/package/config.rs (2)

61-67: Add field-level serde default to prevent empty names map.

With #[serde(default)] on the struct, if the names field is omitted during deserialization, serde uses HashMap::default() (empty map), not the populated map from impl Default for Database. This leads to the panic risk in mysql.rs.

🔎 Proposed fix with field-level default
+fn default_names() -> HashMap<ClpDbNameType, String> {
+    HashMap::from([
+        (ClpDbNameType::Clp, "clp-db".to_owned()),
+        (ClpDbNameType::Spider, "spider-db".to_owned()),
+    ])
+}
+
 #[derive(Clone, Debug, Deserialize, Eq, PartialEq)]
 #[serde(default)]
 pub struct Database {
     pub host: String,
     pub port: u16,
+    #[serde(default = "default_names")]
     pub names: HashMap<ClpDbNameType, String>,
 }

309-311: Consider adding unit tests for Database and ClpDbNameType deserialization.

Given this is a breaking change to the Database struct and introduces a new enum, tests would help verify:

  1. Correct deserialization when names is present with kebab-case keys
  2. Correct default behaviour when names is omitted (once the field-level default is added)
🔎 Example test
#[test]
fn deserialize_database_config() {
    let json = serde_json::json!({
        "host": "db.example.com",
        "port": 3307,
        "names": {
            "clp": "my-clp-db",
            "spider": "my-spider-db"
        }
    });

    let db: super::Database = serde_json::from_value(json).expect("failed to deserialize");
    assert_eq!(db.host, "db.example.com");
    assert_eq!(db.port, 3307);
    assert_eq!(db.names.get(&super::ClpDbNameType::Clp), Some(&"my-clp-db".to_owned()));
    assert_eq!(db.names.get(&super::ClpDbNameType::Spider), Some(&"my-spider-db".to_owned()));
}

#[test]
fn deserialize_database_config_with_defaults() {
    let json = serde_json::json!({
        "host": "db.example.com"
    });

    let db: super::Database = serde_json::from_value(json).expect("failed to deserialize");
    assert_eq!(db.names.get(&super::ClpDbNameType::Clp), Some(&"clp-db".to_owned()));
    assert_eq!(db.names.get(&super::ClpDbNameType::Spider), Some(&"spider-db".to_owned()));
}
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 80757cd and 09dae80.

📒 Files selected for processing (2)
  • components/clp-rust-utils/src/clp_config/package/config.rs (3 hunks)
  • components/clp-rust-utils/src/database/mysql.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
  • components/clp-rust-utils/src/database/mysql.rs
📚 Learning: 2025-11-28T15:12:53.530Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1655
File: components/clp-package-utils/clp_package_utils/controller.py:183-189
Timestamp: 2025-11-28T15:12:53.530Z
Learning: In the y-scope/clp project, root database credentials (root username and password) are required configuration items for all deployments. The `credentials[ClpDbUserType.ROOT]` entry is guaranteed to exist and can be accessed directly without optional handling.

Applied to files:

  • components/clp-rust-utils/src/database/mysql.rs
🧬 Code graph analysis (2)
components/clp-rust-utils/src/clp_config/package/config.rs (1)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
  • ClpDbNameType (211-215)
  • Database (240-405)
components/clp-rust-utils/src/database/mysql.rs (1)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
  • Database (240-405)
  • ClpDbNameType (211-215)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: check-generated
  • GitHub Check: rust-checks
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
🔇 Additional comments (2)
components/clp-rust-utils/src/database/mysql.rs (1)

3-6: LGTM on the import update.

The import correctly brings in ClpDbNameType needed for the HashMap key access.

components/clp-rust-utils/src/clp_config/package/config.rs (1)

46-52: Enum correctly mirrors Python ClpDbNameType.

The kebab-case serde renaming matches the Python KebabCaseStrEnum behaviour, ensuring Clp serializes as "clp" and Spider as "spider" for cross-language compatibility.

Comment thread components/clp-rust-utils/src/database/mysql.rs Outdated
Comment thread components/clp-rust-utils/src/database/mysql.rs Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09dae80 and d17e9bb.

📒 Files selected for processing (1)
  • components/clp-rust-utils/src/clp_config/package/config.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: check-generated
  • GitHub Check: rust-checks
🔇 Additional comments (3)
components/clp-rust-utils/src/clp_config/package/config.rs (3)

1-1: LGTM: Required import for the new names field.

The HashMap import is necessary for the names field in the Database struct.


46-52: Document the Python source and verify enum matches.

The comment referencing ClpDbNameType should include the full Python module path (e.g., clp_py_utils.clp_config.ClpDbNameType) for consistency with other mirrored types. Before merging, verify that the Clp and Spider variants and the kebab-case serde rename match the Python source definition.


66-66: Breaking change properly implemented with HashMap field and key-based access.

The transition from name: String to names: HashMap<ClpDbNameType, String> has been correctly handled. The Database struct includes appropriate default values for both database types, and usages throughout the codebase (such as in mysql.rs) properly access the map using config.names[&ClpDbNameType::Clp].

Comment thread components/clp-rust-utils/src/clp_config/package/config.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
components/clp-rust-utils/src/database/mysql.rs (1)

28-28: Direct HashMap indexing remains unsafe despite field-level default.

While config.rs adds #[serde(default = "default_db_names")] to the names field, this only applies when the field is completely missing from the config. If a user provides names: {} or a partial map (e.g., names: {spider: "db"}), serde will deserialize the provided value without calling the default function, and the direct indexing here will panic.

Replace the direct indexing with safe access and error handling:

🔎 Proposed fix using .get() with error handling
-        .database(&config.names[&ClpDbNameType::Clp])
+        .database(
+            config
+                .names
+                .get(&ClpDbNameType::Clp)
+                .ok_or_else(|| {
+                    crate::Error::from("Missing 'clp' database name in configuration")
+                })?,
+        )

Alternatively, the validation approach suggested in config.rs (lines 66-67) would catch this at config load time, which is preferable to failing during connection setup.

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d8c9be and e11d132.

📒 Files selected for processing (2)
  • components/clp-rust-utils/src/clp_config/package/config.rs (3 hunks)
  • components/clp-rust-utils/src/database/mysql.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
  • components/clp-rust-utils/src/database/mysql.rs
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
📚 Learning: 2025-10-22T21:01:31.391Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:01:31.391Z
Learning: In y-scope/clp, for the clp-rust-checks workflow (GitHub Actions) and the Taskfile target deps:lock:check-rust, we should verify Rust Cargo.lock is in sync with Cargo.toml using a non-mutating method (e.g., cargo metadata --locked / cargo check --locked) to keep CI deterministic and avoid updating dependencies during validation.

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
📚 Learning: 2025-01-23T17:08:55.566Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 672
File: components/core/src/clp_s/indexer/MySQLIndexStorage.cpp:30-38
Timestamp: 2025-01-23T17:08:55.566Z
Learning: SQL identifiers (table names, column names) in the codebase are currently from trusted sources and directly interpolated into queries. However, the team prefers to implement sanitization for robustness, even for trusted inputs.

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
🧬 Code graph analysis (1)
components/clp-rust-utils/src/database/mysql.rs (1)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
  • ClpDbNameType (211-215)
  • Database (240-405)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: check-generated
  • GitHub Check: rust-checks
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
🔇 Additional comments (4)
components/clp-rust-utils/src/clp_config/package/config.rs (3)

1-2: LGTM!

The HashMap import is necessary for the new names field and is properly placed.


46-52: LGTM!

The ClpDbNameType enum correctly mirrors the Python implementation with:

  • Appropriate variants (Clp, Spider)
  • Kebab-case serde naming for JSON/YAML serialization
  • Hash trait for HashMap key usage

70-78: LGTM!

The Default implementation correctly uses default_db_names() to initialize the names field, ensuring consistency with the helper function.

components/clp-rust-utils/src/database/mysql.rs (1)

4-4: LGTM!

The import of ClpDbNameType is necessary for accessing the HashMap with the enum key.

Comment on lines +66 to +67
#[serde(default = "default_db_names")]
pub names: HashMap<ClpDbNameType, String>,

@coderabbitai coderabbitai Bot Dec 19, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and examine the config.rs file
find . -name "config.rs" -path "*clp_config/package*" -type f

Repository: y-scope/clp

Length of output: 115


🏁 Script executed:

#!/bin/bash
# Search for the config.rs file in the clp_config package
fd -t f "config.rs" | grep -E "clp_config|package"

# Also search for mysql.rs
fd -t f "mysql.rs"

Repository: y-scope/clp

Length of output: 259


🏁 Script executed:

#!/bin/bash
# Search for ClpDbNameType enum definition
rg -n "enum ClpDbNameType" --type rust

# Search for default_db_names function
rg -n "fn default_db_names" --type rust

# Search for direct indexing on names field
rg -n 'names\s*\[' --type rust -A2 -B2

# Search for mysql.rs line 28
find . -name "mysql.rs" -type f -exec sed -n '20,35p' {} \;

Repository: y-scope/clp

Length of output: 1298


🏁 Script executed:

#!/bin/bash
# First, locate the relevant files
echo "=== Finding config.rs files ==="
find . -name "config.rs" | head -20

echo -e "\n=== Finding mysql.rs files ==="
find . -name "mysql.rs" | head -20

echo -e "\n=== Searching for ClpDbNameType enum ==="
rg "enum ClpDbNameType" -A 10 --type rust

Repository: y-scope/clp

Length of output: 1147


🏁 Script executed:

#!/bin/bash
# Read the config.rs file around the relevant lines
sed -n '45,95p' components/clp-rust-utils/src/clp_config/package/config.rs

# Search for all direct indexing on names field
echo -e "\n=== Searching for direct indexing on names field ==="
rg 'names\s*\[\s*&ClpDbNameType' --type rust -B2 -A2

# Search for Spider key usage
echo -e "\n=== Searching for Spider key usage ==="
rg 'ClpDbNameType::Spider' --type rust -B2 -A2

# Check for .get() patterns on names
echo -e "\n=== Checking for safe .get() patterns ==="
rg 'names\.get\(' --type rust -B2 -A2

Repository: y-scope/clp

Length of output: 2391


🏁 Script executed:

#!/bin/bash
# Read the default_db_names function
sed -n '75,95p' components/clp-rust-utils/src/clp_config/package/config.rs

# Check for any other indexing patterns
rg 'config\.names\[' --type rust -B2 -A2

Repository: y-scope/clp

Length of output: 982


Ajoutez une validation post-désérialisation pour s'assurer que les clés requises sont présentes.

L'attribut #[serde(default = "default_db_names")] au niveau du champ ne s'applique que lorsque le champ names est complètement absent de la configuration. Si un utilisateur fournit une configuration partielle comme names: {} ou names: {spider: "db"}, la fonction par défaut ne sera pas appelée, ce qui laisse le HashMap sans la clé Clp. Cela provoquera une panique à la ligne 28 de mysql.rs où l'indexation directe est utilisée.

Envisagez d'ajouter une validation post-désérialisation pour garantir que toutes les clés requises (Clp et Spider) sont présentes :

impl Database {
    pub fn validate(&self) -> Result<(), String> {
        if !self.names.contains_key(&ClpDbNameType::Clp) {
            return Err("Missing required 'clp' database name in configuration".to_string());
        }
        if !self.names.contains_key(&ClpDbNameType::Spider) {
            return Err("Missing required 'spider' database name in configuration".to_string());
        }
        Ok(())
    }
}

Appelez cette validation après la désérialisation de la configuration et avant de l'utiliser.

🤖 Prompt for AI Agents
In components/clp-rust-utils/src/clp_config/package/config.rs around lines
66-67, the serde default for `names` only applies if the whole field is missing,
so partially provided maps can lack required keys and cause a panic; add a
post-deserialization validation method on the Database struct that checks
presence of both Clp and Spider keys in the names HashMap and returns a clear
Err when missing, and ensure this validate() is invoked immediately after
deserializing the config (before any usage) so callers fail fast with a
descriptive error instead of panicking.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pouvez-vous parler anglais? je ne comprends pas

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@junhaoliao, I apologize for responding in French! Let me provide my feedback in English.

Add post-deserialization validation to ensure required keys are present.

The #[serde(default = "default_db_names")] attribute on the field only applies when the names field is completely absent from the configuration. If a user provides a partial configuration like names: {} or names: {spider: "db"}, the default function won't be called, leaving the HashMap without the Clp key. This will cause a panic at line 28 in mysql.rs where direct indexing is used.

Consider adding post-deserialization validation to ensure all required keys (Clp and Spider) are present:

impl Database {
    pub fn validate(&self) -> Result<(), String> {
        if !self.names.contains_key(&ClpDbNameType::Clp) {
            return Err("Missing required 'clp' database name in configuration".to_string());
        }
        if !self.names.contains_key(&ClpDbNameType::Spider) {
            return Err("Missing required 'spider' database name in configuration".to_string());
        }
        Ok(())
    }
}

Call this validation after deserializing the configuration and before using it.


🧠 Learnings used
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @components/clp-rust-utils/src/clp_config/package/config.rs:
- Around line 44-66: ClpDbNames currently allows empty strings after
deserialization which violates the Python NonEmptyStr constraint; add a
validation method on ClpDbNames (e.g., pub fn validate(&self) -> Result<(),
String>) that returns Err if clp or spider is empty, and call this validate()
immediately after deserializing your configuration (before any DB
construction/usage) so errors are surfaced early; reference the ClpDbNames
struct and ensure callers that parse config propagate or handle the validation
error.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59b5a57 and 190b675.

📒 Files selected for processing (1)
  • components/clp-rust-utils/src/clp_config/package/config.rs
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).
📚 Learning: 2025-11-03T16:17:40.223Z
Learnt from: hoophalab
Repo: y-scope/clp PR: 1535
File: components/clp-rust-utils/src/clp_config/package/config.rs:47-61
Timestamp: 2025-11-03T16:17:40.223Z
Learning: In the y-scope/clp repository, the `ApiServer` struct in `components/clp-rust-utils/src/clp_config/package/config.rs` is a Rust-native configuration type and does not mirror any Python code, unlike other structs in the same file (Config, Database, ResultsCache, Package) which are mirrors of Python definitions.

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
📚 Learning: 2025-10-22T21:02:31.113Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
📚 Learning: 2025-10-22T21:01:31.391Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:01:31.391Z
Learning: In y-scope/clp, for the clp-rust-checks workflow (GitHub Actions) and the Taskfile target deps:lock:check-rust, we should verify Rust Cargo.lock is in sync with Cargo.toml using a non-mutating method (e.g., cargo metadata --locked / cargo check --locked) to keep CI deterministic and avoid updating dependencies during validation.

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
📚 Learning: 2025-01-23T17:08:55.566Z
Learnt from: gibber9809
Repo: y-scope/clp PR: 672
File: components/core/src/clp_s/indexer/MySQLIndexStorage.cpp:30-38
Timestamp: 2025-01-23T17:08:55.566Z
Learning: SQL identifiers (table names, column names) in the codebase are currently from trusted sources and directly interpolated into queries. However, the team prefers to implement sanitization for robustness, even for trusted inputs.

Applied to files:

  • components/clp-rust-utils/src/clp_config/package/config.rs
🧬 Code graph analysis (1)
components/clp-rust-utils/src/clp_config/package/config.rs (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
  • Database (240-405)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: package-image
  • GitHub Check: lint-check (ubuntu-24.04)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: check-generated
  • GitHub Check: rust-checks
🔇 Additional comments (1)
components/clp-rust-utils/src/clp_config/package/config.rs (1)

80-80: LGTM! Database struct correctly updated.

The replacement of name: String with names: ClpDbNames correctly aligns with the Python clp-py-utils Database model. The updated default implementation properly initializes the new structure.

Also applies to: 88-88

Comment thread components/clp-rust-utils/src/clp_config/package/config.rs
@LinZhihao-723 LinZhihao-723 changed the title fix(clp-rust-utils): Update clp-rust-utils Database config with names field to match clp-py-utils Database model. fix(clp-rust-utils): Update database config to accept CLP and Spider DB names separately. Jan 7, 2026

@junhaoliao junhaoliao left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validated by:

  1. changing the default clp db name in build/clp-package/etc/clp-config.yaml:
    database:
      type: "mariadb"  # "mariadb" or "mysql"
      host: "localhost"
      port: 3306
      names:
        clp: "my-clp-db"
        spider: "my-spider-db"
  2. cd build/clp-package; ./sbin/start-clp.sh
  3. Observed the package was started successfully, including the api server service.
  4. curl -s -X POST http://localhost:3001/query \
        -H "Content-Type: application/json" \
        -d '{
           "query_string": "*",
           "dataset": "default",
           "ignore_case": false,
           "max_num_results": 10,
           "write_to_file": false
        }'
  5. curl -s -N http://localhost:3001/query_results/1 | head -3 and observed results returned
  6. docker compose -p clp-package-500a logs api-server
    api-server-1  | {"timestamp":"2026-01-07T14:33:46.722508Z","level":"INFO","fields":{"message":"Server started at 0.0.0.0:3001"},"filename":"components/api-server/src/bin/api_server.rs","line_number":101}
    api-server-1  | {"timestamp":"2026-01-07T14:37:57.195474Z","level":"INFO","fields":{"message":"Submitting query: QueryConfig { query_string: \"*\", dataset: Some(\"default\"), max_num_results: 10, begin_timestamp: None, end_timestamp: None, ignore_case: false, write_to_file: false }"},"filename":"components/api-server/src/routes.rs","line_number":108}
    api-server-1  | {"timestamp":"2026-01-07T14:37:57.209891Z","level":"INFO","fields":{"message":"Submitted query with search job ID: 1"},"filename":"components/api-server/src/routes.rs","line_number":111}
    api-server-1  | {"timestamp":"2026-01-07T14:38:13.884597Z","level":"INFO","fields":{"message":"Fetching results for search job ID: 1"},"filename":"components/api-server/src/routes.rs","line_number":153}
    api-server-1  | {"timestamp":"2026-01-07T14:38:13.886642Z","level":"INFO","fields":{"message":"Successfully initiated result stream for search job ID 1"},"filename":"components/api-server/src/routes.rs","line_number":156}

for the title, how about:

fix(clp-rust-utils): Align database name config with updated schema.

@LinZhihao-723 LinZhihao-723 changed the title fix(clp-rust-utils): Update database config to accept CLP and Spider DB names separately. fix(clp-rust-utils): Align database name config with updated schema. Jan 7, 2026
@LinZhihao-723 LinZhihao-723 merged commit c932131 into y-scope:main Jan 7, 2026
22 checks passed
@junhaoliao

Copy link
Copy Markdown
Member

sorry i think we should update the Helm chart too; otherwise the api server deployments would fail to start:

kubectl logs test-clp-api-server-856d4b5c77-6gq8t
Defaulted container "api-server" out of: api-server, wait-for-db-table-creator (init), wait-for-results-cache-indices-creator (init)
Error: Config file /etc/clp-config.yaml does not exist

Caused by:
    0: `serde_yaml::Error`: database.names: missing field `spider` at line 35 column 5
    1: database.names: missing field `spider` at line 35 column 5

@coderabbitai create an issue to track

@coderabbitai

coderabbitai Bot commented Jan 7, 2026

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find Helm chart configuration files that might contain database config
fd -e yaml -e yml | grep -i helm | head -20

Length of output: 1355


🏁 Script executed:

#!/bin/bash
# Search for database name configuration in Helm charts
rg -n "database" --type yaml -g '*helm*' -g '*chart*' -C 3

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Look for values.yaml or similar config files in charts directories
fd values.yaml

Length of output: 41


🏁 Script executed:

#!/bin/bash
# Check if there's a charts or helm directory
fd -t d -d 3 | grep -E "(chart|helm)" | head -10

Length of output: 30


🏁 Script executed:

#!/bin/bash
# Check the values.yaml for database configuration
cat tools/deployment/package-helm/values.yaml

Length of output: 4617


🏁 Script executed:

#!/bin/bash
# Check the configmap.yaml for database configuration
cat tools/deployment/package-helm/templates/configmap.yaml

Length of output: 8364


🏁 Script executed:

#!/bin/bash
# Create a GitHub issue to track the Helm chart update
gh issue create \
  --title "Update Helm chart database config to include spider database name" \
  --body "## Description

The Helm chart configuration needs to be updated to match the new database config structure introduced in #1828.

## Problem

In #1828, the \`Database\` config struct was updated to change from \`name: String\` to \`names: ClpDbNames { clp, spider }\`. However, the Helm chart configuration files were not updated accordingly, causing API server deployments to fail with:

\`\`\`
Error: Config file /etc/clp-config.yaml does not exist

Caused by:
    0: \`serde_yaml::Error\`: database.names: missing field \`spider\` at line 35 column 5
    1: database.names: missing field \`spider\` at line 35 column 5
\`\`\`

## Files to Update

1. \`tools/deployment/package-helm/values.yaml\` - Add \`spider\` field under \`clpConfig.database.names\`
2. \`tools/deployment/package-helm/templates/configmap.yaml\` - Add \`spider\` field in the database config template

## Current Configuration

\`\`\`yaml
database:
  names:
    clp: \"clp-db\"
\`\`\`

## Expected Configuration

\`\`\`yaml
database:
  names:
    clp: \"clp-db\"
    spider: \"spider-db\"
\`\`\`

## Related

- PR: https://github.com/y-scope/clp/pull/1828
- Comment: https://github.com/y-scope/clp/pull/1828#discussion_r1910127849
- Reported by: @junhaoliao" \
  --assignee "junhaoliao" \
  --repo "y-scope/clp"

Length of output: 42


@junhaoliao I've created issue #1845 to track the Helm chart update: #1845

The issue documents the need to add the spider field under database.names in both tools/deployment/package-helm/values.yaml and tools/deployment/package-helm/templates/configmap.yaml to prevent the deployment failures you encountered.


🧠 Learnings used
Learnt from: junhaoliao
Repo: y-scope/clp PR: 0
File: :0-0
Timestamp: 2025-10-22T21:02:31.113Z
Learning: Repository y-scope/clp: Maintain deterministic CI/builds for Rust; add a check to verify Cargo.lock is in sync with Cargo.toml without updating dependencies (non-mutating verification in clp-rust-checks workflow).

davidlion pushed a commit to davidlion/clp that referenced this pull request Jan 17, 2026
…-scope#1828)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>
Co-authored-by: LinZhihao-723 <zh.lin@mail.utoronto.ca>
junhaoliao added a commit to junhaoliao/clp that referenced this pull request May 17, 2026
…-scope#1828)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>
Co-authored-by: LinZhihao-723 <zh.lin@mail.utoronto.ca>
junhaoliao added a commit to junhaoliao/clp that referenced this pull request May 17, 2026
…-scope#1828)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>
Co-authored-by: LinZhihao-723 <zh.lin@mail.utoronto.ca>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants