refactor(helm)!: Remove all host-path volume mounts.#2023
Conversation
- Log to stdout/stderr instead of host-path log PVs. - Use dynamic provisioning for persistent data (database, results-cache, shared-data). - Use emptyDir for ephemeral/temp volumes (Redis, staging, worker tmp). - Delete static PV/PVC templates and the local-storage StorageClass.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughCentralizes logging into new clp-rust-utils logging APIs, removes many per-component file log handlers and CLP_HOME filesystem conventions, converts component logs to stdout/emptyDir, removes numerous static PV/PVC and pod securityContext entries from Helm charts, and updates Python/Rust services to use the centralized logging configurator. Changes
Sequence Diagram(s)sequenceDiagram
participant Binary
participant clp_rust_utils as clp_rust_utils::logging
participant Env as Environment (CLP_LOGS_DIR)
participant FS as Filesystem / Stdout
Binary->>clp_rust_utils: set_up_logging("component.log")
clp_rust_utils->>Env: read CLP_LOGS_DIR
alt CLP_LOGS_DIR set
clp_rust_utils->>FS: create RollingFileAppender at CLP_LOGS_DIR/component.log
clp_rust_utils->>FS: create non-blocking writer + JSON formatter
clp_rust_utils-->>Binary: return Some(WorkerGuard)
Binary->>FS: logs -> file + stdout (JSON)
else CLP_LOGS_DIR not set
clp_rust_utils->>FS: configure stdout JSON logging only
clp_rust_utils-->>Binary: return None
Binary->>FS: logs -> stdout (JSON)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…ify configuration
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (8)
tools/deployment/package-helm/templates/compression-worker-deployment.yaml (1)
41-52:⚠️ Potential issue | 🟡 MinorChange
CLP_LOGS_DIRfrom/tmpto/var/tmpto match the emptyDir mount.The compression-worker writes stderr logs to files under
CLP_LOGS_DIR(see compression_task.py line 424), but line 41 sets it to/tmp(container ephemeral layer) while line 52 mounts an emptyDir at/var/tmp. This mismatch causes logs to be lost on container restart. Align the environment variable with the mounted volume:- - name: "CLP_LOGS_DIR" - value: "/tmp" + - name: "CLP_LOGS_DIR" + value: "/var/tmp"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/deployment/package-helm/templates/compression-worker-deployment.yaml` around lines 41 - 52, Set the CLP_LOGS_DIR env var to the mounted emptyDir path so logs persist: change the CLP_LOGS_DIR value from "/tmp" to "/var/tmp" in the deployment template to match the volumeMount mountPath "/var/tmp" (the volume defined via include "clp.volumeName" for component_category "compression-worker"). Ensure this aligns with how compression_task.py writes stderr logs (referenced via CLP_LOGS_DIR).tools/deployment/package-helm/templates/shared-data-streams-pvc.yaml (1)
1-8:⚠️ Potential issue | 🟠 MajorRWO blocks multi-node filesystem storage for streams.
Line 7 restricts the streams PVC to
ReadWriteOnce, which prevents multiple worker pods on different nodes from mounting the same shared filesystem. That breaks the multi-node filesystem storage path described in the docs. Consider keepingReadWriteManyor making access modes configurable.🔧 Suggested fix
- "accessModes" (list "ReadWriteOnce") + "accessModes" (list "ReadWriteMany")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/deployment/package-helm/templates/shared-data-streams-pvc.yaml` around lines 1 - 8, The PVC template currently hardcodes accessModes to ReadWriteOnce in the clp.createPvc invocation for the "streams" shared-data component, which prevents multi-node mounts; update the invocation in templates/shared-data-streams-pvc.yaml to use ReadWriteMany (or expose .Values.clpConfig.stream_output.storage.accessModes and default to ReadWriteMany) instead of the fixed (list "ReadWriteOnce") so multiple worker pods on different nodes can mount the same filesystem; ensure the change is applied to the clp.createPvc call that supplies "component_category" "shared-data" and "name" "streams".tools/deployment/package-helm/templates/shared-data-archives-pvc.yaml (1)
1-8:⚠️ Potential issue | 🟠 MajorRWO prevents archives PVC from being shared across workers.
Line 7 changes archives to
ReadWriteOnce, which blocks multi-node access to the shared archives volume. For filesystem-based multi-node deployments this needs RWX (or an explicit RWX-capable StorageClass).🔧 Suggested fix
- "accessModes" (list "ReadWriteOnce") + "accessModes" (list "ReadWriteMany")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/deployment/package-helm/templates/shared-data-archives-pvc.yaml` around lines 1 - 8, The PVC for archives is being created with accessModes set to ReadWriteOnce via the clp.createPvc invocation, which prevents multi-node mounts; update the call in shared-data-archives-pvc.yaml (the include "clp.createPvc" invocation for component_category "shared-data" and name "archives") so the accessModes list uses ReadWriteMany (or reads the value from a new configurable .Values.clpConfig.archive_output.storage.accessMode) to allow RWX mounts on multi-node filesystem deployments; ensure the change preserves existing capacity and name parameters and falls back to ReadWriteOnce if RWX is not requested.tools/deployment/package-helm/templates/_helpers.tpl (1)
113-138:⚠️ Potential issue | 🟠 MajorPVC helper should still allow explicit StorageClass selection.
With storageClassName/selector removed, users can’t target an RWX-capable StorageClass or clusters without a default StorageClass. This can leave PVCs Pending and blocks multi-node filesystem storage. Please keep an optional override even if the default is preferred.
🔧 Suggested fix (optional StorageClass override)
spec: accessModes: {{ .accessModes }} + {{- with .storageClassName }} + storageClassName: {{ . | quote }} + {{- end }} resources: requests: storage: {{ .capacity }}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/deployment/package-helm/templates/_helpers.tpl` around lines 113 - 138, The PVC helper clp.createPvc removed storageClassName/selector causing PVCs to hang when no default or when RWX is required; re-add an optional override in the template so callers can pass a storage class or selector (e.g., check for .storageClassName and/or .selector on the template context and emit a storageClassName: <value> line or selector: <map> block when present). Update the clp.createPvc definition to conditionally render storageClassName (using .storageClassName | quote) and/or selector (render the map from .selector or .storageClassSelector under spec) while leaving behavior unchanged when those keys are not provided; keep references to clp.fullname and clp.volumeName unchanged.docs/src/dev-docs/design-deployment-orchestration.md (1)
242-248:⚠️ Potential issue | 🟡 MinorUpdate the storage intro to reflect stdout logging.
The lead sentence still says services require persistent storage for logs, but the Kubernetes bullet now states logs are emitted to stdout/stderr. Please drop “logs” from the lead sentence to avoid mixed signals.
✍️ Suggested fix
-Services require persistent storage for logs, data, archives, and streams. +Services require persistent storage for data, archives, and streams.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/src/dev-docs/design-deployment-orchestration.md` around lines 242 - 248, The lead sentence currently reads "Services require persistent storage for logs, data, archives, and streams." which conflicts with the Kubernetes note about emitting logs to stdout/stderr; update that sentence to remove "logs" so it reads something like "Services require persistent storage for data, archives, and streams." Modify the opening line in the same paragraph (the one before the "Docker Compose" and "Kubernetes" bullets) to remove the word "logs" and ensure the rest of the bullets remain unchanged so the Kubernetes note about stdout/stderr stays accurate.docs/src/user-docs/guides-k8s-deployment.md (1)
128-145:⚠️ Potential issue | 🟡 MinorClarify RWX requirement and configuration for shared storage.
This section states that workers must share filesystem paths, but it doesn’t mention that the underlying PVCs must support
ReadWriteMany(or be backed by an RWX-capable StorageClass). Without that, multi-node mounts will fail. Please add a short note about RWX and how to select the appropriate StorageClass in Helm values.✍️ Suggested doc tweak
- For multi-node clusters using filesystem storage, the following directories **must** be + For multi-node clusters using filesystem storage, the following directories **must** be accessible from all worker nodes at the same paths. Without shared storage, compressed logs created by one worker cannot be searched by other workers. + Ensure the StorageClass backing the archives/streams PVCs supports ReadWriteMany (RWX) and is + selected in your Helm values if your cluster’s default StorageClass is not RWX-capable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/src/user-docs/guides-k8s-deployment.md` around lines 128 - 145, Update the "Shared storage for workers" section to explicitly state that the underlying PVCs must support ReadWriteMany (RWX) for multi-node mounts to work, and add a short note pointing users to pick an RWX-capable StorageClass in their Helm values; reference the existing directory keys (archive_output.storage.directory, stream_output.storage.directory, logs_input.directory) and say that the corresponding PVC/storageClass used by the chart must be RWX (and indicate the Helm values field where to set the storageClass used by persistent volumes so operators know to choose an RWX-backed StorageClass).components/log-ingestor/src/bin/log_ingestor.rs (1)
46-82: 🧹 Nitpick | 🔵 TrivialExtract
set_up_loggingintoclp-rust-utilsto eliminate cross-component duplication.This function is virtually identical to
set_up_loggingincomponents/api-server/src/bin/api_server.rs(lines 43–79); the only difference is the log filename ("log_ingestor.log"vs"api_server.log"). A shared helper inclp-rust-utilsthat accepts the service name would prevent this duplication across multiple binary crates:♻️ Sketch of a shared helper
// e.g. in clp-rust-utils/src/logging.rs pub fn init_logging( service_name: &str, ) -> anyhow::Result<Option<tracing_appender::non_blocking::WorkerGuard>> { let log_file_name = format!("{service_name}.log"); if let Ok(logs_directory) = std::env::var("CLP_LOGS_DIR") { let file_appender = RollingFileAppender::new( Rotation::HOURLY, std::path::Path::new(logs_directory.as_str()), &log_file_name, ); let (non_blocking_writer, guard) = tracing_appender::non_blocking(file_appender); init_subscriber(std::io::stdout.and(non_blocking_writer)); return Ok(Some(guard)); } init_subscriber(std::io::stdout); Ok(None) }Additionally, the subscriber builder chain (event format, env filter, ansi) is repeated in both the file-logging and stdout-only branches within this function. Extracting a small helper that accepts a writer would collapse that internal duplication as well.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/log-ingestor/src/bin/log_ingestor.rs` around lines 46 - 82, Extract the duplicated set_up_logging into clp-rust-utils as a public function (e.g., logging::init_logging(service_name: &str) -> anyhow::Result<Option<tracing_appender::non_blocking::WorkerGuard>>) that builds the log filename from service_name, creates the RollingFileAppender and tracing_appender::non_blocking writer when CLP_LOGS_DIR is set, and returns the guard; inside clp-rust-utils also add a small helper (e.g., init_subscriber<W: MakeWriter + 'static>(writer: W)) that configures the subscriber chain (event_format with level/target/file/line/json, EnvFilter::from_default_env, with_ansi(false)) and calls .with_writer(writer).init(); replace the local set_up_logging functions in log_ingestor (and api_server) to call logging::init_logging("log_ingestor") (or "api_server") so the file-vs-stdout branching and subscriber setup are centralized.tools/deployment/package-helm/templates/queue-statefulset.yaml (1)
1-49:⚠️ Potential issue | 🟠 MajorConfirm whether RabbitMQ data should persist across pod restarts.
The StatefulSet has no
volumeClaimTemplatesand no explicit volume mount for/var/lib/rabbitmq. The queue will useemptyDirsemantics, losing all broker state on pod restart or rescheduling.The design documentation lists persistent storage for database, results-cache, archives, and streams, and explicitly mentions
emptyDirfor Redis and staging directories—but does not list the queue as either persistent or ephemeral. This is undocumented.If ephemeral queues are intentional (workers re-declare queues on connect and tolerate in-flight message loss), this must be explicitly documented. Otherwise, add a
volumeClaimTemplatessection matching the pattern used indatabase-statefulset.yamlandresults-cache-statefulset.yaml.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/deployment/package-helm/templates/queue-statefulset.yaml` around lines 1 - 49, StatefulSet "queue" currently has no volumeClaimTemplates or volumeMount for /var/lib/rabbitmq, causing RabbitMQ data to be ephemeral; decide whether persistence is required and if so add a volumeClaimTemplates block and a corresponding volumeMount in the "queue" container to mount /var/lib/rabbitmq (follow the same PVC pattern used by database-statefulset and results-cache-statefulset), otherwise update documentation to explicitly state the queue is ephemeral and explain broker restart/ message-loss behavior; adjust the StatefulSet spec (replicas/selector/template) around the container named "queue" and include the PVC name used by the volumeClaimTemplates so the mount references the new claim.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/job-orchestration/job_orchestration/reducer/reducer.py`:
- Around line 71-90: The loop that spawns reducers (building
reducer_instance_cmd, opening log_file and calling subprocess.Popen) must be
wrapped in a try/finally so any exception during open() or Popen() still closes
resources: in the finally iterate reducer_log_files and call .close() for each
opened file and remove/close the optional logging_file_handler, and also
terminate/cleanup any partially-started processes recorded in reducers; update
code around reducer_log_files, reducers, reducer_instance_cmd and
subprocess.Popen to ensure files and handlers are closed regardless of
exceptions.
In
`@components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py`:
- Around line 426-432: The optional file logging code in
compression_scheduler.py should guard against missing/unwritable CLP_LOGS_DIR:
before creating logging.FileHandler for log_file, check if
os.path.isdir(logs_dir) and attempt to create it (os.makedirs(logs_dir,
exist_ok=True)) and validate write permissions; wrap FileHandler creation in a
try/except to catch OSError/PermissionError and on failure skip adding the
FileHandler (or fallback to console) while logging the failure via the existing
logger (created by get_logger()). Ensure references to CLP_LOGS_DIR, log_file,
logging.FileHandler, and logger in compression_scheduler.py are updated
accordingly.
In `@tools/deployment/package-helm/.set-up-common.sh`:
- Around line 10-13: Update the JSDoc-style comment for the
create_clp_directories() function to reflect its narrowed behavior: it now only
creates the "samples" subdirectory under CLP_HOME rather than all "CLP data".
Edit the comment to briefly state that the function creates the CLP samples
directory (e.g., "Creates the CLP samples directory at ${CLP_HOME}/samples") and
remove or replace the misleading "CLP data" phrasing so the comment matches the
mkdir -p "$CLP_HOME/samples" implementation.
In `@tools/deployment/package-helm/templates/query-worker-deployment.yaml`:
- Around line 106-110: The emptyDir for the staged-streams volume currently has
no sizeLimit; update the Helm template that defines the volume (the include
"clp.volumeName" with dict
"component_category":"query-worker","name":"staged-streams") to add a sizeLimit
to the emptyDir (e.g., "5Gi") or wire it to a values key so operators can
configure it, ensuring the emptyDir: block includes sizeLimit to prevent
unbounded node disk usage.
In `@tools/deployment/package-helm/templates/redis-statefulset.yaml`:
- Around line 63-70: The templated volume block triggers YAMLlint brace-spacing
errors; fix by normalizing Helm template whitespace inside the mustaches for the
volume name and configMap name: collapse the multiline dict into a single
expression and ensure single spaces immediately after '{{' and before '}}' for
the include calls (references: include "clp.volumeName" and include
"clp.fullname"), so the volume name and configMap name lines render with
properly spaced braces and no extra line breaks.
In `@tools/deployment/package-helm/templates/results-cache-statefulset.yaml`:
- Around line 65-68: The templates currently set resources.requests.storage to a
smaller hard-coded value which will cause Helm upgrades to fail when an existing
PVC is larger; change the template so volumeClaimTemplates (or the field where
resources.requests.storage is defined) uses a configurable value (e.g.,
.Values.persistence.size) and fall back to the existing PVC size via the Helm
lookup function if present (use lookup to read the current PersistentVolumeClaim
and use its spec.resources.requests.storage), or keep the previous default size
(e.g., 20Gi) to avoid shrinking; update the template to reference that value
instead of the hard-coded "10Gi".
---
Outside diff comments:
In `@components/log-ingestor/src/bin/log_ingestor.rs`:
- Around line 46-82: Extract the duplicated set_up_logging into clp-rust-utils
as a public function (e.g., logging::init_logging(service_name: &str) ->
anyhow::Result<Option<tracing_appender::non_blocking::WorkerGuard>>) that builds
the log filename from service_name, creates the RollingFileAppender and
tracing_appender::non_blocking writer when CLP_LOGS_DIR is set, and returns the
guard; inside clp-rust-utils also add a small helper (e.g., init_subscriber<W:
MakeWriter + 'static>(writer: W)) that configures the subscriber chain
(event_format with level/target/file/line/json, EnvFilter::from_default_env,
with_ansi(false)) and calls .with_writer(writer).init(); replace the local
set_up_logging functions in log_ingestor (and api_server) to call
logging::init_logging("log_ingestor") (or "api_server") so the file-vs-stdout
branching and subscriber setup are centralized.
In `@docs/src/dev-docs/design-deployment-orchestration.md`:
- Around line 242-248: The lead sentence currently reads "Services require
persistent storage for logs, data, archives, and streams." which conflicts with
the Kubernetes note about emitting logs to stdout/stderr; update that sentence
to remove "logs" so it reads something like "Services require persistent storage
for data, archives, and streams." Modify the opening line in the same paragraph
(the one before the "Docker Compose" and "Kubernetes" bullets) to remove the
word "logs" and ensure the rest of the bullets remain unchanged so the
Kubernetes note about stdout/stderr stays accurate.
In `@docs/src/user-docs/guides-k8s-deployment.md`:
- Around line 128-145: Update the "Shared storage for workers" section to
explicitly state that the underlying PVCs must support ReadWriteMany (RWX) for
multi-node mounts to work, and add a short note pointing users to pick an
RWX-capable StorageClass in their Helm values; reference the existing directory
keys (archive_output.storage.directory, stream_output.storage.directory,
logs_input.directory) and say that the corresponding PVC/storageClass used by
the chart must be RWX (and indicate the Helm values field where to set the
storageClass used by persistent volumes so operators know to choose an
RWX-backed StorageClass).
In `@tools/deployment/package-helm/templates/_helpers.tpl`:
- Around line 113-138: The PVC helper clp.createPvc removed
storageClassName/selector causing PVCs to hang when no default or when RWX is
required; re-add an optional override in the template so callers can pass a
storage class or selector (e.g., check for .storageClassName and/or .selector on
the template context and emit a storageClassName: <value> line or selector:
<map> block when present). Update the clp.createPvc definition to conditionally
render storageClassName (using .storageClassName | quote) and/or selector
(render the map from .selector or .storageClassSelector under spec) while
leaving behavior unchanged when those keys are not provided; keep references to
clp.fullname and clp.volumeName unchanged.
In `@tools/deployment/package-helm/templates/compression-worker-deployment.yaml`:
- Around line 41-52: Set the CLP_LOGS_DIR env var to the mounted emptyDir path
so logs persist: change the CLP_LOGS_DIR value from "/tmp" to "/var/tmp" in the
deployment template to match the volumeMount mountPath "/var/tmp" (the volume
defined via include "clp.volumeName" for component_category
"compression-worker"). Ensure this aligns with how compression_task.py writes
stderr logs (referenced via CLP_LOGS_DIR).
In `@tools/deployment/package-helm/templates/queue-statefulset.yaml`:
- Around line 1-49: StatefulSet "queue" currently has no volumeClaimTemplates or
volumeMount for /var/lib/rabbitmq, causing RabbitMQ data to be ephemeral; decide
whether persistence is required and if so add a volumeClaimTemplates block and a
corresponding volumeMount in the "queue" container to mount /var/lib/rabbitmq
(follow the same PVC pattern used by database-statefulset and
results-cache-statefulset), otherwise update documentation to explicitly state
the queue is ephemeral and explain broker restart/ message-loss behavior; adjust
the StatefulSet spec (replicas/selector/template) around the container named
"queue" and include the PVC name used by the volumeClaimTemplates so the mount
references the new claim.
In `@tools/deployment/package-helm/templates/shared-data-archives-pvc.yaml`:
- Around line 1-8: The PVC for archives is being created with accessModes set to
ReadWriteOnce via the clp.createPvc invocation, which prevents multi-node
mounts; update the call in shared-data-archives-pvc.yaml (the include
"clp.createPvc" invocation for component_category "shared-data" and name
"archives") so the accessModes list uses ReadWriteMany (or reads the value from
a new configurable .Values.clpConfig.archive_output.storage.accessMode) to allow
RWX mounts on multi-node filesystem deployments; ensure the change preserves
existing capacity and name parameters and falls back to ReadWriteOnce if RWX is
not requested.
In `@tools/deployment/package-helm/templates/shared-data-streams-pvc.yaml`:
- Around line 1-8: The PVC template currently hardcodes accessModes to
ReadWriteOnce in the clp.createPvc invocation for the "streams" shared-data
component, which prevents multi-node mounts; update the invocation in
templates/shared-data-streams-pvc.yaml to use ReadWriteMany (or expose
.Values.clpConfig.stream_output.storage.accessModes and default to
ReadWriteMany) instead of the fixed (list "ReadWriteOnce") so multiple worker
pods on different nodes can mount the same filesystem; ensure the change is
applied to the clp.createPvc call that supplies "component_category"
"shared-data" and "name" "streams".
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (73)
components/api-server/src/bin/api_server.rscomponents/clp-mcp-server/clp_mcp_server/clp_mcp_server.pycomponents/job-orchestration/job_orchestration/garbage_collector/archive_garbage_collector.pycomponents/job-orchestration/job_orchestration/garbage_collector/garbage_collector.pycomponents/job-orchestration/job_orchestration/garbage_collector/search_result_garbage_collector.pycomponents/job-orchestration/job_orchestration/garbage_collector/utils.pycomponents/job-orchestration/job_orchestration/reducer/reducer.pycomponents/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.pycomponents/job-orchestration/job_orchestration/scheduler/query/query_scheduler.pycomponents/log-ingestor/src/bin/log_ingestor.rsdocs/src/dev-docs/design-deployment-orchestration.mddocs/src/user-docs/guides-k8s-deployment.mddocs/src/user-docs/quick-start/clp-json.mddocs/src/user-docs/quick-start/clp-text.mdtools/deployment/package-helm/.set-up-common.shtools/deployment/package-helm/Chart.yamltools/deployment/package-helm/templates/_helpers.tpltools/deployment/package-helm/templates/api-server-deployment.yamltools/deployment/package-helm/templates/api-server-logs-pv.yamltools/deployment/package-helm/templates/api-server-logs-pvc.yamltools/deployment/package-helm/templates/compression-scheduler-deployment.yamltools/deployment/package-helm/templates/compression-scheduler-logs-pv.yamltools/deployment/package-helm/templates/compression-scheduler-logs-pvc.yamltools/deployment/package-helm/templates/compression-scheduler-user-logs-pv.yamltools/deployment/package-helm/templates/compression-scheduler-user-logs-pvc.yamltools/deployment/package-helm/templates/compression-worker-deployment.yamltools/deployment/package-helm/templates/compression-worker-logs-pv.yamltools/deployment/package-helm/templates/compression-worker-logs-pvc.yamltools/deployment/package-helm/templates/compression-worker-staged-archives-pv.yamltools/deployment/package-helm/templates/compression-worker-staged-archives-pvc.yamltools/deployment/package-helm/templates/compression-worker-tmp-pv.yamltools/deployment/package-helm/templates/compression-worker-tmp-pvc.yamltools/deployment/package-helm/templates/configmap.yamltools/deployment/package-helm/templates/database-data-pv.yamltools/deployment/package-helm/templates/database-logs-pv.yamltools/deployment/package-helm/templates/database-statefulset.yamltools/deployment/package-helm/templates/db-table-creator-job.yamltools/deployment/package-helm/templates/garbage-collector-deployment.yamltools/deployment/package-helm/templates/garbage-collector-logs-pv.yamltools/deployment/package-helm/templates/garbage-collector-logs-pvc.yamltools/deployment/package-helm/templates/log-ingestor-deployment.yamltools/deployment/package-helm/templates/log-ingestor-logs-pv.yamltools/deployment/package-helm/templates/log-ingestor-logs-pvc.yamltools/deployment/package-helm/templates/mcp-server-deployment.yamltools/deployment/package-helm/templates/mcp-server-logs-pv.yamltools/deployment/package-helm/templates/mcp-server-logs-pvc.yamltools/deployment/package-helm/templates/query-scheduler-deployment.yamltools/deployment/package-helm/templates/query-scheduler-logs-pv.yamltools/deployment/package-helm/templates/query-scheduler-logs-pvc.yamltools/deployment/package-helm/templates/query-worker-deployment.yamltools/deployment/package-helm/templates/query-worker-logs-pv.yamltools/deployment/package-helm/templates/query-worker-logs-pvc.yamltools/deployment/package-helm/templates/query-worker-staged-streams-pv.yamltools/deployment/package-helm/templates/query-worker-staged-streams-pvc.yamltools/deployment/package-helm/templates/queue-logs-pv.yamltools/deployment/package-helm/templates/queue-statefulset.yamltools/deployment/package-helm/templates/redis-data-pv.yamltools/deployment/package-helm/templates/redis-logs-pv.yamltools/deployment/package-helm/templates/redis-statefulset.yamltools/deployment/package-helm/templates/reducer-deployment.yamltools/deployment/package-helm/templates/reducer-logs-pv.yamltools/deployment/package-helm/templates/reducer-logs-pvc.yamltools/deployment/package-helm/templates/results-cache-data-pv.yamltools/deployment/package-helm/templates/results-cache-indices-creator-job.yamltools/deployment/package-helm/templates/results-cache-logs-pv.yamltools/deployment/package-helm/templates/results-cache-statefulset.yamltools/deployment/package-helm/templates/shared-data-archives-pv.yamltools/deployment/package-helm/templates/shared-data-archives-pvc.yamltools/deployment/package-helm/templates/shared-data-streams-pv.yamltools/deployment/package-helm/templates/shared-data-streams-pvc.yamltools/deployment/package-helm/templates/storage-class.yamltools/deployment/package-helm/templates/webui-deployment.yamltools/deployment/package-helm/values.yaml
💤 Files with no reviewable changes (48)
- tools/deployment/package-helm/templates/query-scheduler-deployment.yaml
- tools/deployment/package-helm/templates/compression-worker-tmp-pv.yaml
- tools/deployment/package-helm/templates/api-server-logs-pv.yaml
- tools/deployment/package-helm/templates/shared-data-archives-pv.yaml
- tools/deployment/package-helm/templates/webui-deployment.yaml
- docs/src/user-docs/quick-start/clp-text.md
- tools/deployment/package-helm/templates/compression-worker-logs-pv.yaml
- tools/deployment/package-helm/templates/garbage-collector-logs-pvc.yaml
- tools/deployment/package-helm/templates/compression-scheduler-user-logs-pvc.yaml
- tools/deployment/package-helm/templates/compression-scheduler-logs-pvc.yaml
- tools/deployment/package-helm/templates/query-worker-logs-pv.yaml
- tools/deployment/package-helm/templates/query-worker-staged-streams-pvc.yaml
- tools/deployment/package-helm/templates/log-ingestor-logs-pv.yaml
- tools/deployment/package-helm/templates/shared-data-streams-pv.yaml
- tools/deployment/package-helm/templates/compression-worker-staged-archives-pv.yaml
- tools/deployment/package-helm/templates/queue-logs-pv.yaml
- tools/deployment/package-helm/templates/compression-scheduler-user-logs-pv.yaml
- tools/deployment/package-helm/templates/compression-scheduler-logs-pv.yaml
- tools/deployment/package-helm/templates/results-cache-data-pv.yaml
- tools/deployment/package-helm/templates/compression-worker-logs-pvc.yaml
- tools/deployment/package-helm/templates/garbage-collector-logs-pv.yaml
- tools/deployment/package-helm/templates/redis-logs-pv.yaml
- tools/deployment/package-helm/templates/log-ingestor-logs-pvc.yaml
- tools/deployment/package-helm/templates/mcp-server-deployment.yaml
- docs/src/user-docs/quick-start/clp-json.md
- tools/deployment/package-helm/templates/db-table-creator-job.yaml
- tools/deployment/package-helm/templates/query-scheduler-logs-pv.yaml
- tools/deployment/package-helm/templates/api-server-deployment.yaml
- tools/deployment/package-helm/templates/reducer-logs-pvc.yaml
- tools/deployment/package-helm/templates/reducer-deployment.yaml
- tools/deployment/package-helm/templates/query-worker-staged-streams-pv.yaml
- tools/deployment/package-helm/templates/api-server-logs-pvc.yaml
- tools/deployment/package-helm/templates/compression-worker-tmp-pvc.yaml
- tools/deployment/package-helm/templates/database-logs-pv.yaml
- tools/deployment/package-helm/templates/storage-class.yaml
- tools/deployment/package-helm/templates/query-worker-logs-pvc.yaml
- tools/deployment/package-helm/templates/database-data-pv.yaml
- tools/deployment/package-helm/templates/query-scheduler-logs-pvc.yaml
- tools/deployment/package-helm/templates/results-cache-indices-creator-job.yaml
- tools/deployment/package-helm/templates/mcp-server-logs-pv.yaml
- tools/deployment/package-helm/values.yaml
- tools/deployment/package-helm/templates/reducer-logs-pv.yaml
- tools/deployment/package-helm/templates/redis-data-pv.yaml
- tools/deployment/package-helm/templates/log-ingestor-deployment.yaml
- tools/deployment/package-helm/templates/database-statefulset.yaml
- tools/deployment/package-helm/templates/results-cache-logs-pv.yaml
- tools/deployment/package-helm/templates/mcp-server-logs-pvc.yaml
- tools/deployment/package-helm/templates/compression-worker-staged-archives-pvc.yaml
| reducer_log_files: list[TextIO] = [] | ||
| concurrency = max(int(parsed_args.concurrency), 1) | ||
| for i in range(concurrency): | ||
| reducer_instance_cmd = reducer_cmd + [str(clp_config.reducer.base_port + i)] | ||
|
|
||
| log_file_path = logs_dir / ("reducer-" + str(i) + ".log") | ||
| log_file = open(log_file_path, "a") | ||
| reducer_stdout = sys.stdout | ||
| reducer_stderr = sys.stderr | ||
| if logs_dir is not None: | ||
| log_file_path = logs_dir / f"reducer-{i}.log" | ||
| log_file = open(log_file_path, "a") | ||
| reducer_log_files.append(log_file) | ||
| reducer_stdout = log_file | ||
| reducer_stderr = log_file | ||
|
|
||
| reducers.append( | ||
| subprocess.Popen( | ||
| reducer_instance_cmd, | ||
| close_fds=True, | ||
| stdout=log_file, | ||
| stderr=log_file, | ||
| stdout=reducer_stdout, | ||
| stderr=reducer_stderr, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "reducer.py" | grep job-orchestration | head -5Repository: y-scope/clp
Length of output: 122
🏁 Script executed:
cat -n ./components/job-orchestration/job_orchestration/reducer/reducer.py | sed -n '1,150p'Repository: y-scope/clp
Length of output: 4726
🏁 Script executed:
cat -n ./components/job-orchestration/job_orchestration/reducer/reducer.py | sed -n '19,114p' | grep -E "try:|finally:|except"Repository: y-scope/clp
Length of output: 147
Ensure reducer log files and handlers are closed on exceptions.
If open() or subprocess.Popen() raises during the spawn loop (lines 73-92), file handles in reducer_log_files and the optional logging_file_handler won't be closed. Currently, the cleanup code (lines 104-111) is only reached if the loop completes successfully. Wrap lines 73-103 in a try/finally block to guarantee cleanup of all resources.
🧹 Proposed fix
reducers = []
reducer_log_files: list[TextIO] = []
concurrency = max(int(parsed_args.concurrency), 1)
- for i in range(concurrency):
- reducer_instance_cmd = reducer_cmd + [str(clp_config.reducer.base_port + i)]
-
- reducer_stdout = sys.stdout
- reducer_stderr = sys.stderr
- if logs_dir is not None:
- log_file_path = logs_dir / f"reducer-{i}.log"
- log_file = open(log_file_path, "a")
- reducer_log_files.append(log_file)
- reducer_stdout = log_file
- reducer_stderr = log_file
-
- reducers.append(
- subprocess.Popen(
- reducer_instance_cmd,
- close_fds=True,
- stdout=reducer_stdout,
- stderr=reducer_stderr,
- )
- )
-
- logger.info("Reducers started.")
- logger.info(
- f"Host={clp_config.reducer.host}"
- f" Base port={clp_config.reducer.base_port}"
- f" Concurrency={concurrency}"
- f" Upsert Interval={parsed_args.upsert_interval}"
- )
- for i, reducer in enumerate(reducers):
- reducer.communicate()
- logger.info(f"reducer-{i} exited with returncode={reducer.returncode}")
- for reducer_log_file in reducer_log_files:
- reducer_log_file.close()
-
- logger.error("All reducers terminated")
-
- if logging_file_handler is not None:
- logger.removeHandler(logging_file_handler)
- logging_file_handler.close()
+ try:
+ for i in range(concurrency):
+ reducer_instance_cmd = reducer_cmd + [str(clp_config.reducer.base_port + i)]
+
+ reducer_stdout = sys.stdout
+ reducer_stderr = sys.stderr
+ if logs_dir is not None:
+ log_file_path = logs_dir / f"reducer-{i}.log"
+ log_file = open(log_file_path, "a")
+ reducer_log_files.append(log_file)
+ reducer_stdout = log_file
+ reducer_stderr = log_file
+
+ reducers.append(
+ subprocess.Popen(
+ reducer_instance_cmd,
+ close_fds=True,
+ stdout=reducer_stdout,
+ stderr=reducer_stderr,
+ )
+ )
+
+ logger.info("Reducers started.")
+ logger.info(
+ f"Host={clp_config.reducer.host}"
+ f" Base port={clp_config.reducer.base_port}"
+ f" Concurrency={concurrency}"
+ f" Upsert Interval={parsed_args.upsert_interval}"
+ )
+ for i, reducer in enumerate(reducers):
+ reducer.communicate()
+ logger.info(f"reducer-{i} exited with returncode={reducer.returncode}")
+
+ logger.error("All reducers terminated")
+ finally:
+ for reducer_log_file in reducer_log_files:
+ reducer_log_file.close()
+ if logging_file_handler is not None:
+ logger.removeHandler(logging_file_handler)
+ logging_file_handler.close()🧰 Tools
🪛 Ruff (0.15.2)
[warning] 74-74: Consider iterable unpacking instead of concatenation
Replace with iterable unpacking
(RUF005)
[error] 86-86: subprocess call: check for execution of untrusted input
(S603)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/job-orchestration/job_orchestration/reducer/reducer.py` around
lines 71 - 90, The loop that spawns reducers (building reducer_instance_cmd,
opening log_file and calling subprocess.Popen) must be wrapped in a try/finally
so any exception during open() or Popen() still closes resources: in the finally
iterate reducer_log_files and call .close() for each opened file and
remove/close the optional logging_file_handler, and also terminate/cleanup any
partially-started processes recorded in reducers; update code around
reducer_log_files, reducers, reducer_instance_cmd and subprocess.Popen to ensure
files and handlers are closed regardless of exceptions.
| # Creates required directories for CLP data | ||
| create_clp_directories() { | ||
| echo "Creating CLP directories at ${CLP_HOME}..." | ||
| mkdir -p "$CLP_HOME/var/"{data,log}/{database,queue,redis,results_cache} \ | ||
| "$CLP_HOME/var/data/"{archives,streams,staged-archives,staged-streams} \ | ||
| "$CLP_HOME/var/log/"{compression_scheduler,compression_worker,user} \ | ||
| "$CLP_HOME/var/log/"{query_scheduler,query_worker,reducer} \ | ||
| "$CLP_HOME/var/log/"{api_server,garbage_collector,log_ingestor,mcp_server} \ | ||
| "$CLP_HOME/var/tmp" \ | ||
| "$CLP_HOME/samples" | ||
| mkdir -p "$CLP_HOME/samples" |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Update the function comment to reflect its narrowed scope.
The body now only creates the samples subdirectory, so the phrase "CLP data" in the JSDoc-style comment is a bit misleading.
✏️ Suggested comment wording
-# Creates required directories for CLP data
+# Creates required directories under CLP_HOME
create_clp_directories() {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/.set-up-common.sh` around lines 10 - 13, Update
the JSDoc-style comment for the create_clp_directories() function to reflect its
narrowed behavior: it now only creates the "samples" subdirectory under CLP_HOME
rather than all "CLP data". Edit the comment to briefly state that the function
creates the CLP samples directory (e.g., "Creates the CLP samples directory at
${CLP_HOME}/samples") and remove or replace the misleading "CLP data" phrasing
so the comment matches the mkdir -p "$CLP_HOME/samples" implementation.
| - name: {{ include "clp.volumeName" (dict | ||
| "component_category" "query-worker" | ||
| "name" "staged-streams" | ||
| ) | nindent 10 }} | ||
| ) | quote }} | ||
| emptyDir: {} |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider setting a sizeLimit on the emptyDir for staged streams.
The emptyDir: {} for staged streams has no size limit. If the query worker stages large volumes of stream data before uploading to S3, this could consume significant node disk space and potentially affect other pods on the same node. A sizeLimit acts as a guardrail:
emptyDir:
sizeLimit: "5Gi"This is non-blocking — just a resilience improvement to prevent unbounded disk usage on the node.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/templates/query-worker-deployment.yaml` around
lines 106 - 110, The emptyDir for the staged-streams volume currently has no
sizeLimit; update the Helm template that defines the volume (the include
"clp.volumeName" with dict
"component_category":"query-worker","name":"staged-streams") to add a sizeLimit
to the emptyDir (e.g., "5Gi") or wire it to a values key so operators can
configure it, ensuring the emptyDir: block includes sizeLimit to prevent
unbounded node disk usage.
| - name: {{ include "clp.volumeName" (dict | ||
| "component_category" "redis" | ||
| "name" "data" | ||
| ) | quote }} | ||
| spec: | ||
| accessModes: ["ReadWriteOnce"] | ||
| storageClassName: "local-storage" | ||
| selector: | ||
| matchLabels: | ||
| {{- include "clp.selectorLabels" . | nindent 12 }} | ||
| app.kubernetes.io/component: "redis" | ||
| resources: | ||
| requests: | ||
| storage: "10Gi" | ||
| - metadata: | ||
| name: {{ include "clp.volumeName" (dict | ||
| "component_category" "redis" | ||
| "name" "logs" | ||
| ) | quote }} | ||
| spec: | ||
| accessModes: ["ReadWriteOnce"] | ||
| storageClassName: "local-storage" | ||
| selector: | ||
| matchLabels: | ||
| {{- include "clp.selectorLabels" . | nindent 12 }} | ||
| app.kubernetes.io/component: "redis" | ||
| resources: | ||
| requests: | ||
| storage: "5Gi" | ||
| emptyDir: {} | ||
| - name: "config" | ||
| configMap: | ||
| name: {{ include "clp.fullname" . }}-config |
There was a problem hiding this comment.
Fix YAMLlint brace-spacing errors in the templated volume block.
Static analysis flagged braces on this block. Either adjust brace spacing or exempt Helm templates from that rule to keep linting green.
🔧 One possible brace-spacing adjustment
- - name: {{ include "clp.volumeName" (dict
+ - name: {{include "clp.volumeName" (dict
"component_category" "redis"
"name" "data"
- ) | quote }}
+ ) | quote}}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: {{ include "clp.volumeName" (dict | |
| "component_category" "redis" | |
| "name" "data" | |
| ) | quote }} | |
| spec: | |
| accessModes: ["ReadWriteOnce"] | |
| storageClassName: "local-storage" | |
| selector: | |
| matchLabels: | |
| {{- include "clp.selectorLabels" . | nindent 12 }} | |
| app.kubernetes.io/component: "redis" | |
| resources: | |
| requests: | |
| storage: "10Gi" | |
| - metadata: | |
| name: {{ include "clp.volumeName" (dict | |
| "component_category" "redis" | |
| "name" "logs" | |
| ) | quote }} | |
| spec: | |
| accessModes: ["ReadWriteOnce"] | |
| storageClassName: "local-storage" | |
| selector: | |
| matchLabels: | |
| {{- include "clp.selectorLabels" . | nindent 12 }} | |
| app.kubernetes.io/component: "redis" | |
| resources: | |
| requests: | |
| storage: "5Gi" | |
| emptyDir: {} | |
| - name: "config" | |
| configMap: | |
| name: {{ include "clp.fullname" . }}-config | |
| - name: {{include "clp.volumeName" (dict | |
| "component_category" "redis" | |
| "name" "data" | |
| ) | quote}} | |
| emptyDir: {} | |
| - name: "config" | |
| configMap: | |
| name: {{ include "clp.fullname" . }}-config |
🧰 Tools
🪛 YAMLlint (1.38.0)
[error] 63-63: too many spaces inside braces
(braces)
[error] 66-66: too many spaces inside braces
(braces)
[error] 70-70: too many spaces inside braces
(braces)
[error] 70-70: too many spaces inside braces
(braces)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/templates/redis-statefulset.yaml` around lines
63 - 70, The templated volume block triggers YAMLlint brace-spacing errors; fix
by normalizing Helm template whitespace inside the mustaches for the volume name
and configMap name: collapse the multiline dict into a single expression and
ensure single spaces immediately after '{{' and before '}}' for the include
calls (references: include "clp.volumeName" and include "clp.fullname"), so the
volume name and configMap name lines render with properly spaced braces and no
extra line breaks.
| accessModes: ["ReadWriteOnce"] | ||
| storageClassName: "local-storage" | ||
| selector: | ||
| matchLabels: | ||
| {{- include "clp.selectorLabels" . | nindent 12 }} | ||
| app.kubernetes.io/component: "results-cache" | ||
| resources: | ||
| requests: | ||
| storage: "20Gi" | ||
| - metadata: | ||
| name: {{ include "clp.volumeName" (dict | ||
| "component_category" "results-cache" | ||
| "name" "logs" | ||
| ) | quote }} | ||
| spec: | ||
| accessModes: ["ReadWriteOnce"] | ||
| storageClassName: "local-storage" | ||
| selector: | ||
| matchLabels: | ||
| {{- include "clp.selectorLabels" . | nindent 12 }} | ||
| app.kubernetes.io/component: "results-cache" | ||
| resources: | ||
| requests: | ||
| storage: "5Gi" | ||
| storage: "10Gi" |
There was a problem hiding this comment.
Prevent PVC shrink on upgrade (K8s forbids size reduction).
Reducing the claim from 20Gi to 10Gi will be rejected on clusters with existing 20Gi PVCs, causing Helm upgrades to fail. Please keep the previous size, preserve the existing PVC size (e.g., via values or lookup), or document a required manual migration path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/templates/results-cache-statefulset.yaml`
around lines 65 - 68, The templates currently set resources.requests.storage to
a smaller hard-coded value which will cause Helm upgrades to fail when an
existing PVC is larger; change the template so volumeClaimTemplates (or the
field where resources.requests.storage is defined) uses a configurable value
(e.g., .Values.persistence.size) and fall back to the existing PVC size via the
Helm lookup function if present (use lookup to read the current
PersistentVolumeClaim and use its spec.resources.requests.storage), or keep the
previous default size (e.g., 20Gi) to avoid shrinking; update the template to
reference that value instead of the hard-coded "10Gi".
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/clp-rust-utils/src/logging.rs`:
- Around line 13-14: The branch that handles CLP_LOGS_DIR currently treats an
empty string as a valid path, causing the appender to be rooted at "."; change
the check that reads std::env::var("CLP_LOGS_DIR") so it only proceeds when the
value is non-empty (e.g., use if let Ok(dir) = std::env::var("CLP_LOGS_DIR") and
guard with !dir.trim().is_empty()), then convert that non-empty string to a Path
with std::path::Path::new(dir.as_str()) and continue creating the file appender;
ensure you reference the CLP_LOGS_DIR env var and the local logs_directory
binding in logging.rs when implementing this guard.
- Around line 18-47: The duplicated tracing_subscriber setup should be extracted
into a shared builder/format variable: create the common event_format (the value
from
tracing_subscriber::fmt::format().with_level(true).with_target(false).with_file(true).with_line_number(true).json())
and the common EnvFilter/ANSI settings once, then call .with_writer(...) with
the differing writers (std::io::stdout.and(non_blocking_writer) in the branch
that returns Some(guard) and std::io::stdout in the other) and call .init() for
each; because Format is owned, clone the shared format/builder (use .clone())
when reusing it for the second call so ownership issues are avoided while
keeping the rest of the configuration identical.
In `@tools/deployment/package-helm/values.yaml`:
- Around line 17-18: Wrap the long comment that mentions filesystem storage and
PVC-backed directories so no line exceeds 100 characters; specifically split the
comment line that references archive_output.storage.type: "fs" and the sentence
about PVC-backed directories and examples (e.g., NFS/CephFS) into two or more
shorter comment lines so each is under the lint limit while preserving the
original wording and examples.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (8)
components/api-server/Cargo.tomlcomponents/api-server/src/bin/api_server.rscomponents/clp-rust-utils/Cargo.tomlcomponents/clp-rust-utils/src/lib.rscomponents/clp-rust-utils/src/logging.rscomponents/log-ingestor/Cargo.tomlcomponents/log-ingestor/src/bin/log_ingestor.rstools/deployment/package-helm/values.yaml
💤 Files with no reviewable changes (2)
- components/api-server/Cargo.toml
- components/log-ingestor/Cargo.toml
| if let Ok(logs_directory) = std::env::var("CLP_LOGS_DIR") { | ||
| let logs_directory = std::path::Path::new(logs_directory.as_str()); |
There was a problem hiding this comment.
Empty CLP_LOGS_DIR will create a file appender rooted at the current working directory.
std::env::var returns Ok("") when the variable is set but empty, so this branch will attempt to write log files into ".". Consider checking for a non-empty value:
🛡️ Proposed fix
- if let Ok(logs_directory) = std::env::var("CLP_LOGS_DIR") {
+ if let Ok(logs_directory) = std::env::var("CLP_LOGS_DIR").and_then(|v| {
+ if v.is_empty() {
+ Err(std::env::VarError::NotPresent)
+ } else {
+ Ok(v)
+ }
+ }) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if let Ok(logs_directory) = std::env::var("CLP_LOGS_DIR") { | |
| let logs_directory = std::path::Path::new(logs_directory.as_str()); | |
| if let Ok(logs_directory) = std::env::var("CLP_LOGS_DIR").and_then(|v| { | |
| if v.is_empty() { | |
| Err(std::env::VarError::NotPresent) | |
| } else { | |
| Ok(v) | |
| } | |
| }) { | |
| let logs_directory = std::path::Path::new(logs_directory.as_str()); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/clp-rust-utils/src/logging.rs` around lines 13 - 14, The branch
that handles CLP_LOGS_DIR currently treats an empty string as a valid path,
causing the appender to be rooted at "."; change the check that reads
std::env::var("CLP_LOGS_DIR") so it only proceeds when the value is non-empty
(e.g., use if let Ok(dir) = std::env::var("CLP_LOGS_DIR") and guard with
!dir.trim().is_empty()), then convert that non-empty string to a Path with
std::path::Path::new(dir.as_str()) and continue creating the file appender;
ensure you reference the CLP_LOGS_DIR env var and the local logs_directory
binding in logging.rs when implementing this guard.
| tracing_subscriber::fmt() | ||
| .event_format( | ||
| tracing_subscriber::fmt::format() | ||
| .with_level(true) | ||
| .with_target(false) | ||
| .with_file(true) | ||
| .with_line_number(true) | ||
| .json(), | ||
| ) | ||
| .with_env_filter(tracing_subscriber::EnvFilter::from_default_env()) | ||
| .with_ansi(false) | ||
| .with_writer(std::io::stdout.and(non_blocking_writer)) | ||
| .init(); | ||
| return Some(guard); | ||
| } | ||
|
|
||
| tracing_subscriber::fmt() | ||
| .event_format( | ||
| tracing_subscriber::fmt::format() | ||
| .with_level(true) | ||
| .with_target(false) | ||
| .with_file(true) | ||
| .with_line_number(true) | ||
| .json(), | ||
| ) | ||
| .with_env_filter(tracing_subscriber::EnvFilter::from_default_env()) | ||
| .with_ansi(false) | ||
| .with_writer(std::io::stdout) | ||
| .init(); | ||
| None |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:
find . -name "Cargo.toml" -path "*/clp-rust-utils/*" | head -5Repository: y-scope/clp
Length of output: 93
🏁 Script executed:
cat components/clp-rust-utils/Cargo.toml | grep -A 5 "tracing"Repository: y-scope/clp
Length of output: 262
🏁 Script executed:
cat -n components/clp-rust-utils/src/logging.rs | head -60Repository: y-scope/clp
Length of output: 2293
🌐 Web query:
tracing_subscriber 0.3.22 Format type Clone implementation
💡 Result:
In tracing-subscriber 0.3.22, fmt::format::Format implements Clone conditionally:
impl<F: Clone, T: Clone> Clone for Format<F, T>So Format<F, T> is Clone iff both the format style type F (e.g. Full, Compact, Pretty, Json) and the timer type T implement Clone. [1]
Source: the rustdoc “Trait Implementations” section for Format. [1]
[1] https://docs.rs/tracing-subscriber/latest/tracing_subscriber/fmt/format/struct.Format.html
🌐 Web query:
tracing_subscriber fmt format() builder Clone trait
💡 Result:
In tracing-subscriber (e.g., 0.3.22), tracing_subscriber::fmt::SubscriberBuilder does not implement Clone (its documented trait impls are Debug, Default, and From, but not Clone). [1]
What is Clone is the formatter returned by tracing_subscriber::fmt::format() (fmt::format::Format implements Clone when its inner types do). [2]
So the usual pattern is: clone the Format (and any other clonable config you own), and create a fresh fmt()/Subscriber::builder() each time, rather than trying to clone the SubscriberBuilder. [1][2]
Sources:
[1] SubscriberBuilder trait impls (no Clone) — docs.rs tracing_subscriber::fmt::SubscriberBuilder
[2] Format trait impls (Clone) — docs.rs tracing_subscriber::fmt::format::Format
Extract duplicated subscriber configuration to reduce duplication.
The event format, env filter, and ANSI settings are identical between both branches—only the writer differs. The Format type from tracing_subscriber implements Clone, so the shared configuration can be extracted. Note that when reusing the format variable, you'll need to .clone() it since the builder methods take ownership:
♻️ Proposed refactor
pub fn set_up_logging(log_filename: &str) -> Option<WorkerGuard> {
+ let event_format = tracing_subscriber::fmt::format()
+ .with_level(true)
+ .with_target(false)
+ .with_file(true)
+ .with_line_number(true)
+ .json();
+
if let Ok(logs_directory) = std::env::var("CLP_LOGS_DIR") {
let logs_directory = std::path::Path::new(logs_directory.as_str());
let file_appender =
RollingFileAppender::new(Rotation::HOURLY, logs_directory, log_filename);
let (non_blocking_writer, guard) = tracing_appender::non_blocking(file_appender);
tracing_subscriber::fmt()
- .event_format(
- tracing_subscriber::fmt::format()
- .with_level(true)
- .with_target(false)
- .with_file(true)
- .with_line_number(true)
- .json(),
- )
+ .event_format(event_format.clone())
.with_env_filter(tracing_subscriber::EnvFilter::from_default_env())
.with_ansi(false)
.with_writer(std::io::stdout.and(non_blocking_writer))
.init();
return Some(guard);
}
tracing_subscriber::fmt()
- .event_format(
- tracing_subscriber::fmt::format()
- .with_level(true)
- .with_target(false)
- .with_file(true)
- .with_line_number(true)
- .json(),
- )
+ .event_format(event_format.clone())
.with_env_filter(tracing_subscriber::EnvFilter::from_default_env())
.with_ansi(false)
.with_writer(std::io::stdout)
.init();
None
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/clp-rust-utils/src/logging.rs` around lines 18 - 47, The
duplicated tracing_subscriber setup should be extracted into a shared
builder/format variable: create the common event_format (the value from
tracing_subscriber::fmt::format().with_level(true).with_target(false).with_file(true).with_line_number(true).json())
and the common EnvFilter/ANSI settings once, then call .with_writer(...) with
the differing writers (std::io::stdout.and(non_blocking_writer) in the branch
that returns Some(guard) and std::io::stdout in the other) and call .init() for
each; because Format is owned, clone the shared format/builder (use .clone())
when reusing it for the second call so ownership issues are avoided while
keeping the rest of the configuration identical.
…h centralized logging setup
…an up logger handling
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (4)
components/job-orchestration/job_orchestration/reducer/reducer.py (1)
65-97:⚠️ Potential issue | 🟡 MinorOpened reducer log files are leaked if an exception occurs mid-loop.
If
open()(line 72) orsubprocess.Popen(line 78) raises on iteration N, all file handles opened during iterations 0…N-1 (and on the current iteration foropen()) are already inreducer_log_filesbut the cleanup loop at lines 96-97 is never reached. Wrap the spawn loop and thecommunicate()block in atry/finallyto guarantee cleanup.🛡️ Proposed fix
reducers = [] reducer_log_files: list[TextIO] = [] concurrency = max(int(parsed_args.concurrency), 1) - for i in range(concurrency): - reducer_instance_cmd = reducer_cmd + [str(clp_config.reducer.base_port + i)] - - reducer_stdout = sys.stdout - reducer_stderr = sys.stderr - if logs_dir is not None: - log_file_path = logs_dir / f"reducer-{i}.log" - log_file = open(log_file_path, "a") - reducer_log_files.append(log_file) - reducer_stdout = log_file - reducer_stderr = log_file - - reducers.append( - subprocess.Popen( - reducer_instance_cmd, - close_fds=True, - stdout=reducer_stdout, - stderr=reducer_stderr, - ) - ) - - logger.info("Reducers started.") - logger.info( - f"Host={clp_config.reducer.host}" - f" Base port={clp_config.reducer.base_port}" - f" Concurrency={concurrency}" - f" Upsert Interval={parsed_args.upsert_interval}" - ) - for i, reducer in enumerate(reducers): - reducer.communicate() - logger.info(f"reducer-{i} exited with returncode={reducer.returncode}") - for reducer_log_file in reducer_log_files: - reducer_log_file.close() - - logger.error("All reducers terminated") + try: + for i in range(concurrency): + reducer_instance_cmd = [*reducer_cmd, str(clp_config.reducer.base_port + i)] + + reducer_stdout = sys.stdout + reducer_stderr = sys.stderr + if logs_dir is not None: + log_file_path = logs_dir / f"reducer-{i}.log" + log_file = open(log_file_path, "a") + reducer_log_files.append(log_file) + reducer_stdout = log_file + reducer_stderr = log_file + + reducers.append( + subprocess.Popen( + reducer_instance_cmd, + close_fds=True, + stdout=reducer_stdout, + stderr=reducer_stderr, + ) + ) + + logger.info("Reducers started.") + logger.info( + f"Host={clp_config.reducer.host}" + f" Base port={clp_config.reducer.base_port}" + f" Concurrency={concurrency}" + f" Upsert Interval={parsed_args.upsert_interval}" + ) + for i, reducer in enumerate(reducers): + reducer.communicate() + logger.info(f"reducer-{i} exited with returncode={reducer.returncode}") + + logger.error("All reducers terminated") + finally: + for reducer_log_file in reducer_log_files: + reducer_log_file.close()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/job-orchestration/job_orchestration/reducer/reducer.py` around lines 65 - 97, The reducer log file handles are leaked if open() or subprocess.Popen raises during the spawn loop; wrap the whole spawn-and-wait section (the loop that builds reducer_instance_cmd, opens files into reducer_log_files, appends subprocess.Popen into reducers, and the subsequent for i, reducer in enumerate(reducers): reducer.communicate() block) in a try/finally so cleanup always runs; in the finally iterate over reducers to gracefully stop/communicate/kill any started subprocesses (e.g., call terminate() and then communicate() or kill() if needed) and then iterate over reducer_log_files to close every open file handle, ensuring both reducer_log_files and reducers are referenced in the finally block so no file or process remains leaked.tools/deployment/package-helm/templates/query-worker-deployment.yaml (1)
106-110: Consider setting asizeLimiton theemptyDirfor staged streams.The
emptyDir: {}for staged-streams has no size limit, which could lead to unbounded node disk usage when staging large volumes of stream data for S3 upload.emptyDir: sizeLimit: "5Gi"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/deployment/package-helm/templates/query-worker-deployment.yaml` around lines 106 - 110, The emptyDir for the staged-streams volume currently has no size limit which can exhaust node disk; update the deployment template where the volume name is generated via include "clp.volumeName" with dict("component_category":"query-worker","name":"staged-streams") to replace the bare emptyDir: {} with an emptyDir that specifies a sizeLimit (e.g., "5Gi") or make it configurable via a Helm value, so the staged-streams emptyDir has an explicit size cap.tools/deployment/package-helm/templates/compression-worker-deployment.yaml (1)
41-44: SameCLP_LOGS_DIRwithout backing volume issue as the query-worker.As noted in the query-worker review,
CLP_LOGS_DIRis set to/var/log/compression_workerbut the corresponding volume has been removed. Any file writes to this path will go to the container overlay. The same remediation applies: remove the env var or add anemptyDirmount.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/deployment/package-helm/templates/compression-worker-deployment.yaml` around lines 41 - 44, The CLP_LOGS_DIR environment variable in the compression worker deployment points to /var/log/compression_worker but there is no backing volume; either remove the CLP_LOGS_DIR env entry or add an emptyDir volume and mount it into the compression worker container at /var/log/compression_worker. To fix, edit the compression-worker container spec: if keeping CLP_LOGS_DIR, add a volumeMount with mountPath "/var/log/compression_worker" and a matching volumes entry (type emptyDir) in the pod spec; otherwise delete the env var CLP_LOGS_DIR (and any related references) so logs default to CLP_WORKER_LOG_PATH (/dev/stdout). Ensure the changes target the compression-worker container definition and the CLP_LOGS_DIR env name.components/clp-py-utils/clp_py_utils/clp_logging.py (1)
43-63:⚠️ Potential issue | 🟠 MajorGuard FileHandler setup to avoid startup failure when
CLP_LOGS_DIRis bad.If
CLP_LOGS_DIRpoints to a missing or unwritable path,logging.FileHandlerwill raise and abort startup. Consider a safe fallback to stdout (Line 57 onward).🛡️ Suggested hardening
logs_dir = os.getenv("CLP_LOGS_DIR") if logs_dir: - log_file = pathlib.Path(logs_dir) / f"{component_name}.log" - logging_file_handler = logging.FileHandler(filename=log_file, encoding="utf-8") - logging_file_handler.setFormatter(get_logging_formatter()) - logger.addHandler(logging_file_handler) + try: + log_file = pathlib.Path(logs_dir) / f"{component_name}.log" + logging_file_handler = logging.FileHandler(filename=log_file, encoding="utf-8") + logging_file_handler.setFormatter(get_logging_formatter()) + logger.addHandler(logging_file_handler) + except OSError: + logger.exception( + "Failed to set up file logging for %s; continuing with stdout.", component_name + ) set_logging_level(logger, os.getenv("CLP_LOGGING_LEVEL"))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/clp-py-utils/clp_py_utils/clp_logging.py` around lines 43 - 63, The FileHandler creation in configure_logging can raise if CLP_LOGS_DIR is missing or unwritable; wrap the Path/ FileHandler setup in a try/except and harden it by ensuring the directory exists (use pathlib.Path(...).parent.mkdir(parents=True, exist_ok=True)) before creating logging.FileHandler, and on exception fall back to adding a std stream handler (logging.StreamHandler(sys.stdout)) and emit a warning via logger.warning so startup doesn't abort; keep calls to get_logging_formatter() and set_logging_level(logger, os.getenv("CLP_LOGGING_LEVEL")) intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/clp-py-utils/clp_py_utils/clp_logging.py`:
- Around line 31-37: The warning call in set_logging_level eagerly formats the
message; change logger.warning(f"Invalid logging level: {level}, using INFO as
default") to use lazy formatting (logger.warning("Invalid logging level: %s,
using INFO as default", level)) so the string interpolation is deferred when the
warning level is disabled; keep the existing behavior of falling back to
logger.setLevel(logging.INFO) and the checks against get_args(LoggingLevel).
In `@components/job-orchestration/job_orchestration/reducer/reducer.py`:
- Line 66: Replace the list concatenation that builds reducer_instance_cmd with
iterable unpacking to follow RUF005: instead of reducer_instance_cmd =
reducer_cmd + [str(clp_config.reducer.base_port + i)], construct
reducer_instance_cmd using unpacking of reducer_cmd and the new string element
(referencing reducer_instance_cmd, reducer_cmd, clp_config.reducer.base_port,
and i) so the resulting list is created with [*reducer_cmd,
str(clp_config.reducer.base_port + i)] style.
In `@tools/deployment/package-helm/templates/compression-worker-deployment.yaml`:
- Around line 87-91: Add a sizeLimit to the emptyDir volumes for the tmp and
staged-archives volumes to prevent unbounded disk usage; locate the volume
blocks where name is generated with {{ include "clp.volumeName" ... "tmp" }} and
the staged-archives entry and add a sizeLimit (e.g. "1Gi" or another value tuned
to expected workload) under emptyDir: so the emptyDir declaration uses sizeLimit
to cap disk usage for compression-worker intermediate data and S3 staging.
In `@tools/deployment/package-helm/templates/query-worker-deployment.yaml`:
- Around line 41-44: CLP_LOGS_DIR is set to /var/log/query_worker but there is
no corresponding volumeMount/volume, so container writes go to the ephemeral
overlay; either remove CLP_LOGS_DIR if file logs aren't needed or add a volume
and mount it at /var/log/query_worker (e.g., an emptyDir or a PVC) in the
Deployment spec for the container that defines CLP_LOGS_DIR and
CLP_WORKER_LOG_PATH; ensure the volume name used in volumeMounts matches the
volume entry and replicate the same change for the other worker deployment that
also sets CLP_LOGS_DIR.
---
Duplicate comments:
In `@components/clp-py-utils/clp_py_utils/clp_logging.py`:
- Around line 43-63: The FileHandler creation in configure_logging can raise if
CLP_LOGS_DIR is missing or unwritable; wrap the Path/ FileHandler setup in a
try/except and harden it by ensuring the directory exists (use
pathlib.Path(...).parent.mkdir(parents=True, exist_ok=True)) before creating
logging.FileHandler, and on exception fall back to adding a std stream handler
(logging.StreamHandler(sys.stdout)) and emit a warning via logger.warning so
startup doesn't abort; keep calls to get_logging_formatter() and
set_logging_level(logger, os.getenv("CLP_LOGGING_LEVEL")) intact.
In `@components/job-orchestration/job_orchestration/reducer/reducer.py`:
- Around line 65-97: The reducer log file handles are leaked if open() or
subprocess.Popen raises during the spawn loop; wrap the whole spawn-and-wait
section (the loop that builds reducer_instance_cmd, opens files into
reducer_log_files, appends subprocess.Popen into reducers, and the subsequent
for i, reducer in enumerate(reducers): reducer.communicate() block) in a
try/finally so cleanup always runs; in the finally iterate over reducers to
gracefully stop/communicate/kill any started subprocesses (e.g., call
terminate() and then communicate() or kill() if needed) and then iterate over
reducer_log_files to close every open file handle, ensuring both
reducer_log_files and reducers are referenced in the finally block so no file or
process remains leaked.
In `@tools/deployment/package-helm/templates/compression-worker-deployment.yaml`:
- Around line 41-44: The CLP_LOGS_DIR environment variable in the compression
worker deployment points to /var/log/compression_worker but there is no backing
volume; either remove the CLP_LOGS_DIR env entry or add an emptyDir volume and
mount it into the compression worker container at /var/log/compression_worker.
To fix, edit the compression-worker container spec: if keeping CLP_LOGS_DIR, add
a volumeMount with mountPath "/var/log/compression_worker" and a matching
volumes entry (type emptyDir) in the pod spec; otherwise delete the env var
CLP_LOGS_DIR (and any related references) so logs default to CLP_WORKER_LOG_PATH
(/dev/stdout). Ensure the changes target the compression-worker container
definition and the CLP_LOGS_DIR env name.
In `@tools/deployment/package-helm/templates/query-worker-deployment.yaml`:
- Around line 106-110: The emptyDir for the staged-streams volume currently has
no size limit which can exhaust node disk; update the deployment template where
the volume name is generated via include "clp.volumeName" with
dict("component_category":"query-worker","name":"staged-streams") to replace the
bare emptyDir: {} with an emptyDir that specifies a sizeLimit (e.g., "5Gi") or
make it configurable via a Helm value, so the staged-streams emptyDir has an
explicit size cap.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (12)
components/clp-mcp-server/clp_mcp_server/clp_mcp_server.pycomponents/clp-py-utils/clp_py_utils/clp_logging.pycomponents/job-orchestration/job_orchestration/garbage_collector/archive_garbage_collector.pycomponents/job-orchestration/job_orchestration/garbage_collector/garbage_collector.pycomponents/job-orchestration/job_orchestration/garbage_collector/search_result_garbage_collector.pycomponents/job-orchestration/job_orchestration/garbage_collector/utils.pycomponents/job-orchestration/job_orchestration/reducer/reducer.pycomponents/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.pycomponents/job-orchestration/job_orchestration/scheduler/query/query_scheduler.pytools/deployment/package-helm/templates/compression-worker-deployment.yamltools/deployment/package-helm/templates/query-worker-deployment.yamltools/deployment/package-helm/values.yaml
💤 Files with no reviewable changes (1)
- components/job-orchestration/job_orchestration/garbage_collector/utils.py
| def set_logging_level(logger: logging.Logger, level: str | None): | ||
| if level is None: | ||
| logger.setLevel(logging.INFO) | ||
| return | ||
| if level not in get_args(LoggingLevel): | ||
| logger.warning(f"Invalid logging level: {level}, using INFO as default") | ||
| logger.setLevel(logging.INFO) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n components/clp-py-utils/clp_py_utils/clp_logging.py | sed -n '25,42p'Repository: y-scope/clp
Length of output: 671
🏁 Script executed:
rg -n "logger\.(debug|info|warning|error|exception)" components/clp-py-utils/clp_py_utils/clp_logging.pyRepository: y-scope/clp
Length of output: 138
Use lazy formatting for the warning log.
This avoids eager string interpolation when the warning level is disabled. Line 36 requires adjustment to comply with codebase logging standards.
♻️ Suggested change
- logger.warning(f"Invalid logging level: {level}, using INFO as default")
+ logger.warning("Invalid logging level: %s, using INFO as default", level)🧰 Tools
🪛 Ruff (0.15.2)
[warning] 36-36: Logging statement uses f-string
(G004)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/clp-py-utils/clp_py_utils/clp_logging.py` around lines 31 - 37,
The warning call in set_logging_level eagerly formats the message; change
logger.warning(f"Invalid logging level: {level}, using INFO as default") to use
lazy formatting (logger.warning("Invalid logging level: %s, using INFO as
default", level)) so the string interpolation is deferred when the warning level
is disabled; keep the existing behavior of falling back to
logger.setLevel(logging.INFO) and the checks against get_args(LoggingLevel).
| reducer_log_files: list[TextIO] = [] | ||
| concurrency = max(int(parsed_args.concurrency), 1) | ||
| for i in range(concurrency): | ||
| reducer_instance_cmd = reducer_cmd + [str(clp_config.reducer.base_port + i)] |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Prefer iterable unpacking over list concatenation (RUF005).
♻️ Proposed refactor
- reducer_instance_cmd = reducer_cmd + [str(clp_config.reducer.base_port + i)]
+ reducer_instance_cmd = [*reducer_cmd, str(clp_config.reducer.base_port + i)]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| reducer_instance_cmd = reducer_cmd + [str(clp_config.reducer.base_port + i)] | |
| reducer_instance_cmd = [*reducer_cmd, str(clp_config.reducer.base_port + i)] |
🧰 Tools
🪛 Ruff (0.15.2)
[warning] 66-66: Consider iterable unpacking instead of concatenation
Replace with iterable unpacking
(RUF005)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@components/job-orchestration/job_orchestration/reducer/reducer.py` at line
66, Replace the list concatenation that builds reducer_instance_cmd with
iterable unpacking to follow RUF005: instead of reducer_instance_cmd =
reducer_cmd + [str(clp_config.reducer.base_port + i)], construct
reducer_instance_cmd using unpacking of reducer_cmd and the new string element
(referencing reducer_instance_cmd, reducer_cmd, clp_config.reducer.base_port,
and i) so the resulting list is created with [*reducer_cmd,
str(clp_config.reducer.base_port + i)] style.
| - name: {{ include "clp.volumeName" (dict | ||
| "component_category" "compression-worker" | ||
| "name" "tmp" | ||
| ) | nindent 10 }} | ||
| ) | quote }} | ||
| emptyDir: {} |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider adding sizeLimit to emptyDir volumes for tmp and staged-archives.
Both the tmp volume (line 91) and the staged-archives volume (line 106) use emptyDir: {} without a sizeLimit. Compression workers can generate significant intermediate data — especially when staging archives for S3 upload or writing to /var/tmp during compression tasks. Without a limit, a misbehaving or high-throughput workload could exhaust node disk space and affect co-located pods.
🛡️ Suggested improvement
) | quote }}
- emptyDir: {}
+ emptyDir:
+ sizeLimit: "10Gi"Apply similar limits (tuned to expected workloads) to both emptyDir declarations.
Also applies to: 102-106
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/templates/compression-worker-deployment.yaml`
around lines 87 - 91, Add a sizeLimit to the emptyDir volumes for the tmp and
staged-archives volumes to prevent unbounded disk usage; locate the volume
blocks where name is generated with {{ include "clp.volumeName" ... "tmp" }} and
the staged-archives entry and add a sizeLimit (e.g. "1Gi" or another value tuned
to expected workload) under emptyDir: so the emptyDir declaration uses sizeLimit
to cap disk usage for compression-worker intermediate data and S3 staging.
| - name: "CLP_LOGS_DIR" | ||
| value: "/var/log/query_worker" | ||
| - name: "CLP_WORKER_LOG_PATH" | ||
| value: "/var/log/query_worker/worker.log" | ||
| value: "/dev/stdout" |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check references to CLP_LOGS_DIR in Python and shell code
echo "=== CLP_LOGS_DIR references (Python) ==="
rg 'CLP_LOGS_DIR' --type py -C 2
echo ""
echo "=== CLP_LOGS_DIR references (Shell) ==="
rg 'CLP_LOGS_DIR' --type sh -C 2
echo ""
echo "=== CLP_LOGS_DIR references (YAML) ==="
rg 'CLP_LOGS_DIR' --type yaml -C 2Repository: y-scope/clp
Length of output: 11195
🏁 Script executed:
#!/bin/bash
# Examine the query-worker-deployment.yaml file to see volumes and volume mounts
echo "=== Checking query-worker-deployment.yaml structure ==="
if [ -f "tools/deployment/package-helm/templates/query-worker-deployment.yaml" ]; then
echo "File exists. Total lines:"
wc -l "tools/deployment/package-helm/templates/query-worker-deployment.yaml"
echo ""
echo "=== Environment variables section (lines 40-50) ==="
sed -n '40,50p' "tools/deployment/package-helm/templates/query-worker-deployment.yaml"
echo ""
echo "=== Volumes section (lines 95-120) ==="
sed -n '95,120p' "tools/deployment/package-helm/templates/query-worker-deployment.yaml"
echo ""
echo "=== Volume mounts section (lines 50-75) ==="
sed -n '50,75p' "tools/deployment/package-helm/templates/query-worker-deployment.yaml"
else
echo "File not found"
fiRepository: y-scope/clp
Length of output: 2659
🏁 Script executed:
#!/bin/bash
# Check compression-worker-deployment.yaml for same pattern
echo "=== Checking compression-worker-deployment.yaml ==="
if [ -f "tools/deployment/package-helm/templates/compression-worker-deployment.yaml" ]; then
echo "File exists. Checking for CLP_LOGS_DIR:"
rg -n 'CLP_LOGS_DIR' "tools/deployment/package-helm/templates/compression-worker-deployment.yaml" -C 2
else
echo "File not found"
fiRepository: y-scope/clp
Length of output: 415
CLP_LOGS_DIR is set but has no backing volume mount in the Helm deployment.
CLP_LOGS_DIR is configured to /var/log/query_worker (line 42), but there is no corresponding volumeMount or volume for this path. The worker task code (extract_stream_task.py, fs_search_task.py, compression_task.py) actively reads and uses this directory to write log files. Without a volume mount, these writes land in the container's writable overlay layer — consuming node disk without limits and lost on container restart.
In docker-compose, this path is correctly backed by the clp-logs volume, but it is missing in the Helm deployment.
Either:
- Remove the
CLP_LOGS_DIRenv var if file-based logging is not needed, or - Add an
emptyDiror persistent volume mount at/var/log/query_worker.
The same pattern exists in compression-worker-deployment.yaml (lines 41–42).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/templates/query-worker-deployment.yaml` around
lines 41 - 44, CLP_LOGS_DIR is set to /var/log/query_worker but there is no
corresponding volumeMount/volume, so container writes go to the ephemeral
overlay; either remove CLP_LOGS_DIR if file logs aren't needed or add a volume
and mount it at /var/log/query_worker (e.g., an emptyDir or a PVC) in the
Deployment spec for the container that defines CLP_LOGS_DIR and
CLP_WORKER_LOG_PATH; ensure the volume name used in volumeMounts matches the
volume entry and replicate the same change for the other worker deployment that
also sets CLP_LOGS_DIR.
# Conflicts: # tools/deployment/package-helm/templates/database-data-pv.yaml # tools/deployment/package-helm/templates/database-logs-pv.yaml # tools/deployment/package-helm/templates/database-statefulset.yaml # tools/deployment/package-helm/templates/db-table-creator-job.yaml # tools/deployment/package-helm/templates/queue-logs-pv.yaml # tools/deployment/package-helm/templates/queue-statefulset.yaml # tools/deployment/package-helm/templates/redis-data-pv.yaml # tools/deployment/package-helm/templates/redis-logs-pv.yaml # tools/deployment/package-helm/templates/redis-statefulset.yaml # tools/deployment/package-helm/templates/results-cache-data-pv.yaml # tools/deployment/package-helm/templates/results-cache-indices-creator-job.yaml # tools/deployment/package-helm/templates/results-cache-logs-pv.yaml # tools/deployment/package-helm/templates/results-cache-statefulset.yaml
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/deployment/package-helm/Chart.yaml`:
- Line 3: The Chart.yaml version was only incremented in the pre-release counter
("version: \"0.1.4-dev.4\"") despite this PR being a breaking change (removes
the local-storage StorageClass and many PV/PVC templates and values), so update
the Chart.yaml version to reflect semver-breaking changes—bump at least the
minor segment (e.g., "0.2.0-dev.0") or the major if your release policy requires
it (e.g., "1.0.0-dev.0") so consumers are correctly signaled; ensure the new
version string replaces the existing "0.1.4-dev.4" value in Chart.yaml.
- Line 3: This PR is a breaking change but the Chart.yaml "version" field
remains a pre-release patch; update the Chart.yaml version value (the version
key in Chart.yaml) to bump at least the MINOR for a breaking change—for example
change "version: \"0.1.4-dev.4\"" to "version: \"0.2.0-dev.0\"" (or to your
team's chosen stable-release signaling scheme) so the chart semver reflects the
breaking change.
In
`@tools/deployment/package-helm/templates/compression-scheduler-deployment.yaml`:
- Around line 88-93: The emptyDir for the volume named via include
"clp.volumeName" with component_category "compression-scheduler" and name "log"
is unbounded; update the volume spec (the block that currently shows emptyDir:
{}) to include a sizeLimit (e.g. set emptyDir: { sizeLimit: "<appropriate-size>"
}) to cap disk usage and prevent node disk pressure, choosing a size appropriate
for compression-scheduler logs/artifacts; ensure the change only affects that
specific volume entry and keep the "config" volume unchanged.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (8)
docs/src/user-docs/quick-start/clp-json.mddocs/src/user-docs/quick-start/clp-text.mdtools/deployment/package-helm/Chart.yamltools/deployment/package-helm/templates/_helpers.tpltools/deployment/package-helm/templates/compression-scheduler-deployment.yamltools/deployment/package-helm/templates/configmap.yamltools/deployment/package-helm/templates/query-scheduler-deployment.yamltools/deployment/package-helm/values.yaml
💤 Files with no reviewable changes (3)
- docs/src/user-docs/quick-start/clp-text.md
- docs/src/user-docs/quick-start/clp-json.md
- tools/deployment/package-helm/templates/query-scheduler-deployment.yaml
| - name: {{ include "clp.volumeName" (dict | ||
| "component_category" "compression-scheduler" | ||
| "name" "logs" | ||
| ) | nindent 10 }} | ||
| - {{- include "clp.pvcVolume" (dict | ||
| "root" . | ||
| "component_category" "compression-scheduler" | ||
| "name" "user-logs" | ||
| ) | nindent 10 }} | ||
| "name" "log" | ||
| ) | quote }} | ||
| emptyDir: {} | ||
| - name: "config" |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider bounding the emptyDir size for better node-level resilience.
Line 92 currently creates an unbounded scratch volume. Adding a sizeLimit helps avoid disk pressure from unexpected log/artifact growth.
♻️ Proposed hardening
- name: {{ include "clp.volumeName" (dict
"component_category" "compression-scheduler"
"name" "log"
) | quote }}
- emptyDir: {}
+ emptyDir:
+ sizeLimit: "1Gi"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: {{ include "clp.volumeName" (dict | |
| "component_category" "compression-scheduler" | |
| "name" "logs" | |
| ) | nindent 10 }} | |
| - {{- include "clp.pvcVolume" (dict | |
| "root" . | |
| "component_category" "compression-scheduler" | |
| "name" "user-logs" | |
| ) | nindent 10 }} | |
| "name" "log" | |
| ) | quote }} | |
| emptyDir: {} | |
| - name: "config" | |
| - name: {{ include "clp.volumeName" (dict | |
| "component_category" "compression-scheduler" | |
| "name" "log" | |
| ) | quote }} | |
| emptyDir: | |
| sizeLimit: "1Gi" | |
| - name: "config" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@tools/deployment/package-helm/templates/compression-scheduler-deployment.yaml`
around lines 88 - 93, The emptyDir for the volume named via include
"clp.volumeName" with component_category "compression-scheduler" and name "log"
is unbounded; update the volume spec (the block that currently shows emptyDir:
{}) to include a sizeLimit (e.g. set emptyDir: { sizeLimit: "<appropriate-size>"
}) to cap disk usage and prevent node disk pressure, choosing a size appropriate
for compression-scheduler logs/artifacts; ensure the change only affects that
specific volume entry and keep the "config" volume unchanged.
There was a problem hiding this comment.
♻️ Duplicate comments (1)
tools/deployment/package-helm/Chart.yaml (1)
3-3: LGTM — version bump correctly addresses the breaking-change semver concern.Advancing from
0.1.4-dev.4to0.2.0-dev.0resets the MINOR segment as recommended for a breaking change and resets the pre-release counter, which is the expected convention.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/deployment/package-helm/Chart.yaml` at line 3, The Chart.yaml version was correctly bumped to "0.2.0-dev.0" to reflect the breaking change; no code changes are required—confirm the version string in Chart.yaml (the version field) is exactly "0.2.0-dev.0" and keep the change as-is before merging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@tools/deployment/package-helm/Chart.yaml`:
- Line 3: The Chart.yaml version was correctly bumped to "0.2.0-dev.0" to
reflect the breaking change; no code changes are required—confirm the version
string in Chart.yaml (the version field) is exactly "0.2.0-dev.0" and keep the
change as-is before merging.
| - name: {{ include "clp.volumeName" (dict | ||
| "component_category" "query-worker" | ||
| "name" "logs" | ||
| ) | quote }} | ||
| mountPath: "/var/log/query_worker" |
There was a problem hiding this comment.
Just to confirm: this writes all %d-clo.log to a pod's folder, which is not persistent and is only accessible by kubectl exec to that pod?
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
The compression-worker and query-worker set CLP_LOGS_DIR to /var/log/compression_worker and /var/log/query_worker respectively, but no volume was mounted at those paths, causing FileNotFoundError when tasks try to write stderr log files. Add emptyDir volumes for both.
The {{- if has "..." .Values.clpConfig.bundled }} guards from PR y-scope#1681
were lost when resolving merge conflicts with main. Restore them for:
- database-statefulset.yaml
- db-table-creator-job.yaml (initContainers)
- queue-statefulset.yaml
- redis-statefulset.yaml
- results-cache-statefulset.yaml
- results-cache-indices-creator-job.yaml (initContainers)
Without these guards, Helm renders bundled third-party resources even
when configured to use external services.
The merge incorrectly hardcoded the MongoDB URI using clp.fullname instead of the clp.resultsCacheHost/clp.resultsCachePort helpers introduced in PR y-scope#1681 for external service support.
Replace the detailed directory listing and NFS setup instructions with a concise note about manually provisioning PVs and updating accessModes when using filesystem storage.
hoophalab
left a comment
There was a problem hiding this comment.
LGTM
Validations:
- docker compose logs are the same as before
- helm logs are written to stdout
- compressing and querying in webui on helm deploys works as before
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
Description
Redirect all CLP service logs from host-path-backed PersistentVolumes to pod stdout/stderr, and
switch all persistent data storage from static host-path PVs to either dynamic provisioning via the
cluster's default StorageClass (database, results-cache, shared-data archives/streams) or
emptyDir(Redis data, staging/tmp volumes).
This eliminates all static PV/PVC template files and the custom
local-storageStorageClass,following Kubernetes best practices where logs are captured by
kubectl logs, ephemeral servicestate uses
emptyDir, and persistent data uses dynamically provisioned volumes.Summary of changes:
Source code (Python + Rust):
CLP_LOGS_DIRoptional in all scheduler, garbage-collector, reducer, and MCP servercomponents. When unset, services log only to stdout (console handler from
get_logger()).api_server,log_ingestor):set_up_logging()falls back to stdout-only whenCLP_LOGS_DIRis absent. Return type changed toOption<WorkerGuard>.Helm chart — PV/PVC:
local-storageStorageClass template.clp.createStaticPvhelper from_helpers.tpl. Updateclp.createPvcto use thecluster's default StorageClass (remove hardcoded
storageClassNameandselector).ReadWriteOncewith dynamic provisioning.volumeClaimTemplates: removestorageClassName,selector, andpersistentVolumeClaimRetentionPolicy.emptyDirfor: compression-worker tmp, compression-workerstaged-archives, query-worker staged-streams, redis data.
volumeClaimTemplateswith dynamicprovisioning (10Gi RWO) for persistent data across pod restarts.
volumeClaimTemplatesandpersistentVolumeClaimRetentionPolicyfrom redis statefulset.Helm chart — logs and securityContext:
CLP_LOGS_DIRenv var and log volume mounts from 9 deployments.CLP_WORKER_LOG_PATH=/dev/stdoutfor directstdout logging. Retain
CLP_LOGS_DIR=/tmpfor Celery executor tasks (not yet updated).emptyDirat/var/logfor compression-scheduler and garbage-collector (used foruser-failure logs and GC recovery files respectively).
securityContext(runAsUser/runAsGroup/fsGroup) from all statefulsets (database, redis,results-cache, queue) and all first-party deployments/jobs (api-server, compression-scheduler,
compression-worker, db-table-creator, garbage-collector, log-ingestor, mcp-server,
query-scheduler, query-worker, reducer, results-cache-indices-creator, webui). The CLP Docker
image already runs as the correct user.
securityContextsection entirely fromvalues.yaml. Removestoragesection anddata_directory/logs_directory/tmp_directoryentries (no longer referenced by any template).Infrastructure configs (configmap):
general_logis not enabled for either MariaDB or MySQL — the database process runsas a non-root user that cannot write to
/dev/stdout. Error log (stderr) is sufficient.logfile ""(logs to stdout).destination: fileandpathfromsystemLog(defaults to stdout).RABBITMQ_LOGS="-"andRABBITMQ_SASL_LOGS="-"(stdout).Setup scripts:
create_clp_directories()now only creates$CLP_HOME/samples(all data directories aredynamically provisioned or
emptyDir).Impact Assessment
local-storageStorageClass deleted. Remainingpersistent storage (database, results-cache, shared-data archives/streams) uses dynamically
provisioned PVCs via the cluster's default StorageClass.
kubectl logs. Previously requiredexec-ing into containers or inspecting host-path directories.
persistent via dynamically provisioned PVCs. Redis uses
emptyDirand RabbitMQ has no persistentvolume — their data is reconstructable and does not require persistence across pod restarts.
already sets the correct user, and dynamically provisioned volumes /
emptyDirdon't requireexplicit UID/GID configuration.
create_clp_directories()reduces to a singlemkdirfor thesamples directory. No host-path data directories need to be pre-created.
CLP_LOGS_DIRis no longer required for most services. Existing non-Helmdeployments that set
CLP_LOGS_DIRcontinue to work (file logging is additive).PV/PVC inventory
Log PV/PVCs
All log PV/PVCs are removed. Services now log to stdout/stderr via
kubectl logs.api-server-logs-pv.yaml*-logs-pvc.yamlcompression-scheduler-logs-pv.yaml*-logs-pvc.yaml/var/logfor user-failure logs)compression-worker-logs-pv.yaml*-logs-pvc.yamlgarbage-collector-logs-pv.yaml*-logs-pvc.yaml/var/logfor GC recovery files)log-ingestor-logs-pv.yaml*-logs-pvc.yamlmcp-server-logs-pv.yaml*-logs-pvc.yamlquery-scheduler-logs-pv.yaml*-logs-pvc.yamlquery-worker-logs-pv.yaml*-logs-pvc.yamlreducer-logs-pv.yaml*-logs-pvc.yamldatabase-logs-pv.yamlvolumeClaimTemplatesin statefulsetqueue-logs-pv.yamlvolumeClaimTemplatesin statefulsetredis-logs-pv.yamlvolumeClaimTemplatesin statefulsetresults-cache-logs-pv.yamlvolumeClaimTemplatesin statefulsetAdditionally,
compression-scheduler-user-logs-pv.yamland its PVC are removed (the emptyDir at/var/logreplaces it).Data PV/PVCs
database-data-pv.yamlvolumeClaimTemplatesin statefulsetredis-data-pv.yamlvolumeClaimTemplatesin statefulsetresults-cache-data-pv.yamlvolumeClaimTemplatesin statefulsetshared-data-archives-pv.yaml*-pvc.yamlshared-data-streams-pv.yaml*-pvc.yamlStaging/Tmp PV/PVCs
compression-worker-tmp-pv.yaml*-tmp-pvc.yamlcompression-worker-staged-archives-pv.yaml*-pvc.yamlquery-worker-staged-streams-pv.yaml*-pvc.yamlInfrastructure
storage-class.yamllocal-storageStorageClass no longer needed_helpers.tplclp.createStaticPv; simplifiedclp.createPvc(no storageClassName/selector)data_directory,logs_directory,tmp_directoryusageThese three entries have been removed from
values.yamlsince no Helm template references them.The configmap hardcodes the container-internal paths directly.
clpConfig.data_directory/tmp/clp/var/data/var/dataclpConfig.logs_directory/tmp/clp/var/log/var/logclpConfig.tmp_directory/tmp/clp/var/tmp/var/tmpThe configmap hardcodes
/var/data,/var/log, and/var/tmpas container-internal mount points.These correspond to dynamically provisioned PVCs, emptyDir volumes, or in-container paths depending
on the component. The old
values.yamlentries (/tmp/clp/var/...) were host paths consumed by thenow-deleted
clp.createStaticPvhelper.Documentation updates
docs/src/user-docs/quick-start/clp-json.mdCLP_HOME,mkdir -phost directory creation,$CLP_HOMEkind mount, and--set clpConfig.data_directory/logs_directory/tmp_directory/archive_output.storage.directory/stream_output.storage.directoryfromhelm install.docs/src/user-docs/quick-start/clp-text.mddocs/src/user-docs/guides-k8s-deployment.mddata_directory/logs_directory/tmp_directory, logging infra issue #1760 note). Removemkdir -phost directory creation and directory helm flags from "Basic installation". Update uninstall warning to reference dynamically provisioned PVCs instead of static PVs. Remove stalelogging-infra-issuelink.docs/src/dev-docs/design-deployment-orchestration.mdemptyDirfor ephemeral state, service logs to pod stdout/stderr.Checklist
breaking change.
Validation performed
0. Build local Docker image
Task: Build the CLP package Docker image with source code changes.
Command:
Build completed successfully — image
clp-package:dev-junhao-e83c.1. Test with MariaDB (default)
Task: Deploy the full Helm chart with MariaDB and a locally-built image. Verify all pods start
with no errors, dynamic provisioning creates 4 PVCs via the
standardStorageClass (kind'sdefault), and emptyDir log volumes are mounted on compression-worker and query-worker.
Command:
Output (2026-02-26 08:19 UTC):
All 14 pods running with 0 restarts.
1.1 Verify PVCs are dynamically provisioned
Command:
Output:
All 4 PVCs are
BoundwithSTORAGECLASS=standard.1.2 Verify emptyDir log volumes are mounted
Commands:
Output:
Both emptyDir log volumes are mounted and writable.
1.3 Verify compression and search (end-to-end)
Commands:
Output:
Compression job 2 completed successfully:
Stderr log file written to emptyDir:
Search jobs dispatched and query worker wrote log files to emptyDir:
Search tasks returned
return_code=1due to missing timestamp column in the archive (pre-existingarchive format limitation, not related to this PR).
2. Test with MySQL
Task: Deploy the Helm chart with
database.type=mysqland verify MySQL starts without errors.Command:
Output (2026-02-26 08:30 UTC):
All 14 pods running with 0 restarts.
2.1 Verify PVCs are dynamically provisioned
Command:
Output:
All 4 PVCs are
BoundwithSTORAGECLASS=standard.3. Test Docker Compose flow (
build/clp-package)Task: Start all services via Docker Compose, verify file logging still works when
CLP_LOGS_DIRis set, and run a compression job.
Commands:
cd build/clp-package bash sbin/start-clp.sh bash sbin/compress.sh /tmp/clp-validation/samples/postgresqlOutput (2026-02-26 08:37 UTC):
All services started and reported healthy. Compression completed successfully:
3.1 Verify log files are written (Docker Compose sets
CLP_LOGS_DIR)All services with
CLP_LOGS_DIRset produce log files on the host-mountedvar/log/directory:compression_scheduler.log(job dispatch/completion)worker.log,compression-job-1-task-1-stderr.logquery_scheduler.loggarbage_collector.log,search-result-garbage-collector.logreducer.log(Python),reducer-{0..7}.log(subprocesses)api_server.log.2026-02-26-08(hourly rotation)Docker logs (
docker logs <container>) also show the same output — confirming the additivebehaviour: when
CLP_LOGS_DIRis set, services log to both stdout and file.4. Test all-external services in Kubernetes (bundled guards, PR #1681 Scenario 4)
Task: Deploy with
bundled: []and all four services (database, queue, redis, results-cache)running as external Docker containers. Verify no StatefulSets/Services are created for bundled
resources, and all CLP pods connect to external hosts.
Setup — start four external services on host (192.168.3.89):
docker run -d --name ext-k8s-mariadb -p 192.168.3.89:13306:3306 \ -e MYSQL_ROOT_PASSWORD=root-pass -e MYSQL_DATABASE=clp-db \ -e MYSQL_USER=clp-user -e MYSQL_PASSWORD=pass mariadb:10-jammy docker run -d --name ext-k8s-rabbitmq -p 192.168.3.89:15672:5672 \ -e RABBITMQ_DEFAULT_USER=clp-user -e RABBITMQ_DEFAULT_PASS=pass rabbitmq:3.9.8 docker run -d --name ext-k8s-redis -p 192.168.3.89:16379:6379 \ redis:7.2.4 redis-server --requirepass 'pass' docker run -d --name ext-k8s-mongodb -p 192.168.3.89:17017:27017 \ mongo:7.0.1 mongod --replSet rs0 --bind_ip_allCommand:
Pod status (2026-02-26 13:43 UTC):
No database, queue, redis, or results-cache StatefulSets — all external.
4.1 Verify no bundled resources
Only 2 PVCs (shared-data), no database/results-cache PVCs.
4.2 Verify external service connections
compression-worker → external RabbitMQ + Redis:
query-scheduler → external MariaDB:
All CLP pods connect to external services. Bundled guards correctly exclude all
StatefulSet/Service resources when
bundled: [].