Skip to content

fix: docker healthcheck, CI optimization, and container hardening#436

Merged
Aureliolo merged 5 commits intomainfrom
fix/docker-healthcheck-and-ci
Mar 15, 2026
Merged

fix: docker healthcheck, CI optimization, and container hardening#436
Aureliolo merged 5 commits intomainfrom
fix/docker-healthcheck-and-ci

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

  • Fix healthcheck CMD prefix — Docker Compose array-form healthcheck test must start with "CMD"; added to CLI template + golden files
  • Optimize Trivy CI scans — merge two sequential scan steps (CRITICAL + HIGH) into a single JSON-based run with parsed results; fail on CRITICAL only, warn on HIGH (~25-30s saving per image)
  • Add CIS Docker Benchmark scanningtrivy image --compliance docker-cis-1.6.0 for both images (informational, continue-on-error: true)
  • Harden compose — resource limits (4G/2CPU backend, 256M/0.5CPU web), pids_limit (256/64), nodev on all tmpfs, log rotation (max-size: 10m, max-file: 3), named network (synthorg-net)
  • Pre-compress static assets — build-time gzip -9 -k in web Dockerfile + gzip_static on in nginx (zero CPU at serve time, better compression ratios)
  • Supply chain hardeningnpm ci --ignore-scripts in web builder stage
  • Consolidate Dockerfile layers — merge 3 RUN commands into 1 in backend setup stage and sandbox
  • Add OCI labels to sandbox Dockerfile
  • Tighten .dockerignore — exclude site/, scripts/, .hypothesis/, web/__tests__/, web/vitest.config.*
  • Increase backend healthcheck start-period — 15s → 30s for Mem0/Qdrant init headroom
  • Add image/svg+xml to nginx gzip_types
  • Mirror all hardening in CLI compose template + golden test files
  • Update docsdocs/security.md CIS table (5.28, nodev, resource limits, CIS scan), CLAUDE.md CI description

Test plan

  • cd cli && go test ./... — all pass (compose golden files verified)
  • docker build backend + web images locally — both succeed
  • trivy image --compliance docker-cis-1.6.0 — backend all PASS, web PASS except 4.4 (upstream zlib CVE in Alpine, awaiting nginx base image rebuild)
  • npm ci --ignore-scripts + npm run build — web build succeeds with all pure-JS deps
  • gzip pre-compression produces .gz files alongside originals in dist/
  • Pre-commit hooks pass (hadolint, gitleaks, trailing whitespace, commitizen)
  • Pre-push hooks pass (mypy, pytest, go vet, go test)

Pre-reviewed by 2 agents (docs-consistency, infra-reviewer), 10 findings addressed.

…rdening

- Fix healthcheck array-form test to start with "CMD" (required by Compose)
- Merge two sequential Trivy scan steps into single JSON-based run (~25-30s saving)
- Add resource limits (4G/2CPU backend, 256M/0.5CPU web), pids_limit, nodev on
  tmpfs, log rotation, and named network to compose
- Pre-compress static assets at build time (gzip -9) with gzip_static on in nginx
- Add --ignore-scripts to npm ci for supply chain hardening
- Consolidate multi-RUN layers in backend setup and sandbox stages
- Add OCI labels to sandbox Dockerfile
- Add CIS Docker Benchmark compliance scan (Trivy, informational) to CI
- Tighten .dockerignore: exclude site/, scripts/, .hypothesis/, web test infra
- Increase backend healthcheck start-period from 15s to 30s for Mem0 init headroom
- Add image/svg+xml to nginx gzip_types
- Mirror hardening improvements in CLI compose template + golden files
- Use env var pattern for image ref in CIS scan steps (injection defense)
- Add trivy PATH guard to CIS steps (prevents silent skip masking)
- Sync start_period 15s→30s in CLI template + golden files (matches Dockerfile)
- Add clean-results UX to Trivy eval (no lonely header on zero vulns)
- Add healthcheck provenance comment to docker/compose.yml
- Add --ignore-scripts trade-off comment to web Dockerfile
- Update docs/security.md: CIS 5.28, nodev, resource limits, CIS scan
- Update CLAUDE.md: mention CIS compliance scan in Docker CI description

Pre-reviewed by 2 agents, 10 findings addressed
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 15, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

PackageVersionScoreDetails
actions/actions/upload-artifact bbbca2ddaa5d8feaa63e36b76fdaad77386f024f 🟢 6.2
Details
CheckScoreReason
Maintained🟢 1028 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 10
Binary-Artifacts🟢 10no binaries found in the repo
Code-Review🟢 10all changesets reviewed
Dangerous-Workflow🟢 10no dangerous workflow patterns detected
Packaging⚠️ -1packaging workflow not detected
CII-Best-Practices⚠️ 0no effort to earn an OpenSSF best practices badge detected
Token-Permissions⚠️ 0detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies⚠️ 1dependency not pinned by hash detected -- score normalized to 1
Fuzzing⚠️ 0project is not fuzzed
License🟢 10license file detected
Signed-Releases⚠️ -1no releases found
Security-Policy🟢 9security policy file detected
Branch-Protection⚠️ 0branch protection not enabled on development/release branches
SAST🟢 10SAST tool is run on all commits

Scanned Files

  • .github/workflows/docker.yml

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 15, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Serve pre-compressed static assets, expanded MIME coverage, and improved gzip handling
    • Consolidated vulnerability reporting in CI with unified JSON reports, per-context artifacts, and added CIS Docker benchmark checks
  • Chores

    • Container hardening: resource limits, pids caps, tmpfs nodev, logging and healthcheck improvements
    • Streamlined image builds and metadata, reduced build layers, and adjusted npm install behavior
    • Updated ignore rules to reduce build/test noise
  • Documentation

    • Clarified security guidance and scanning behavior in docs

Walkthrough

Consolidates ignore rules; hardens compose templates and Dockerfiles (tmpfs nodev, pids_limit, resource limits, logging, merged RUNs); updates CI to emit Trivy JSON, add evaluation step and CIS Docker Benchmark checks; enables nginx gzip_static and gzips large web assets.

Changes

Cohort / File(s) Summary
Ignore & Build Context
\.dockerignore
Simplify site ignores to site/; add .hypothesis/, scripts/, and extend web test/config ignore patterns.
CI Vulnerability Scanning
.github/workflows/docker.yml, CLAUDE.md
Run Trivy with JSON output, add evaluation step that fails on CRITICAL and warns on HIGH, upload per-context JSON reports, and run CIS Docker Benchmark checks (informational/continue-on-error).
Compose Templates & Testdata
cli/internal/compose/compose.yml.tmpl, cli/testdata/compose_custom_ports.yml, cli/testdata/compose_default.yml, docker/compose.yml
Add pids_limit per service, add nodev to tmpfs mounts, increase deploy.resources.limits (memory/cpus), add centralized json-file logging with rotation, convert healthcheck to CMD form and lengthen start_period, add default network synthorg-net.
Dockerfiles
docker/backend/Dockerfile, docker/sandbox/Dockerfile, docker/web/Dockerfile
Consolidate multiple RUN layers into single RUNs, adjust backend healthcheck start_period to 30s, add OCI LABELs (sandbox), run npm ci --ignore-scripts, and gzip large web build assets post-build.
Web Server & Docs
web/nginx.conf, docs/security.md
Enable gzip_static and gzip_vary, include image/svg+xml in gzip_types, and document CIS additions (nodev tmpfs, pids_limit), resource limits, and log rotation guidance.

Sequence Diagram(s)

mermaid
sequenceDiagram
autonumber
participant GH as GitHub Actions
participant Trivy as Trivy Scanner
participant CIS as CIS Benchmark
participant Eval as Evaluation Step
participant Registry as Container Registry

GH->>Trivy: run Trivy (JSON) for backend and web
GH->>CIS: execute CIS Docker Benchmark checks
Trivy-->>GH: produce trivy-backend.json / trivy-web.json
GH->>Eval: invoke evaluation on JSON results
Eval-->>GH: write summary table and counts
alt CRITICAL found
    Eval-->>GH: fail workflow (prevent push)
else only HIGH or none
    Eval-->>GH: warn or continue
end
GH->>Registry: push/sign/attest (only if evaluation passes)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main changes: Docker healthcheck fix, CI optimization with Trivy scans, and container hardening with resource limits and security improvements.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, covering all major modifications including healthcheck fixes, CI optimization, CIS scanning, container hardening, asset compression, and documentation updates.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/docker-healthcheck-and-ci
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch fix/docker-healthcheck-and-ci
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the project's Docker infrastructure by significantly improving security posture, optimizing CI/CD workflows, and refining container configurations. It incorporates several best practices for container hardening, streamlines build processes, and updates relevant documentation to accurately reflect these changes, ultimately leading to more secure, efficient, and robust deployments.

Highlights

  • Docker Healthcheck Fix: Corrected Docker Compose array-form healthcheck test to explicitly start with "CMD", ensuring proper execution.
  • CI Optimization: Optimized Trivy CI scans by merging sequential steps into a single JSON-based run, failing only on CRITICAL issues and warning on HIGH, saving approximately 25-30 seconds per image.
  • CIS Docker Benchmark Scanning: Integrated trivy image --compliance docker-cis-1.6.0 scanning for both backend and web images as an informational step in the CI pipeline.
  • Docker Compose Hardening: Applied comprehensive hardening to Docker Compose configurations, including resource limits (4G/2CPU for backend, 256M/0.5CPU for web), pids_limit (256/64), nodev on all tmpfs mounts, and log rotation (max-size: 10m, max-file: 3).
  • Static Asset Pre-compression: Implemented build-time gzip -9 -k for web static assets in the Dockerfile and enabled gzip_static on in Nginx, resulting in zero CPU usage at serve time and better compression ratios.
  • Supply Chain Hardening: Enhanced supply chain security by using npm ci --ignore-scripts in the web builder stage to prevent arbitrary script execution during dependency installation.
  • Dockerfile Layer Consolidation: Consolidated multiple RUN commands into single layers in both the backend and sandbox Dockerfiles to improve build efficiency and reduce image size.
  • OCI Labels Added: Added Open Container Initiative (OCI) labels to the sandbox Dockerfile for better image metadata and discoverability.
  • Dockerignore Tightening: Expanded .dockerignore to exclude unnecessary files and directories such as site/, scripts/, .hypothesis/, web/__tests__/, and web/vitest.config.* from Docker builds.
  • Backend Healthcheck Start Period: Increased the backend healthcheck start_period from 15s to 30s to provide more headroom for Mem0/Qdrant initialization.
  • Nginx Gzip Types Update: Added image/svg+xml to the Nginx gzip_types configuration to ensure SVG images are also compressed.
  • Hardening Mirroring: Ensured all hardening changes, including resource limits and healthcheck adjustments, are consistently reflected in CLI compose templates and golden test files.
  • Documentation Updates: Updated docs/security.md to reflect new CIS table entries (5.28, nodev, resource limits, CIS scan) and CLAUDE.md with the updated CI description.
Changelog
  • .dockerignore
    • Excluded the entire site/ directory from Docker builds.
    • Added .hypothesis/ to the list of ignored files.
    • Excluded the scripts/ directory from Docker builds.
    • Added web/__tests__/ and web/vitest.config.* to the list of ignored files.
  • CLAUDE.md
    • Updated the description of the Docker CI workflow to include CIS Docker Benchmark v1.6.0 compliance scanning.
  • cli/internal/compose/compose.yml.tmpl
    • Updated CIS Docker Benchmark hardening comment to include 5.28.
    • Added nodev option to /tmp tmpfs mount for backend.
    • Set pids_limit to 256 for the backend service.
    • Increased backend memory limit to 4G and CPU limit to 2.0.
    • Configured JSON file logging with max-size and max-file options for backend.
    • Prepended "CMD" to the backend healthcheck test command.
    • Increased backend healthcheck start_period to 30s.
    • Added nodev option to tmpfs mounts for the web service.
    • Set pids_limit to 64 for the web service.
    • Increased web memory limit to 256M.
    • Configured JSON file logging with max-size and max-file options for web.
    • Added nodev option to tmpfs mount for the sandbox service.
    • Set pids_limit to 128 for the sandbox service.
    • Configured JSON file logging with max-size and max-file options for sandbox.
  • cli/testdata/compose_custom_ports.yml
    • Mirrored Docker Compose hardening, resource limits, log rotation, and healthcheck format changes for backend and web services.
  • cli/testdata/compose_default.yml
    • Mirrored Docker Compose hardening, resource limits, log rotation, and healthcheck format changes for backend and web services.
  • docker/backend/Dockerfile
    • Consolidated multiple RUN commands into a single layer for venv symlink fix, data directory creation, and ownership setting.
    • Increased HEALTHCHECK start-period from 15s to 30s.
  • docker/compose.yml
    • Added nodev to /tmp tmpfs mount for backend.
    • Set pids_limit to 256 for backend.
    • Added deploy.resources.limits for backend (4G memory, 2.0 cpus).
    • Configured logging for backend.
    • Added nodev to tmpfs mounts for web.
    • Set pids_limit to 64 for web.
    • Added deploy.resources.limits for web (256M memory, 0.5 cpus).
    • Configured logging for web.
    • Defined a named network synthorg-net.
  • docker/sandbox/Dockerfile
    • Added OCI image labels for title, description, URL, source, licenses, and vendor.
    • Consolidated RUN commands for npm symlink, git installation, and user creation into a single layer.
  • docker/web/Dockerfile
    • Modified npm ci command to include --ignore-scripts for supply chain hardening.
    • Added a find command to pre-compress static assets (.js, .css, .html, .svg, .json) using gzip -9 -k after npm run build.
  • docs/security.md
    • Updated CIS 5.25 description to include nodev for tmpfs mounts.
    • Added CIS 5.28 entry for pids_limit.
    • Added a paragraph detailing resource limits and log rotation configurations.
    • Included CIS Docker Benchmark scanning in the list of image scanning tools.
  • web/nginx.conf
    • Enabled gzip_static to serve pre-compressed files.
    • Extended gzip_types to include image/svg+xml.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/docker.yml
Activity
  • The pull request was pre-reviewed by two agents (docs-consistency, infra-reviewer).
  • Ten findings identified during pre-review were addressed by the author.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 15, 2026 10:55 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a comprehensive set of improvements focused on security hardening, CI optimization, and container configuration. The changes include fixing the Docker healthcheck command, adding CIS benchmark scanning, hardening the Docker Compose configurations with resource limits and security options, and optimizing the web image build with pre-compressed static assets. The Dockerfiles have also been improved by consolidating layers. Overall, these are excellent changes that significantly enhance the security posture and efficiency of the application's containerization. I have one suggestion to reduce duplication in the Docker Compose files for better maintainability.

Comment on lines +36 to +40
logging:
driver: json-file
options:
max-size: "10m"
max-file: "3"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To improve maintainability and reduce duplication, consider using YAML anchors for common configurations like logging. This would make the compose file cleaner and easier to update in the future. You could define a reusable block at the top level and reference it in each service.

This principle also applies to cli/internal/compose/compose.yml.tmpl.

For example:

x-logging: &logging
  driver: json-file
  options:
    max-size: "10m"
    max-file: "3"

services:
  backend:
    ...
    logging: *logging
    ...
  web:
    ...
    logging: *logging
    ...

This same pattern could be applied to other duplicated sections like security_opt, cap_drop, etc.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cli/internal/compose/compose.yml.tmpl (1)

108-110: ⚠️ Potential issue | 🟠 Major

Missing named network in CLI template causes compose parity drift.

Line 108 moves directly to volumes: without defining networks.default.name, so CLI-generated compose files won’t include synthorg-net while docker/compose.yml does. That breaks expected alignment between local and generated deployments.

🔧 Proposed fix
+networks:
+  default:
+    name: synthorg-net
+
 volumes:
   synthorg-data:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cli/internal/compose/compose.yml.tmpl` around lines 108 - 110, The compose
template is missing the named network so generated CLI compose files drift from
docker/compose.yml; update the template to declare networks.default.name with
the expected network (synthorg-net) before the volumes block—i.e., add a
networks section setting default.name to "synthorg-net" (referencing the
template's networks.default.name and the expected synthorg-net) so CLI-generated
compose files and docker/compose.yml remain in parity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@cli/internal/compose/compose.yml.tmpl`:
- Around line 108-110: The compose template is missing the named network so
generated CLI compose files drift from docker/compose.yml; update the template
to declare networks.default.name with the expected network (synthorg-net) before
the volumes block—i.e., add a networks section setting default.name to
"synthorg-net" (referencing the template's networks.default.name and the
expected synthorg-net) so CLI-generated compose files and docker/compose.yml
remain in parity.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 77ee68de-22f9-4889-bbd7-b6cafcfeffcb

📥 Commits

Reviewing files that changed from the base of the PR and between 7d4b214 and 9120ee5.

📒 Files selected for processing (12)
  • .dockerignore
  • .github/workflows/docker.yml
  • CLAUDE.md
  • cli/internal/compose/compose.yml.tmpl
  • cli/testdata/compose_custom_ports.yml
  • cli/testdata/compose_default.yml
  • docker/backend/Dockerfile
  • docker/compose.yml
  • docker/sandbox/Dockerfile
  • docker/web/Dockerfile
  • docs/security.md
  • web/nginx.conf
📜 Review details
🔇 Additional comments (25)
.dockerignore (4)

26-27: Good consolidation of Astro ignore rules.

Ignoring site/ at the root is cleaner and less error-prone than maintaining multiple nested entries.


40-42: Good addition of Hypothesis cache exclusion.

Excluding .hypothesis/ helps keep Docker build context smaller and avoids test-state leakage into images.


61-63: scripts/ exclusion aligns with hardening goals.

This is a solid build-context reduction step and matches the documented PR intent.


78-84: Web test-infra ignore additions look correct.

Ignoring web/__tests__/ and web/vitest.config.* is appropriate for production Docker builds and keeps context focused on runtime assets.

web/nginx.conf (1)

14-18: LGTM!

The gzip_static on directive correctly enables serving pre-compressed .gz files produced by the Vite build stage, with automatic fallback to on-the-fly compression. Adding image/svg+xml to gzip_types is appropriate since SVGs compress well and are commonly used in web dashboards.

docker/backend/Dockerfile (2)

60-64: LGTM!

The consolidated RUN layer correctly combines symlink fix, directory creation, and ownership assignment into a single layer, reducing image size. Command chaining with && ensures proper failure propagation.


112-113: LGTM!

Increasing start_period from 15s to 30s provides appropriate headroom for Mem0/Qdrant initialization during container startup.

docker/web/Dockerfile (2)

16-18: LGTM!

The --ignore-scripts flag is a solid supply chain hardening measure that prevents execution of potentially malicious postinstall scripts. The comment clearly documents the trade-off and provides guidance for handling future dependencies that require postinstall hooks.


20-21: LGTM!

Pre-compressing assets at build time with gzip -9 -k is efficient — it offloads compression from runtime to build, pairs well with gzip_static on in nginx, and the 256-byte threshold appropriately skips tiny files where compression overhead exceeds savings.

docker/sandbox/Dockerfile (2)

5-10: LGTM!

OCI labels follow the standard image-spec annotations, providing consistent metadata across all project images.


15-23: LGTM!

Layer consolidation combines npm symlink, git installation, apt cleanup, and user setup into a single RUN, reducing image size. The sandbox user is correctly configured with --no-create-home and --shell /usr/sbin/nologin for security.

.github/workflows/docker.yml (3)

125-149: LGTM!

The consolidated Trivy evaluation logic correctly:

  • Handles empty results with a clean message
  • Formats a readable table using jq and column
  • Separates concern: CRITICAL fails the build, HIGH emits warnings only
  • Uses proper exit codes for CI integration

159-167: LGTM!

The CIS Docker Benchmark scan is well-structured:

  • IMAGE_REF env var avoids shell injection risks
  • command -v trivy provides a defensive fallback (though trivy-action should install it)
  • continue-on-error: true is appropriate for establishing an informational baseline

268-319: LGTM!

The web image scanning follows the same pattern as backend, ensuring consistent vulnerability evaluation and CIS compliance checking across both images.

CLAUDE.md (1)

260-260: LGTM!

Documentation accurately reflects the updated Docker workflow with CIS Docker Benchmark compliance scanning.

docs/security.md (2)

114-117: LGTM!

Documentation accurately reflects the compose.yml hardening changes: nodev added to tmpfs mounts (CIS 5.25), pids_limit per container (CIS 5.28), and resource limits with log rotation.


150-152: LGTM!

Documentation correctly adds CIS Docker Benchmark scanning as an informational check and updates the image push policy to reference "vulnerability scanners" (Trivy + Grype).

cli/testdata/compose_default.yml (3)

38-43: LGTM!

The healthcheck correctly uses Docker Compose array form starting with "CMD", and the start_period: 30s provides adequate initialization time for Mem0/Qdrant.


24-37: LGTM!

Backend hardening is properly configured:

  • tmpfs with nodev (CIS 5.25)
  • pids_limit: 256 (CIS 5.28)
  • Resource limits (4G memory, 2 CPUs)
  • JSON log rotation (10m × 3 files)

58-73: LGTM!

Web service hardening mirrors backend configuration with appropriate resource scaling (256M memory, 0.5 CPU, pids_limit 64) and consistent log rotation settings.

cli/internal/compose/compose.yml.tmpl (2)

21-46: Backend hardening and healthcheck update look correct.

The nodev tmpfs, pids_limit, resource caps, log rotation, and CMD-array healthcheck with longer startup headroom are consistent and production-sane.


62-105: Web and sandbox hardening changes are consistent and well-applied.

tmpfs hardening, pids_limit, resource ceilings, and log rotation are mirrored cleanly across services.

cli/testdata/compose_custom_ports.yml (1)

19-74: Golden file updates are aligned with the new hardening baseline.

The expected output reflects the revised CIS note, nodev tmpfs entries, pids_limit, resource limits, log rotation, and CMD-array healthcheck/start period as intended.

docker/compose.yml (2)

19-40: Backend service hardening is solid.

nodev tmpfs, pids_limit, resource limits, and log rotation are strong defense-in-depth additions and look correctly structured.


57-75: Web hardening and explicit network naming look good.

Web runtime constraints/logging are consistent, and naming the default network makes connectivity behavior explicit for this compose stack.

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Mar 15, 2026

Greptile Summary

This PR is a broad infrastructure hardening pass covering Docker healthcheck correctness, CI scan consolidation, container resource governance, static asset pre-compression, and supply-chain tightening — all changes are mirrored consistently across docker/compose.yml, the CLI Compose template, and the golden test files.

Key changes:

  • Healthcheck CMD prefix fix — adds the required "CMD" first element to all Docker Compose array-form test lists; without it Docker silently skips the healthcheck entirely
  • Trivy CI optimization — merges two sequential scan steps into one JSON-based run with parsed CRITICAL/HIGH counts, step-summary output, and artifact upload; saves ~25–30s per image
  • Container hardeningpids_limit, nodev on all tmpfs mounts, resource limits (4G/2CPU backend, 256M/0.5CPU web), log rotation, and a named synthorg-net network added uniformly
  • Gzip pre-compression — build-time gzip -9 -k in the web Dockerfile + gzip_static on in nginx eliminates runtime CPU cost for static assets
  • Supply-chainnpm ci --ignore-scripts in the web builder stage; sandbox and backend layers consolidated
  • One logic concern — the Grype scan step (and CIS benchmark) in both CI jobs now has an implicit if: success() dependency on Evaluate Trivy results. When Trivy finds CRITICAL CVEs and exits 1, Grype is silently skipped — breaking the "two independent scanners" guarantee. Adding if: always() to the Grype step restores independent coverage.

Confidence Score: 4/5

  • Safe to merge with one follow-up: restore Grype's independent execution path when Trivy finds CRITICAL vulnerabilities.
  • All hardening changes are correct and consistently mirrored across every compose variant and golden file. The healthcheck CMD prefix fix is a genuine bug fix. The single actionable concern — Grype being skipped on Trivy CRITICAL failure — reduces security observability but does not weaken the gate (the build still fails); it is easy to fix with one line.
  • .github/workflows/docker.yml — both build-backend and build-web jobs need if: always() on the Grype scan step.

Important Files Changed

Filename Overview
.github/workflows/docker.yml Trivy scans consolidated into a single JSON-based run with step-summary output and artifact upload — well-structured, but the Grype scan step now has an implicit if: success() dependency on Trivy evaluation, silently skipping the second independent scanner when CRITICAL CVEs are found.
docker/compose.yml Adds resource limits, pids_limit, nodev on all tmpfs mounts, log rotation via x-logging YAML anchor, explicit user fields, and a named network — all consistent with the CLI template hardening.
docker/web/Dockerfile Adds npm ci --ignore-scripts for supply-chain hardening and build-time gzip pre-compression; the find -exec {} ; form spawns one gzip process per file — using {} + would be more efficient.
web/nginx.conf Adds gzip_static on to serve pre-compressed assets, gzip_vary on for correct proxy/CDN cache behaviour, and image/svg+xml to gzip_types — all consistent with the Dockerfile pre-compression step.
cli/internal/compose/compose.yml.tmpl Mirrors all hardening from docker/compose.yml — CMD prefix fix, nodev, pids_limit, resource limits, log rotation, named network — and golden test files are updated accordingly.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: .github/workflows/docker.yml
Line: 167-173

Comment:
**Grype scan skipped when Trivy finds CRITICAL vulnerabilities**

In the old design, the `Trivy scan (high — warn only)` step had `continue-on-error: true`, which kept the job in a "success" state so that Grype always ran independently. In the new design, `Evaluate Trivy results` calls `exit 1` when CRITICAL vulnerabilities are found. Because `Grype scan` has no `if` condition it defaults to `if: success()` — so it is skipped entirely whenever Trivy flags a CRITICAL CVE.

This silently breaks the "two independent scanners" guarantee described in both `CLAUDE.md` and `docs/security.md`. If the Grype database contains a CRITICAL CVE that Trivy misses, that finding is only surfaced when Trivy's own scan is clean, which defeats the purpose of running two independent tools.

Adding `if: always()` to the Grype step (and, optionally, the CIS benchmark step) in both jobs restores independent coverage:

```suggestion
      - name: Grype scan
        if: always()
        uses: anchore/scan-action@7037fa011853d5a11690026fb85feee79f4c946c # v7.3.2
        with:
          image: ${{ steps.scan-ref.outputs.ref }}
          fail-build: true
          severity-cutoff: critical
          config: .github/.grype.yaml
```

The same applies to the `Grype scan` step in the `build-web` job (around line 337).

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: docker/web/Dockerfile
Line: 20-21

Comment:
**Gzip pre-compression spawns one process per file**

Using `\;` invokes `gzip` once per matched file. Since `gzip` accepts multiple file arguments natively, replacing `\;` with `+` lets `find` batch all paths into far fewer invocations — materially faster for any non-trivial `dist/` tree and avoids process-spawn overhead in the Docker build layer:

```suggestion
    && find /app/dist -type f \( -name '*.js' -o -name '*.css' -o -name '*.html' -o -name '*.svg' -o -name '*.json' \) -size +256c -exec gzip -9 -k {} +
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: c2eb02a

Comment on lines +161 to +167
- name: CIS Docker Benchmark (backend)
continue-on-error: true
env:
IMAGE_REF: ${{ steps.scan-ref.outputs.ref }}
run: |
command -v trivy >/dev/null 2>&1 || { echo "::warning::trivy not on PATH, skipping CIS scan"; exit 0; }
trivy image --compliance docker-cis-1.6.0 --format table "$IMAGE_REF"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CIS benchmark steps may silently skip on certain runner configurations

The step guards with command -v trivy and emits a ::warning:: if trivy isn't on PATH. Whether trivy is available depends on the aquasecurity/trivy-action implementation detail — recent versions install the binary directly on the runner host, but this isn't guaranteed across all runner environments or action versions.

Since the step is already continue-on-error: true (informational), the silent skip is low-risk. However, the same pattern also appears at the web job (line ~318). If you want certainty that the CIS scan always runs, consider using a second aquasecurity/trivy-action invocation with scan-type: image and appropriate flags instead of relying on the host trivy binary:

- name: CIS Docker Benchmark (backend)
  continue-on-error: true
  uses: aquasecurity/trivy-action@57a97c7e7821a5776cebc9bb87c984fa69cba8f1 # v0.35.0
  with:
    image-ref: ${{ steps.scan-ref.outputs.ref }}
    scan-type: image
    format: table
    exit-code: "0"
    compliance: docker-cis-1.6.0
Prompt To Fix With AI
This is a comment left during a code review.
Path: .github/workflows/docker.yml
Line: 161-167

Comment:
**CIS benchmark steps may silently skip on certain runner configurations**

The step guards with `command -v trivy` and emits a `::warning::` if trivy isn't on `PATH`. Whether trivy is available depends on the `aquasecurity/trivy-action` implementation detail — recent versions install the binary directly on the runner host, but this isn't guaranteed across all runner environments or action versions.

Since the step is already `continue-on-error: true` (informational), the silent skip is low-risk. However, the same pattern also appears at the web job (line ~318). If you want certainty that the CIS scan always runs, consider using a second `aquasecurity/trivy-action` invocation with `scan-type: image` and appropriate flags instead of relying on the host `trivy` binary:

```yaml
- name: CIS Docker Benchmark (backend)
  continue-on-error: true
  uses: aquasecurity/trivy-action@57a97c7e7821a5776cebc9bb87c984fa69cba8f1 # v0.35.0
  with:
    image-ref: ${{ steps.scan-ref.outputs.ref }}
    scan-type: image
    format: table
    exit-code: "0"
    compliance: docker-cis-1.6.0
```

How can I resolve this? If you propose a fix, please make it concise.

Repository owner deleted a comment from coderabbitai bot Mar 15, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/docker.yml (1)

268-319: 🧹 Nitpick | 🔵 Trivial

Web job Trivy/CIS changes mirror backend correctly.

The evaluation logic is consistent between jobs, which is good for maintainability.

Consider extracting the Trivy scan + evaluate + CIS steps into a reusable workflow to reduce duplication. This would make future changes easier to maintain across both images.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker.yml around lines 268 - 319, Create a reusable
workflow that encapsulates the "Trivy scan", "Evaluate Trivy results", "Grype
scan", and "CIS Docker Benchmark (web)" steps so both web and backend jobs can
call it; move those steps into a new workflow (e.g., trivy-grype-scan.yml) with
workflow_call inputs for image-ref (image_ref), trivyignores path, grype config
path, severity list, and a flag for CIS run, implement outputs for CRITICAL/HIGH
counts if needed, and preserve existing behavior (exit codes, continue-on-error
for CIS, jq evaluation logic in the Evaluate Trivy results step). Then replace
the duplicated step blocks in each job with a single uses:
./.github/workflows/trivy-grype-scan.yml@<branch> call, mapping inputs
(image-ref: ${{ steps.scan-ref.outputs.ref }}, trivyignores:
.github/.trivyignore.yaml, config: .github/.grype.yaml, etc.) so both pipelines
use the same centralized logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@web/nginx.conf`:
- Around line 16-19: The nginx gzip configuration is missing gzip_vary, so add
the directive gzip_vary on; alongside the existing gzip/gzip_static directives
(next to gzip_static, gzip, gzip_types, gzip_min_length) to ensure responses
include the Vary: Accept-Encoding header and avoid cache mismatches between
compressed and uncompressed assets.

---

Outside diff comments:
In @.github/workflows/docker.yml:
- Around line 268-319: Create a reusable workflow that encapsulates the "Trivy
scan", "Evaluate Trivy results", "Grype scan", and "CIS Docker Benchmark (web)"
steps so both web and backend jobs can call it; move those steps into a new
workflow (e.g., trivy-grype-scan.yml) with workflow_call inputs for image-ref
(image_ref), trivyignores path, grype config path, severity list, and a flag for
CIS run, implement outputs for CRITICAL/HIGH counts if needed, and preserve
existing behavior (exit codes, continue-on-error for CIS, jq evaluation logic in
the Evaluate Trivy results step). Then replace the duplicated step blocks in
each job with a single uses: ./.github/workflows/trivy-grype-scan.yml@<branch>
call, mapping inputs (image-ref: ${{ steps.scan-ref.outputs.ref }},
trivyignores: .github/.trivyignore.yaml, config: .github/.grype.yaml, etc.) so
both pipelines use the same centralized logic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8b7acb21-ed6f-46e0-9af9-fb7fb67500c8

📥 Commits

Reviewing files that changed from the base of the PR and between 7d4b214 and 9120ee5.

📒 Files selected for processing (12)
  • .dockerignore
  • .github/workflows/docker.yml
  • CLAUDE.md
  • cli/internal/compose/compose.yml.tmpl
  • cli/testdata/compose_custom_ports.yml
  • cli/testdata/compose_default.yml
  • docker/backend/Dockerfile
  • docker/compose.yml
  • docker/sandbox/Dockerfile
  • docker/web/Dockerfile
  • docs/security.md
  • web/nginx.conf
📜 Review details
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-15T01:53:24.085Z
Learning: When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes), fix them all. No deferring, no 'out of scope' skipping.
🔇 Additional comments (21)
docker/sandbox/Dockerfile (2)

5-10: OCI metadata block is clean and complete.

Good provenance labeling coverage for downstream registry and SBOM tooling.


15-23: Consolidated setup layer is solid.

Single-layer setup keeps the image lean while preserving hardening steps.

CLAUDE.md (1)

260-260: CI security scan description is clear and accurate.

The wording cleanly captures enforcement vs informational scan behavior.

docker/backend/Dockerfile (2)

60-64: Setup-stage RUN merge looks correct.

Symlink correction, data dir creation, and ownership handling are coherently applied in one layer.


112-112: Healthcheck startup window adjustment is a good reliability tweak.

This is a sensible buffer for cold starts without weakening checks.

docker/web/Dockerfile (2)

16-18: npm ci --ignore-scripts hardening is well-executed.

Good supply-chain posture, and the inline guidance for future exceptions is practical.


20-21: Build-time pre-compression pipeline is a strong performance addition.

The implementation is straightforward and aligns with runtime serving strategy.

docs/security.md (2)

114-117: CIS and runtime-limits documentation update is strong.

The additions clearly communicate the new hardening baseline.


150-153: Scanner policy wording is well-scoped.

Good distinction between informational CIS checks and blocking vulnerability gates.

cli/testdata/compose_custom_ports.yml (2)

19-44: Backend compose golden updates look correct.

The healthcheck CMD-array fix and hardening additions are represented consistently.


60-74: Web compose golden hardening block is consistent.

tmpfs/pids_limit/logging/resource updates are coherent with the intended baseline.

.dockerignore (1)

26-27: .dockerignore tightening is beneficial and safe.

Nice reduction in context noise for faster, cleaner, and lower-risk image builds.

Also applies to: 40-42, 61-63, 78-83

cli/internal/compose/compose.yml.tmpl (2)

21-46: LGTM! CIS hardening and healthcheck improvements are well-implemented.

The backend service hardening is comprehensive:

  • CIS 5.28 (pids_limit) added to comment and enforced
  • nodev added to tmpfs mounts
  • Resource limits properly set
  • Logging rotation configured
  • Healthcheck correctly uses CMD array form with increased start_period

48-106: LGTM! Web and sandbox hardening is appropriately configured.

The differentiated tmpfs options for web are correct—/var/cache/nginx and /var/run need write access for nginx operation while still benefiting from nodev. Resource limits and logging are consistent across all services.

.github/workflows/docker.yml (2)

113-149: LGTM! Trivy consolidation is well-designed.

The single JSON-based scan with post-processing is a smart optimization:

  • Clean separation between scan (always succeeds) and evaluation (fails on CRITICAL)
  • Table formatting provides readable output in logs
  • Warning annotations for HIGH vulnerabilities without failing the build

The $'\t' ANSI-C quoting works correctly since GitHub Actions uses bash by default on ubuntu runners.


159-167: LGTM! CIS Docker Benchmark check is properly configured.

Good defensive check for trivy availability. The continue-on-error: true is appropriate for initial rollout—allows establishing a baseline before enforcing compliance.

cli/testdata/compose_default.yml (2)

18-43: LGTM! Golden test file correctly reflects template changes.

The backend service configuration matches the template output with default values applied. The CMD array form healthcheck and all hardening options are correctly reflected.


58-73: LGTM! Web service golden test matches template.

Tmpfs options, pids_limit, resource limits, and logging configuration are consistent with the template.

docker/compose.yml (3)

19-40: LGTM! Backend hardening configuration is comprehensive.

The inline comment clarifying healthcheck behavior (Dockerfile-defined vs CLI template override) is helpful. Resource limits, pids_limit, and logging are consistent with other compose files.


56-75: LGTM! Web hardening and named network are well-configured.

The synthorg-net named network provides better isolation for local development. The differentiated tmpfs options for nginx paths are correctly preserved.


1-40: Verify USER directives in Dockerfiles and consider adding user directives for consistency.

The CLI template and testdata specify user: "65532:65532" for backend and user: "101:101" for web, but this compose file omits them. Confirm whether docker/backend/Dockerfile and docker/web/Dockerfile define USER directives. If they don't, add the corresponding user directives to compose.yml to prevent containers from running as root during local development.

- Add networks.default.name (synthorg-net) to CLI template + golden files
  to match docker/compose.yml (CodeRabbit outside-diff finding)
- Add user: directives to docker/compose.yml for defense-in-depth parity
  with CLI template (infra-agent finding)
- Use YAML anchor (x-logging) to deduplicate logging config in
  docker/compose.yml (Gemini suggestion)
- Add comment documenting port 8000 assumption in CLI healthcheck
  (infra-agent finding)
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 15, 2026 11:09 — with GitHub Actions Inactive
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docker/compose.yml`:
- Around line 60-64: Update the Docker Compose tmpfs entries so that the
'/var/cache/nginx' and '/var/run' mounts include the same mount options as
'/tmp' for consistency and hardening: add "noexec,nosuid" (so they become
"noexec,nosuid,nodev"). Locate the tmpfs block showing the three entries (the
lines containing '/tmp', '/var/cache/nginx', and '/var/run') and append the
missing options to the latter two entries.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 84e5080b-d1d3-4e95-85f4-b6362db71e4e

📥 Commits

Reviewing files that changed from the base of the PR and between 9120ee5 and 5b3202f.

📒 Files selected for processing (4)
  • cli/internal/compose/compose.yml.tmpl
  • cli/testdata/compose_custom_ports.yml
  • cli/testdata/compose_default.yml
  • docker/compose.yml
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: CLI Test (windows-latest)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Greptile Review
  • GitHub Check: Analyze (python)
🔇 Additional comments (13)
docker/compose.yml (3)

1-5: Good use of YAML anchor for logging configuration.

The x-logging anchor effectively deduplicates the logging configuration across services, following DRY principles.


25-43: Backend hardening looks solid.

The CIS Docker Benchmark hardening additions are well-implemented:

  • User 65532:65532 (distroless nonroot) is appropriate
  • pids_limit: 256 prevents fork bombs
  • Resource limits (4G memory, 2 CPUs) are reasonable for a backend service
  • nodev added to tmpfs mount options
  • Logging anchor reuse is clean

The comment clarifying reliance on the Dockerfile-level healthcheck is helpful for maintainability.


73-75: Named network improves container isolation and management.

Using an explicit named network synthorg-net instead of the auto-generated default network is good practice for:

  • Easier identification in multi-project environments
  • Consistent network naming across deployments
  • Better tooling/debugging support
cli/testdata/compose_default.yml (3)

38-46: Healthcheck comment and implementation are well-documented.

The comment clearly explains why port 8000 is hardcoded (exec form cannot expand env vars). The start_period: 30s increase accommodates slower container startups.

One observation: the Python one-liner healthcheck is quite dense. While functional, if this needs modification in the future, consider extracting it to a small script.


62-65: Web tmpfs options consistent with docker/compose.yml.

The nodev addition aligns with the CIS hardening goal. Same observation as docker/compose.yml regarding potential noexec,nosuid additions for /var/cache/nginx and /var/run.


33-37: Logging configuration correctly duplicated in generated output.

Unlike docker/compose.yml which uses a YAML anchor, the generated test data appropriately has the logging blocks expanded inline. This is expected behavior for CLI-generated compose files.

Also applies to: 72-76

cli/testdata/compose_custom_ports.yml (3)

19-47: Backend hardening consistent with default compose test data.

All CIS hardening elements (tmpfs with nodev, pids_limit, resource limits, logging, healthcheck) are correctly applied. The custom port mapping (9000:8000) correctly leaves the internal container port unchanged, so the healthcheck's hardcoded port 8000 remains valid.


63-77: Web service hardening consistent across test fixtures.

The tmpfs options, pids_limit, resource limits, and logging configuration match the default compose test data, ensuring template consistency is testable.


79-81: Network configuration consistent.

The synthorg-net named network is present, matching other compose files.

cli/internal/compose/compose.yml.tmpl (4)

21-49: Backend template correctly implements all hardening measures.

The template applies:

  • CIS benchmark reference comment (5.3, 5.12, 5.25, 5.28)
  • nodev on tmpfs
  • pids_limit: 256
  • Resource limits (4G/2.0 CPU)
  • json-file logging with rotation
  • CMD-form healthcheck with documented port assumption

The healthcheck start_period increase to 30s is appropriate for cold starts with potential database initialization.


65-79: Web service template consistent with docker/compose.yml.

The tmpfs mounts, pids_limit, resource limits, and logging are correctly templated. The web service intentionally omits a healthcheck, relying on the backend's health status via depends_on: condition: service_healthy.


82-108: Sandbox service hardening is appropriately configured.

The sandbox service has:

  • Read-only Docker socket mount (:ro) - prevents container escape via socket manipulation
  • pids_limit: 128 - appropriate for orchestrating child containers
  • 256M memory limit with 0.5 CPU - reasonable for a management sidecar
  • nodev on tmpfs

The larger tmpfs size (128m vs web's 16m) is justified given the sandbox may need to stage files for container operations.


111-114: Network section correctly placed outside conditional block.

The synthorg-net network definition is at the top level, ensuring it's always present regardless of the .Sandbox conditional. This is correct.

…vary

- Upload trivy-backend.json and trivy-web.json as workflow artifacts
  (if: always(), retention-days: 30) so scan results survive failed builds
- Add noexec,nosuid to /var/cache/nginx and /var/run tmpfs mounts for
  consistent hardening across all tmpfs entries
- Add gzip_vary on to nginx.conf for correct Vary: Accept-Encoding header
- Mirror tmpfs changes in CLI template + golden files
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 15, 2026 11:26 — with GitHub Actions Inactive
- Guard against missing trivy-{backend,web}.json with clear error message
  before jq parsing (prevents cryptic "No such file" on Trivy step failure)
- Write vulnerability table and counts to GITHUB_STEP_SUMMARY so scan
  results appear on the job summary page without digging into raw logs
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 15, 2026 11:41 — with GitHub Actions Inactive
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/docker.yml (1)

284-351: 🧹 Nitpick | 🔵 Trivial

Consider refactoring to reduce duplication.

The web section (lines 284-351) is nearly identical to the backend section (lines 113-183). This duplication increases maintenance burden—any bug fix or improvement must be applied twice.

Options to consider:

  1. Composite action: Extract the Trivy scan + evaluate + upload logic into a local composite action under .github/actions/trivy-scan/action.yml
  2. Reusable workflow: Create .github/workflows/trivy-scan.yml with workflow_call trigger
  3. Matrix strategy: Use a matrix with context: [backend, web] if the jobs can be unified

This isn't blocking, but would improve maintainability as the scanning logic evolves.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker.yml around lines 284 - 351, The Trivy/Grype/CIS
scan steps for the "web" job are duplicated from "backend"; extract the shared
logic into a reusable unit (either a local composite action
.github/actions/trivy-scan or a callable workflow
.github/workflows/trivy-scan.yml) that encapsulates the steps named "Trivy
scan", "Evaluate Trivy results", "Upload Trivy report (web)", "Grype scan", and
"CIS Docker Benchmark (web)"; accept inputs for the image ref, report
name/suffix, severity/config values, and a job-context label (e.g., web/backend)
so existing jobs call this unit with image-ref=${{ steps.scan-ref.outputs.ref }}
and a report name like trivy-${{ inputs.context }}-report, then replace the
duplicated blocks in both jobs with a single call to the composite action or
workflow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/docker.yml:
- Around line 302-303: The jq numeric-validation is missing for the scan totals
and severity counts: ensure TOTAL, CRITICAL and HIGH are produced as numbers
with a safe default (0) by updating their jq invocations to coerce or fallback
to 0 (e.g. use | tonumber // 0 or ... | length // 0) so the bash comparisons
won't fail on null/empty; update the assignments that set TOTAL, CRITICAL and
HIGH (the variables named TOTAL, CRITICAL, HIGH in the script) to use this
pattern.
- Around line 132-133: TOTAL, CRITICAL and HIGH may be non-numeric if jq fails;
update the shell logic that assigns these variables (the lines using TOTAL=$(jq
'[.Results[]?.Vulnerabilities[]?] | length' trivy-backend.json) and the
analogous CRITICAL/HIGH assignments) to validate jq output and coerce a safe
numeric fallback (e.g., check the variable with a numeric regex or test before
using -eq and set to 0 on failure); ensure subsequent comparisons use the
validated numeric value so [ "$TOTAL" -eq 0 ] cannot error if jq returns empty
or an error string.

---

Outside diff comments:
In @.github/workflows/docker.yml:
- Around line 284-351: The Trivy/Grype/CIS scan steps for the "web" job are
duplicated from "backend"; extract the shared logic into a reusable unit (either
a local composite action .github/actions/trivy-scan or a callable workflow
.github/workflows/trivy-scan.yml) that encapsulates the steps named "Trivy
scan", "Evaluate Trivy results", "Upload Trivy report (web)", "Grype scan", and
"CIS Docker Benchmark (web)"; accept inputs for the image ref, report
name/suffix, severity/config values, and a job-context label (e.g., web/backend)
so existing jobs call this unit with image-ref=${{ steps.scan-ref.outputs.ref }}
and a report name like trivy-${{ inputs.context }}-report, then replace the
duplicated blocks in both jobs with a single call to the composite action or
workflow.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8d8f25ec-2052-4aeb-98f4-b2509882a9a6

📥 Commits

Reviewing files that changed from the base of the PR and between afc8e69 and c2eb02a.

📒 Files selected for processing (1)
  • .github/workflows/docker.yml
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: CLI Test (windows-latest)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web
  • GitHub Check: Greptile Review
  • GitHub Check: Analyze (python)
🔇 Additional comments (3)
.github/workflows/docker.yml (3)

113-123: LGTM!

The unified JSON-based Trivy scan configuration is well-designed. Using exit-code: "0" to defer failure decisions to the evaluate step allows for nuanced handling (fail on CRITICAL only, warn on HIGH).


159-165: LGTM!

Using if: always() ensures the Trivy report is uploaded for debugging even when the evaluate step fails. Good practice for audit trails.


175-183: The CIS scan step will have trivy available on PATH. The aquasecurity/trivy-action step at line 116 runs first and, by default, invokes setup-trivy which explicitly adds the Trivy binary to $GITHUB_PATH—a persisted GitHub Actions variable that remains available across all subsequent steps in the workflow. The concern about trivy not being on PATH is not warranted. That said, the defensive command -v trivy fallback with warning is harmless additional safety.

			> Likely an incorrect or invalid review comment.

Comment on lines +132 to +133
TOTAL=$(jq '[.Results[]?.Vulnerabilities[]?] | length' trivy-backend.json)
if [ "$TOTAL" -eq 0 ]; then
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add validation for jq output before arithmetic comparison.

If jq fails (e.g., malformed JSON from Trivy), TOTAL will be empty or contain an error message, causing [ "$TOTAL" -eq 0 ] to fail with "integer expression expected". The same applies to CRITICAL and HIGH variables on lines 147-148.

🛡️ Proposed fix to add validation
-          TOTAL=$(jq '[.Results[]?.Vulnerabilities[]?] | length' trivy-backend.json)
+          TOTAL=$(jq '[.Results[]?.Vulnerabilities[]?] | length' trivy-backend.json) || {
+            echo "::error::Failed to parse trivy-backend.json"
+            exit 1
+          }
+          if ! [[ "$TOTAL" =~ ^[0-9]+$ ]]; then
+            echo "::error::Invalid vulnerability count: $TOTAL"
+            exit 1
+          fi

Apply the same pattern to CRITICAL and HIGH variable assignments on lines 147-148.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
TOTAL=$(jq '[.Results[]?.Vulnerabilities[]?] | length' trivy-backend.json)
if [ "$TOTAL" -eq 0 ]; then
TOTAL=$(jq '[.Results[]?.Vulnerabilities[]?] | length' trivy-backend.json) || {
echo "::error::Failed to parse trivy-backend.json"
exit 1
}
if ! [[ "$TOTAL" =~ ^[0-9]+$ ]]; then
echo "::error::Invalid vulnerability count: $TOTAL"
exit 1
fi
if [ "$TOTAL" -eq 0 ]; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker.yml around lines 132 - 133, TOTAL, CRITICAL and
HIGH may be non-numeric if jq fails; update the shell logic that assigns these
variables (the lines using TOTAL=$(jq '[.Results[]?.Vulnerabilities[]?] |
length' trivy-backend.json) and the analogous CRITICAL/HIGH assignments) to
validate jq output and coerce a safe numeric fallback (e.g., check the variable
with a numeric regex or test before using -eq and set to 0 on failure); ensure
subsequent comparisons use the validated numeric value so [ "$TOTAL" -eq 0 ]
cannot error if jq returns empty or an error string.

Comment on lines +302 to +303
TOTAL=$(jq '[.Results[]?.Vulnerabilities[]?] | length' trivy-web.json)
if [ "$TOTAL" -eq 0 ]; then
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Same jq validation issue as backend section.

Apply the same numeric validation fix as recommended for the backend section (lines 132-133). The CRITICAL and HIGH variables on lines 317-318 also need validation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/docker.yml around lines 302 - 303, The jq
numeric-validation is missing for the scan totals and severity counts: ensure
TOTAL, CRITICAL and HIGH are produced as numbers with a safe default (0) by
updating their jq invocations to coerce or fallback to 0 (e.g. use | tonumber //
0 or ... | length // 0) so the bash comparisons won't fail on null/empty; update
the assignments that set TOTAL, CRITICAL and HIGH (the variables named TOTAL,
CRITICAL, HIGH in the script) to use this pattern.

@Aureliolo Aureliolo merged commit 4d32bca into main Mar 15, 2026
47 checks passed
@Aureliolo Aureliolo deleted the fix/docker-healthcheck-and-ci branch March 15, 2026 11:47
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview March 15, 2026 11:47 — with GitHub Actions Inactive
Comment on lines 167 to 173
@@ -137,6 +172,16 @@ jobs:
severity-cutoff: critical
config: .github/.grype.yaml
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grype scan skipped when Trivy finds CRITICAL vulnerabilities

In the old design, the Trivy scan (high — warn only) step had continue-on-error: true, which kept the job in a "success" state so that Grype always ran independently. In the new design, Evaluate Trivy results calls exit 1 when CRITICAL vulnerabilities are found. Because Grype scan has no if condition it defaults to if: success() — so it is skipped entirely whenever Trivy flags a CRITICAL CVE.

This silently breaks the "two independent scanners" guarantee described in both CLAUDE.md and docs/security.md. If the Grype database contains a CRITICAL CVE that Trivy misses, that finding is only surfaced when Trivy's own scan is clean, which defeats the purpose of running two independent tools.

Adding if: always() to the Grype step (and, optionally, the CIS benchmark step) in both jobs restores independent coverage:

Suggested change
- name: Grype scan
if: always()
uses: anchore/scan-action@7037fa011853d5a11690026fb85feee79f4c946c # v7.3.2
with:
image: ${{ steps.scan-ref.outputs.ref }}
fail-build: true
severity-cutoff: critical
config: .github/.grype.yaml

The same applies to the Grype scan step in the build-web job (around line 337).

Prompt To Fix With AI
This is a comment left during a code review.
Path: .github/workflows/docker.yml
Line: 167-173

Comment:
**Grype scan skipped when Trivy finds CRITICAL vulnerabilities**

In the old design, the `Trivy scan (high — warn only)` step had `continue-on-error: true`, which kept the job in a "success" state so that Grype always ran independently. In the new design, `Evaluate Trivy results` calls `exit 1` when CRITICAL vulnerabilities are found. Because `Grype scan` has no `if` condition it defaults to `if: success()` — so it is skipped entirely whenever Trivy flags a CRITICAL CVE.

This silently breaks the "two independent scanners" guarantee described in both `CLAUDE.md` and `docs/security.md`. If the Grype database contains a CRITICAL CVE that Trivy misses, that finding is only surfaced when Trivy's own scan is clean, which defeats the purpose of running two independent tools.

Adding `if: always()` to the Grype step (and, optionally, the CIS benchmark step) in both jobs restores independent coverage:

```suggestion
      - name: Grype scan
        if: always()
        uses: anchore/scan-action@7037fa011853d5a11690026fb85feee79f4c946c # v7.3.2
        with:
          image: ${{ steps.scan-ref.outputs.ref }}
          fail-build: true
          severity-cutoff: critical
          config: .github/.grype.yaml
```

The same applies to the `Grype scan` step in the `build-web` job (around line 337).

How can I resolve this? If you propose a fix, please make it concise.

Aureliolo added a commit that referenced this pull request Mar 15, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.2.4](v0.2.3...v0.2.4)
(2026-03-15)


### Bug Fixes

* attach cosign signatures and provenance bundle to release assets
([#438](#438))
([f191a4d](f191a4d))
* create git tag explicitly for draft releases
([#432](#432))
([1f5120e](1f5120e))
* docker healthcheck, CI optimization, and container hardening
([#436](#436))
([4d32bca](4d32bca))
* ensure security headers on all HTTP responses
([#437](#437))
([837f2fc](837f2fc))
* make install scripts usable immediately without terminal restart
([#433](#433))
([b45533c](b45533c))
* migrate pids_limit to deploy.resources.limits.pids
([#439](#439))
([66b94fd](66b94fd))


### Refactoring

* redesign release notes layout
([#434](#434))
([239aaf7](239aaf7))


### Maintenance

* **site:** replace hero CTA with license link and scroll arrow
([#440](#440))
([56af41c](56af41c))
* **web:** adopt @vue/tsconfig preset
([#435](#435))
([7d4b214](7d4b214))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
This was referenced Mar 15, 2026
Aureliolo added a commit that referenced this pull request Mar 15, 2026
🤖 I have created a release *beep* *boop*
---


##
[0.2.4](v0.2.3...v0.2.4)
(2026-03-15)


### Bug Fixes

* attach cosign signatures and provenance bundle to release assets
([#438](#438))
([f191a4d](f191a4d))
* create git tag explicitly for draft releases
([#432](#432))
([1f5120e](1f5120e))
* docker healthcheck, CI optimization, and container hardening
([#436](#436))
([4d32bca](4d32bca))
* ensure security headers on all HTTP responses
([#437](#437))
([837f2fc](837f2fc))
* make install scripts usable immediately without terminal restart
([#433](#433))
([b45533c](b45533c))
* migrate pids_limit to deploy.resources.limits.pids
([#439](#439))
([66b94fd](66b94fd))
* use cosign --bundle flag for checksums signing
([#443](#443))
([19735b9](19735b9))


### Refactoring

* redesign release notes layout
([#434](#434))
([239aaf7](239aaf7))


### Maintenance

* **main:** release 0.2.4
([#431](#431))
([63b03c4](63b03c4))
* remove stale v0.2.4 changelog section from failed release
([#446](#446))
([769de10](769de10))
* reset version to 0.2.3 for re-release
([#444](#444))
([8579993](8579993))
* **site:** replace hero CTA with license link and scroll arrow
([#440](#440))
([56af41c](56af41c))
* **web:** adopt @vue/tsconfig preset
([#435](#435))
([7d4b214](7d4b214))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
This was referenced Mar 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant