Skip to content

fix(docker): entrypoint integrity check for 0-byte server.js (#3542)#3543

Merged
Yeraze merged 2 commits into
mainfrom
fix/3542-entrypoint-integrity-check
Jun 18, 2026
Merged

fix(docker): entrypoint integrity check for 0-byte server.js (#3542)#3543
Yeraze merged 2 commits into
mainfrom
fix/3542-entrypoint-integrity-check

Conversation

@Yeraze

@Yeraze Yeraze commented Jun 18, 2026

Copy link
Copy Markdown
Owner

Summary

Fixes #3542. A corrupt or incomplete docker compose pull can leave dist/server/server.js as a 0-byte file. Node then exits 0 immediately, supervisord retries until FATAL, and the container crash-loops with no actionable message beyond exited: meshmonitor (exit status 0; not expected).

This adds the recommended entrypoint integrity check: before handing off to supervisord, verify /app/dist/server/server.js exists and is non-empty. If not, print a clear error with the remediation and exit non-zero so the real cause is visible in docker compose logs.

❌ FATAL: /app/dist/server/server.js is missing or empty — the image may be corrupt or incompletely pulled.
   Fix: docker rmi ghcr.io/yeraze/meshmonitor && docker compose pull && docker compose up -d

Why this approach

Per the issue's analysis, the entrypoint check is the lowest-friction option — actionable error, zero ongoing maintenance (vs. SHA digest pinning) and it explains why (vs. a bare healthcheck).

Details

  • Added to docker/docker-entrypoint.sh immediately before the supervisord exec, so it runs ahead of both the root and non-root execution paths.
  • Confirmed entry path: npm startnode dist/server/server.js (working dir /app).
  • POSIX sh ([ ! -s … ]); sh -n syntax-clean.

Testing

  • sh -n passes.
  • Logic verified both ways: 0-byte file → detected (exit 1); non-empty file → passes with byte count.
  • On a healthy image the check is a no-op (✓ Server bundle present (N bytes)); CI's system tests exercise normal container startup.

🤖 Generated with Claude Code

A corrupt or incomplete image pull can leave dist/server/server.js as a 0-byte
file. Node exits 0 immediately, supervisord retries to FATAL, and the container
crash-loops with no actionable message beyond "exited: meshmonitor (exit status
0; not expected)".

Add an entrypoint integrity check before the supervisord handoff: if
/app/dist/server/server.js is missing or empty, print a clear error with the
remediation (docker rmi + pull) and exit non-zero so the real cause shows up in
`docker compose logs`. Runs ahead of both the root and non-root exec paths.

Fixes #3542

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown

Claude finished @Yeraze's task —— View job


PR Review: fix(docker): entrypoint integrity check for 0-byte server.js

  • Gather context
  • Read changed files
  • Analyze code quality and best practices
  • Check for bugs or issues
  • Review security concerns
  • Assess test coverage
  • Post review feedback

Summary

This is a clean, focused fix. The logic is sound and the implementation follows the existing script's POSIX sh idioms. A few minor observations below.


Code Review

docker/docker-entrypoint.sh (lines 159–171)

SERVER_ENTRY="/app/dist/server/server.js"
if [ ! -s "$SERVER_ENTRY" ]; then
    echo "❌ FATAL: $SERVER_ENTRY is missing or empty — the image may be corrupt or incompletely pulled." >&2
    echo "   Fix: docker rmi ghcr.io/yeraze/meshmonitor && docker compose pull && docker compose up -d" >&2
    exit 1
fi
echo "✓ Server bundle present ($(wc -c < "$SERVER_ENTRY") bytes)"

Strengths:

  • [ ! -s … ] is the right POSIX test — it covers both "missing" and "empty (0 bytes)" in a single operator.
  • Writing to stderr (>&2) is correct for error messages.
  • Placement just before the supervisord exec paths means it runs regardless of root vs. non-root.
  • The success log line with byte count is a nice touch for debugging healthy startups.

Minor observations:

  1. Placement relative to chown: The check runs after chown -R node:node /data /app/dist (line 122). Since server.js is in /app/dist, if it is 0 bytes, chown runs on it silently first. This isn't a problem functionally (chown of a 0-byte file is harmless), but logically the integrity check is slightly more intuitive placed before the chown block. Low impact — this is style rather than a bug.

  2. Hardcoded image name in remediation message: Line 168 references ghcr.io/yeraze/meshmonitor directly. Users deploying custom builds or self-hosted registries will see a misleading command. Consider a generic <image> placeholder, or use an environment variable if one is already available:

    echo "   Fix: docker rmi <image> && docker compose pull && docker compose up -d" >&2

    Or simply omit the image name:

    echo "   Fix: docker compose pull && docker compose up -d" >&2
  3. wc -c subprocess on success path: $(wc -c < "$SERVER_ENTRY") forks a subprocess on every healthy startup to read the entire bundle for byte count. For a large production build this is a negligible cost (~milliseconds), but strictly speaking [ -s … ] already proved non-empty — the byte count is informational. Not a concern in practice; just noting it.

  4. set -e interaction: The script has set -e at line 2. The exit 1 is explicit, so behaviour is clear. No issue here — just confirming the exit 1 is redundant with set -e only if the if block itself were a failing command, which it isn't. The explicit exit 1 is the right approach for clarity.

  5. No check for the client bundle: If someone wants belt-and-suspenders coverage, checking for dist/client/index.html as well could catch a partial pull that corrupted only the client side. This is out of scope for this PR and genuinely optional — the server-side file is the critical path since Node exits immediately on a 0-byte file.


Verdict

Approve with one optional suggestion. The fix is correct and production-ready. The only change worth considering is swapping the hardcoded image name in the remediation message for a generic placeholder to avoid confusing users on custom builds (observation #2 above). Everything else is minor style/trivia.


…3542)

The remediation message hardcoded ghcr.io/yeraze/meshmonitor, which would be
misleading for custom builds or self-hosted registries. Use a generic
<meshmonitor image> placeholder instead (Claude review feedback).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Yeraze

Yeraze commented Jun 18, 2026

Copy link
Copy Markdown
Owner Author

Thanks for the review! Addressed the substantive suggestion in 93a8862e:

Declining the rest as noted (all flagged minor/optional by the review):

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown

Claude encountered an error —— View job


I'll analyze this and get back to you.

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown

Claude finished @Yeraze's task —— View job


PR Review: fix(docker): entrypoint integrity check for 0-byte server.js

  • Gather context
  • Read changed files
  • Analyze code quality and best practices
  • Post review feedback

Summary

This is a clean, well-scoped fix. The updated remediation message (commit 93a8862) correctly addresses the previous concern about the hardcoded image name — the new wording with <meshmonitor image> and "e.g." is clear and won't mislead users on custom builds.


Code Review (lines 159–172)

# Image integrity check (issue #3542): a corrupt or incomplete image pull can
# leave dist/server/server.js as a 0-byte file. Node then exits 0 immediately,
# supervisord retries to FATAL, and the container crash-loops with no actionable
# message beyond "exited: meshmonitor (exit status 0; not expected)". Verify the
# server bundle is present and non-empty here so the real cause is visible in
# `docker compose logs` instead.
SERVER_ENTRY="/app/dist/server/server.js"
if [ ! -s "$SERVER_ENTRY" ]; then
    echo "❌ FATAL: $SERVER_ENTRY is missing or empty — the image may be corrupt or incompletely pulled." >&2
    echo "   Fix: remove the cached image and re-pull, e.g.:" >&2
    echo "        docker rmi <meshmonitor image> && docker compose pull && docker compose up -d" >&2
    exit 1
fi
echo "✓ Server bundle present ($(wc -c < "$SERVER_ENTRY") bytes)"

Strengths:

  • [ ! -s … ] is the correct POSIX test — covers both "missing" and "0-byte" in one operator. No race condition since this is a single sequential check before exec.
  • Error output goes to stderr (>&2), correct for error messages.
  • The comment block is thorough — explains the root cause, the failure mode, and links the issue number.
  • Generalized remediation message with <meshmonitor image> placeholder is clear and avoids hardcoded registry assumptions.
  • Placement (before both exec paths for non-root and root) ensures the check always runs.
  • Healthy path log line with byte count aids debugging on good images.

No blocking issues. The implementation is correct, POSIX-compliant, and production-ready.

One trivial style note (no action needed): the wc -c subshell on line 172 reads the entire bundle file on every healthy startup. On a large production build this is a negligible cost, but if startup time ever becomes a concern, stat -c%s "$SERVER_ENTRY" is a lighter alternative that reads metadata only. This is purely informational — not a recommendation to change.


Verdict

LGTM. The fix is correct, the comment fully explains the root cause, the remediation message is now generic, and the previous review's suggestions have been addressed appropriately. Ready to merge.

@Yeraze Yeraze merged commit 6255fd0 into main Jun 18, 2026
14 of 16 checks passed
@Yeraze Yeraze deleted the fix/3542-entrypoint-integrity-check branch June 18, 2026 20:22
Yeraze added a commit that referenced this pull request Jun 18, 2026
…#152/153/155/156) (#3544)

* fix(security): use RE2 for user-supplied regexes to eliminate ReDoS (#3543)

Resolves the two high-severity CodeQL js/regex-injection alerts (#153, #155) by
compiling all user/admin-supplied regular expressions with RE2, a linear-time
engine immune to catastrophic backtracking. The previous heuristic guards
(length caps + nested-quantifier checks) reduced but could not eliminate the
risk, and CodeQL (correctly) kept flagging the user→RegExp dataflow.

- New src/utils/safeRegex.ts: compileUserRegex(pattern, flags) wraps RE2 and
  returns a RegExp-compatible matcher (test/exec/match/replace). RE2 rejects
  backreferences and lookaround — the intended trade-off — so callers keep their
  existing try/catch to treat unsupported patterns as "invalid regex".
- Convert every user-supplied regex site, validation AND execution:
  - server.ts: remote(-localstats) name-filter validation (2 sites).
  - services/database.ts: the name-filter execution sites that actually match
    node names (traceroute + localstats union filters) — the real ReDoS surface
    CodeQL didn't reach through the settings store.
  - routes/scriptRoutes.ts + meshtasticManager.ts: auto-responder trigger and
    auto-acknowledge pattern matching.
- Dockerfile: add build-base + python3 to the builder stage so re2 (native) can
  compile from source where no prebuilt exists (Alpine/musl arm). Builder stage
  only — not in the runtime image.

Tests: new safeRegex suite asserts linear-time behaviour on the classic (a+)+$
ReDoS pattern and rejection of backreference/lookaround. Full suite green.

The two non-regex alerts are handled separately: #152 (frontend routing, not a
security gate) dismissed as false-positive; #156 (authenticated admin script
import, path-sanitized) dismissed as by-design.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* address review: convert remaining regex sites to RE2; harden tests/docs (#3544)

Claude review feedback on the RE2 security PR:

1. autoAckRegex.ts:69 — convert the MeshCore auto-ack pattern compile to
   compileUserRegex (the charset allowlist/length/shape guards stay as a fast
   pre-rejection). Residual ReDoS surface CodeQL may not flag through the
   barrier, now closed for consistency.
2. settingsRoutes.ts:215 — convert the regex-validation compile to
   compileUserRegex so stored patterns are guaranteed RE2-compatible.
3. safeRegex.ts — document that the result is not `instanceof RegExp` and that
   length bounding stays the caller's responsibility; add a test asserting
   `.source`/`.flags` behave.
4. safeRegex.test.ts — raise the ReDoS timing threshold to 1000ms and use a
   larger input, so the assertion can't flake on loaded CI runners.
5. Dockerfile — clarify the builder comment (build-base is the new compiler;
   python3 is node-gyp's other requirement).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Yeraze Yeraze mentioned this pull request Jun 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Entrypoint integrity check: detect 0-byte server.js before supervisord starts

1 participant