Summary
test/e2e/test-runtime-overrides.sh test 14 ("Config unchanged after rejected override") crashes the container and aborts the test script. Introduced by #2659.
Root Cause
PR #2659 tightened apply_model_override() to require NEMOCLAW_MODEL_OVERRIDE or NEMOCLAW_INFERENCE_API_OVERRIDE to be set before triggering. Test 14 was updated to pass NEMOCLAW_MODEL_OVERRIDE=test alongside NEMOCLAW_CONTEXT_WINDOW=notanumber.
The validation logic in scripts/nemoclaw-start.sh executes return 1 when NEMOCLAW_CONTEXT_WINDOW is not a valid integer. Under set -euo pipefail, that non-zero return kills the entrypoint before CMD runs. The container exits non-zero, and the test script (also under set -euo pipefail) aborts at the CFG=$(run_override ...) line without printing PASS/FAIL, because a failed command substitution makes the assignment itself fail.
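The failure mode can be reproduced in isolation. The following is a minimal sketch, not the actual entrypoint: validate_context_window and the log text are hypothetical stand-ins for the validation inside apply_model_override(), and the "CMD started" line stands in for the container's CMD.

```shell
# Hypothetical reduction of the entrypoint. validate_context_window is an
# illustrative stand-in, not the real apply_model_override().
entrypoint='
set -euo pipefail
validate_context_window() {
  case "$1" in
    ""|*[!0-9]*)
      echo "[SECURITY] rejected context window: $1" >&2
      return 1
      ;;
  esac
}
validate_context_window "notanumber"
echo "CMD started"
'

# Run it in a child bash so the abort can be observed from outside.
# The "|| status=$?" guard is what the failing test line lacks: without it,
# a caller under set -e would abort on this assignment.
status=0
output=$(bash -c "$entrypoint" 2>/dev/null) || status=$?
echo "exit status: $status, stdout: '${output}'"
```

Under these assumptions the child shell exits with status 1 and never reaches the echo that represents CMD, which mirrors the container exiting before the test can read the config.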
Expected Behavior
Invalid overrides should be rejected gracefully: log a security warning to stderr, skip the override, and continue starting the container. The config should remain unchanged (build-time defaults preserved).
Actual Behavior
Container exits immediately with code 1. Test cannot read config. Test script aborts.
Affected
- Nightly: runtime-overrides-e2e job on main (run 25118534977)
- Sparky: runtime-overrides dispatch (run 25118528500)
Fix
Change validation failures in apply_model_override() from return 1 to return 0 for non-security-critical rejections (malformed integers, invalid booleans, invalid API types). The security message is still logged to stderr, but the container starts normally with the config unchanged.
Symlink and control-character rejections should remain return 1 (those indicate tampering).
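One way the proposed split could look is sketched below. This is an assumption-laden illustration rather than the real function: apply_override_sketch, its single positional argument, and the log messages are hypothetical, and only the control-character and malformed-integer cases are shown.

```shell
# Hypothetical sketch of the proposed behavior. apply_override_sketch is
# illustrative only and does not mirror the real apply_model_override().
entrypoint='
set -euo pipefail
apply_override_sketch() {
  local value="$1"
  # Control characters suggest tampering: keep the hard failure.
  case "$value" in
    *[[:cntrl:]]*)
      echo "[SECURITY] control character in override" >&2
      return 1
      ;;
  esac
  # Malformed integer: log, skip the override, let startup continue.
  case "$value" in
    ""|*[!0-9]*)
      echo "[SECURITY] ignoring non-integer context window" >&2
      return 0
      ;;
  esac
  echo "context window set to $value"
}
apply_override_sketch "$1"
echo "CMD started"
'

# Malformed integer: rejected softly, startup continues.
soft_status=0
soft_out=$(bash -c "$entrypoint" _ notanumber 2>/dev/null) || soft_status=$?

# Value containing a control character (octal 001): still a hard failure.
hard_status=0
hard_out=$(bash -c "$entrypoint" _ "$(printf 'bad\001value')" 2>/dev/null) || hard_status=$?

echo "malformed integer: status=$soft_status stdout='${soft_out}'"
echo "control character: status=$hard_status stdout='${hard_out}'"
```

In this sketch the malformed integer leaves the exit status at 0 and the stand-in for CMD still runs, while the control-character case keeps exiting with status 1 before CMD.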
Related