Description
Problem Statement
The openclaw-inference-switch-e2e nightly job fails because nemoclaw inference set does not apply model-specific compatibility flags when switching to z-ai/glm-5.1. The inference.local PONG test passes (direct curl with max_tokens: 100), but the openclaw agent turn times out after 120 seconds with empty output (exit code 124).
The root cause is that patchOpenClawInferenceConfig() in inference-set.ts clears existing model compat (delete firstExistingModel.compat) and never consults the nemoclaw-blueprint/model-specific-setup/ registry. For nvidia-prod models, getSandboxInferenceConfig() returns inferenceCompat: null, so z-ai/glm-5.1 ends up without the maxTokensField: "max_tokens" flag it needs. OpenClaw then sends max_completion_tokens, a parameter the NVIDIA-proxied GLM endpoint does not support, and the agent turn hangs until the 120-second timeout ends it.
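To make the failure mode concrete, here is a minimal sketch of how a compat flag could select the token-limit field in a chat request body. It is not OpenClaw's actual request builder: buildChatRequestBody and the ModelCompat type are hypothetical, and the fallback to max_completion_tokens is an assumption inferred from the behaviour described above.

```typescript
// Hypothetical type: only maxTokensField is named in this report.
type ModelCompat = { maxTokensField?: "max_tokens" | "max_completion_tokens" };

// Assumed default, inferred from the observed behaviour: with no compat
// entry, the client falls back to max_completion_tokens.
function buildChatRequestBody(
  model: string,
  prompt: string,
  limit: number,
  compat: ModelCompat | null,
): Record<string, unknown> {
  const tokenField = compat?.maxTokensField ?? "max_completion_tokens";
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    [tokenField]: limit,
  };
}

// compat cleared by the switch: the body carries max_completion_tokens.
const withoutCompat = buildChatRequestBody("z-ai/glm-5.1", "ping", 100, null);

// compat restored from a manifest: the body carries max_tokens, matching
// the direct curl request that passes the PONG test.
const withCompat = buildChatRequestBody("z-ai/glm-5.1", "ping", 100, {
  maxTokensField: "max_tokens",
});

console.log(Object.keys(withoutCompat));
console.log(Object.keys(withCompat));
```

Under these assumptions the two bodies differ only in the name of the token-limit field, which is consistent with the direct curl succeeding while the agent turn does not.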
Proposed Design
Add a glm-5.1-managed-inference.json manifest under nemoclaw-blueprint/model-specific-setup/openclaw/ declaring maxTokensField: "max_tokens" and requiresStringContent: true for z-ai/glm-5.1. Teach inference-set.ts to read matching model-specific manifests at switch time via a new loadOpenClawModelCompat() function in src/lib/inference/model-specific-setup.ts, ensuring runtime switches receive the same compat flags that generate-openclaw-config.py applies at build time.
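As a rough illustration of the lookup described above, the sketch below scans a directory of JSON manifests and returns the compat object for a matching model. It is not taken from the fix PR: the manifest keys models and compat, the function signature, and the "first match wins" rule are all assumptions made for this example, and the real manifest schema may differ.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical manifest shape: the report names only the two compat flags,
// so the `models` and `compat` keys are assumptions for illustration.
interface ModelSetupManifest {
  models: string[];
  compat: Record<string, unknown>;
}

// Returns the compat object from the first manifest in `dir` that lists
// `modelId`, or null when the directory is missing or nothing matches.
function loadOpenClawModelCompat(
  dir: string,
  modelId: string,
): Record<string, unknown> | null {
  if (!fs.existsSync(dir)) return null;
  const files = fs.readdirSync(dir).filter((f) => f.endsWith(".json")).sort();
  for (const file of files) {
    let manifest: Partial<ModelSetupManifest>;
    try {
      manifest = JSON.parse(fs.readFileSync(path.join(dir, file), "utf8"));
    } catch {
      continue; // skip unreadable or malformed manifests
    }
    if (!manifest || typeof manifest !== "object") continue;
    if (Array.isArray(manifest.models) && manifest.models.includes(modelId) && manifest.compat) {
      return manifest.compat;
    }
  }
  return null;
}

// Demo against a temporary directory standing in for
// nemoclaw-blueprint/model-specific-setup/openclaw/.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "model-setup-"));
fs.writeFileSync(
  path.join(demoDir, "glm-5.1-managed-inference.json"),
  JSON.stringify({
    models: ["z-ai/glm-5.1"],
    compat: { maxTokensField: "max_tokens", requiresStringContent: true },
  }),
);

const glmCompat = loadOpenClawModelCompat(demoDir, "z-ai/glm-5.1");
const otherCompat = loadOpenClawModelCompat(demoDir, "some/other-model");
console.log(glmCompat, otherCompat);
```

In a switch path shaped like this, patchOpenClawInferenceConfig() would assign the returned object to the model entry after clearing the old compat, and leave compat unset when the lookup returns null.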
See fix PR: #3778
Alternatives Considered
- Hardcode glm-5.1 compat in getSandboxInferenceConfig(): rejected because the model-specific setup registry was designed specifically to keep compat declarations out of code conditionals.
- Build-time-only fix (manifest without inference-set.ts change): insufficient because the test switches inference at runtime; the manifest would only help during initial sandbox build.
Category
config_error
Reproduction Steps
- Re-run openclaw-inference-switch-e2e on commit 5a03166 via:
  gh workflow run nightly-e2e.yaml --repo NVIDIA/NemoClaw --ref main \
    -f jobs=openclaw-inference-switch-e2e
- Observe Phase 4 "Live requests after switch": the check_openclaw_agent_turn assertion fails with exit 124 (timeout) and an empty reply.
Environment
- OS: Ubuntu 24.04.4 LTS (GitHub-hosted runner ubuntu-latest, image 20260513.135.3)
- Node.js: v22.22.3 (installed via nvm during test)
- Docker: GitHub-hosted runner default
- NemoClaw: commit 5a031660ffc9d945c9e25d8cfe26409e40b18af3 (main)
- OpenClaw: 2026.4.24
- OpenShell: 0.0.39
- Other: Nightly run ID 26068303685
Debug Output
=== Phase 4: Live requests after switch ===
PASS: Sandbox inference.local returned PONG with z-ai/glm-5.1
FAIL: OpenClaw agent turn failed after switch (exit 124); reply='', raw=''
=== Phase 5: Cleanup ===
Shared NemoClaw gateway preserved. Re-run 'openshell gateway remove nemoclaw' to remove it,
or pass '--cleanup-gateway' / set NEMOCLAW_CLEANUP_GATEWAY=1 next time. (#2166)
✓ Sandbox 'e2e-openclaw-inference-switch' destroyed
PASS: Sandbox e2e-openclaw-inference-switch removed
========================================
OpenClaw inference switch E2E Results:
Passed: 14
Failed: 1
Skipped: 1
Total: 16
========================================
1 test(s) failed.
Process completed with exit code 1.
Logs
N/A
Checklist
Suggested Labels (apply manually after triage): nightly-e2e, auto-diagnosed, ci-failure, VRDC