[Ubuntu 24.04][Sandbox] Hermes startup leaves download_failed Tirith marker after restart #4511

@zNeill

Description

Template 6047909 validates that Hermes startup recovers when /sandbox/.hermes/.tirith-install-failed contains download_failed. On vm2 with NemoClaw v0.0.53, the Hermes sandbox returns to Ready after active-runtime restart, but the download_failed marker remains and no tirith/download_failed cleanup retry message is emitted.

Environment

Device: vm2, NVIDIA A100-SXM4-40GB
OS: Ubuntu 24.04.4 LTS
Architecture: x86_64
Docker: Docker version 29.5.2
OpenShell CLI: openshell 0.0.44
NemoClaw: nemoclaw v0.0.53
NemoHermes: nemohermes v0.0.53
OpenClaw: OpenClaw v2026.5.22 / Hermes Agent v2026.5.16

Steps to Reproduce

  1. Start with a Ready Hermes sandbox named hermes-rerun.
  2. Plant the marker:
    nemoclaw hermes-rerun exec -- sh -lc 'mkdir -p /sandbox/.hermes && printf download_failed > /sandbox/.hermes/.tirith-install-failed'
  3. Restart the active Docker-driver container:
    docker restart openshell-hermes-rerun-...
  4. Wait 30 seconds and run:
    nemohermes hermes-rerun status
  5. Check marker and startup log:
    nemoclaw hermes-rerun exec -- sh -lc 'ls /sandbox/.hermes/.tirith-install-failed 2>&1 || echo marker absent'
    nemoclaw hermes-rerun exec -- sh -lc 'grep -i "tirith|download_failed" /tmp/nemoclaw-start.log | tail -20 || true'

Expected Result

Hermes startup should remove the download_failed marker or otherwise complete the documented recovery path. The marker check should print marker absent, and startup logs should include the expected retry/cleanup message.
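
For reference, the expected recovery can be sketched as a small shell function. This is a hypothetical illustration of the behaviour described above, not the actual NemoClaw or Hermes startup code: the function name `recover_tirith_marker`, the `MARKER` and `LOG` variables, and the log wording are all assumptions made for the example. Only the marker path, the log path, and the `download_failed` value come from this report.

```shell
#!/bin/sh
# Hypothetical sketch of the expected recovery step; NOT the real startup code.
# MARKER and LOG default to the paths named in this report and can be
# overridden to try the function outside a sandbox.
MARKER="${MARKER:-/sandbox/.hermes/.tirith-install-failed}"
LOG="${LOG:-/tmp/nemoclaw-start.log}"

recover_tirith_marker() {
  # Nothing to do when no marker was left behind.
  [ -f "$MARKER" ] || return 0

  reason=$(cat "$MARKER")
  if [ "$reason" = "download_failed" ]; then
    # Treat a failed download as transient: clear the marker so that the
    # install can be retried on this startup.
    rm -f "$MARKER"
    # Placeholder message; the real log wording is not given in this report.
    echo "tirith: cleared download_failed marker, retrying install" >> "$LOG"
  fi
  # Any other marker content is left untouched.
}
```

With behaviour along these lines, step 5 would print marker absent and the startup log would contain a line mentioning tirith or download_failed.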

Actual Result

The sandbox returned to Phase=Ready and Inference healthy, but the marker still existed:

/sandbox/.hermes/.tirith-install-failed

The startup log grep produced no tirith/download_failed cleanup evidence.
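
One caveat about the log check in step 5: it runs `grep -i` without `-E`. In grep's default basic-regex mode, `|` is a literal character, so that pattern only matches the literal text `tirith|download_failed`. The empty result may therefore not prove that no message was logged. The marker still being present is unaffected by this. A minimal sketch of the difference, using a throwaway sample log whose single line is invented for illustration:

```shell
#!/bin/sh
# Throwaway sample log; the line below is made up for this demonstration only.
sample_log=$(mktemp)
echo "tirith: install retry scheduled" > "$sample_log"

# Basic regex (as in step 5): `|` is literal, so the line does not match.
# `|| true` keeps the script going because grep exits 1 on zero matches.
basic_hits=$(grep -ic "tirith|download_failed" "$sample_log" || true)

# Extended regex: `|` means "either pattern", so the line matches.
extended_hits=$(grep -icE "tirith|download_failed" "$sample_log" || true)

echo "basic=$basic_hits extended=$extended_hits"
rm -f "$sample_log"
```

Against the sandbox, the same change would be to use `grep -iE "tirith|download_failed" /tmp/nemoclaw-start.log` inside the `nemoclaw hermes-rerun exec -- sh -lc '...'` wrapper from step 5.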

Bug Details

Priority: Unprioritized
Action: Dev - Open - To fix
Disposition: Open issue
Module: Machine Learning - NemoClaw
Keyword: NemoClaw, NemoClaw_Agent&Skills, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Sandbox

[NVB#6239962]

Labels

NV QA: Bugs found by the NVIDIA QA Team
area: sandbox: OpenShell sandbox lifecycle, runtime, config, or recovery
integration: hermes: Hermes integration behavior
platform: ubuntu: Affects Ubuntu Linux environments