Fix runtime status error on conversation resume #12718

Merged
tofarr merged 7 commits into main from fix-runtime-status-error-on-resume
Feb 2, 2026

Conversation

tofarr (Collaborator) commented Feb 2, 2026

Summary of PR

This PR fixes an issue where resumed conversations would incorrectly show an error status during startup. The problem occurs because the runtime API marks sandboxes as RUNNING before they are actually fully started, particularly affecting resumed runtimes.

Changes in openhands/server/routes/manage_conversations.py:

  • Added detection logic in get_conversation() to handle the case where a sandbox is marked as RUNNING but has no execution status
  • When this condition is detected, the code checks server responsiveness via the /server_info endpoint
  • If the server is unresponsive or within a 60-second grace period (_RESUME_GRACE_PERIOD), the sandbox status is set to STARTING instead of potentially showing an error
  • Added proper warning logging with conversation and sandbox IDs for debugging

This is a temporary workaround for a bug in the runtime API that marks servers as RUNNING before they are actually started.
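
The decision described above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: `SandboxStatus`, `SandboxView`, and `effective_status` are hypothetical names, the uptime value is assumed to come from the `/server_info` response, and the strict `<` comparison at the grace-period boundary is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# 60-second grace period, as stated in the PR description.
_RESUME_GRACE_PERIOD = 60


class SandboxStatus(Enum):
    """Hypothetical stand-in for the project's sandbox status enum."""
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"


@dataclass
class SandboxView:
    """Hypothetical minimal view of a sandbox: only the fields the check needs."""
    status: SandboxStatus
    execution_status: Optional[str]


def effective_status(sandbox: SandboxView, server_uptime: Optional[float]) -> SandboxStatus:
    """Return the status to show for a sandbox.

    server_uptime is the uptime in seconds reported by the server (assumed to
    come from /server_info), or None if the server did not respond.
    """
    # Only a RUNNING sandbox with no execution status is suspicious.
    if sandbox.status is not SandboxStatus.RUNNING:
        return sandbox.status
    if sandbox.execution_status is not None:
        return sandbox.status
    # Unresponsive, or responsive but still inside the grace period:
    # report STARTING rather than risk surfacing an error.
    if server_uptime is None or server_uptime < _RESUME_GRACE_PERIOD:
        return SandboxStatus.STARTING
    return SandboxStatus.RUNNING
```

For example, `effective_status(SandboxView(SandboxStatus.RUNNING, None), None)` yields `SandboxStatus.STARTING`, matching the "server is unresponsive" case.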

Demo Screenshots/Videos

When we resume a conversation...
image

We have an initial starting state
image

Before changes - We get an error state during startup
image

After changes - The status displays as STARTING until the sandbox is actually running
image

Finally it is ready
image

Change Type

  • Bug fix
  • New feature
  • Breaking change
  • Refactor
  • Other (dependency update, docs, typo fixes, etc.)

Checklist

  • I have read and reviewed the code and I understand what the code is doing.
  • I have tested the code to the best of my ability and ensured it works as expected.

Fixes

Resolves #(issue)

Release Notes

  • Include this change in the Release Notes.

Fixed an issue where resumed conversations would incorrectly display an error status during startup. The system now properly detects when a sandbox is still initializing and shows a "starting" status instead of an error.


To run this PR locally, use the following command:

GUI with Docker:

docker run -it --rm \
  -p 3000:3000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.openhands.dev/openhands/runtime:33cb95c-nikolaik \
  --name openhands-app-33cb95c \
  docker.openhands.dev/openhands/openhands:33cb95c

github-actions (Bot, Contributor) commented Feb 2, 2026

Coverage report

File: openhands/server/routes/manage_conversations.py
Lines missing coverage: 490-530

This report was generated by python-coverage-comment-action

@tofarr tofarr marked this pull request as ready for review February 2, 2026 17:15
Comment thread openhands/server/routes/manage_conversations.py
raymyers (Contributor) commented Feb 2, 2026

Can this behavior change be reflected in either the /conversations route tests in test_conversation_routes.py or a direct test around the _filter_conversations_by_age function?

openhands-agent and others added 2 commits February 2, 2026 20:12
Tests cover the behavior when a sandbox is marked as RUNNING but has no
execution_status (indicating the server may still be starting):

- Server responds with uptime within grace period -> shows STARTING
- Server responds with uptime past grace period -> shows RUNNING
- Server is unresponsive -> shows STARTING
- Sandbox with execution_status set -> skips server check
- Non-RUNNING sandbox -> skips server check

Co-authored-by: openhands <openhands@all-hands.dev>


@pytest.mark.asyncio
async def test_get_conversation_running_sandbox_no_execution_status_server_responds_within_grace_period():

NIT: maybe we can parameterize these tests

malhotra5 (Collaborator) left a comment

LGTM!!

openhands-agent and others added 2 commits February 2, 2026 20:36
Consolidate 5 separate test functions into a single parameterized test
for brevity, as suggested in code review.

Co-authored-by: openhands <openhands@all-hands.dev>
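
A consolidation of this kind might look roughly like the sketch below. It is a table-driven illustration only, not the test added in this PR: `shown_status`, the `"IDLE"` execution status, and the `"PAUSED"` sandbox status are hypothetical, and the real test mocks the sandbox service and HTTP call instead of calling a pure function. With pytest, the same tuples would be passed to `pytest.mark.parametrize`.

```python
from typing import Optional

GRACE_PERIOD_SECONDS = 60  # grace period from the PR description


def shown_status(
    status: str, execution_status: Optional[str], uptime: Optional[float]
) -> tuple[str, bool]:
    """Hypothetical pure version of the check.

    Returns (status to show, whether the server was queried).
    uptime is None when the server is unresponsive.
    """
    if status != "RUNNING" or execution_status is not None:
        return status, False  # server check is skipped
    if uptime is None or uptime < GRACE_PERIOD_SECONDS:
        return "STARTING", True
    return "RUNNING", True


# One row per scenario listed in the earlier commit message:
# (case id, sandbox status, execution_status, uptime, expected status, server checked)
CASES = [
    ("within_grace_period", "RUNNING", None, 10.0, "STARTING", True),
    ("past_grace_period", "RUNNING", None, 120.0, "RUNNING", True),
    ("server_unresponsive", "RUNNING", None, None, "STARTING", True),
    ("execution_status_set", "RUNNING", "IDLE", 10.0, "RUNNING", False),
    ("non_running_sandbox", "PAUSED", None, 10.0, "PAUSED", False),
]


def run_cases() -> list[str]:
    """Run every case and return the ids of the ones that failed."""
    failures = []
    for case_id, status, execution_status, uptime, expected, checked in CASES:
        if shown_status(status, execution_status, uptime) != (expected, checked):
            failures.append(case_id)
    return failures
```

Keeping the scenarios in one table makes it cheap to add a boundary case later, such as an uptime exactly equal to the grace period.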
@tofarr tofarr enabled auto-merge (squash) February 2, 2026 20:43
openhands-ai (Bot) commented Feb 2, 2026

Looks like there are a few issues preventing this PR from being merged!

  • GitHub Actions are failing:
    • Check Package Versions
    • Run Python Tests

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #12718 at branch `fix-runtime-status-error-on-resume`

Feel free to include any additional details that might help me get this PR into a better state.

@tofarr tofarr merged commit 1bb4c84 into main Feb 2, 2026
15 of 18 checks passed
@tofarr tofarr deleted the fix-runtime-status-error-on-resume branch February 2, 2026 21:04
4 participants