
Log all terminal states (error, stuck) in V1 callback processors#13549

Merged
raymyers merged 2 commits into main from
log-all-terminal-states
Mar 24, 2026
@erisfully erisfully commented Mar 23, 2026

Summary

This PR fixes conversation failure rate tracking by ensuring all terminal states are logged in V1 callback processors.

Problem

Previously, the V1 callback processors (GitHub, Slack, GitLab) only logged when conversations reached the finished state:

# Only act when execution has finished
if not (event.key == 'execution_status' and event.value == 'finished'):
    return None

_logger.info('[GitHub V1] Callback agent state was %s', event)

This meant error and stuck terminal states were never logged, making it impossible to track conversation failure rates from Datadog logs.

Solution

Log every execution_status event, and still request a summary only when the status is finished:

if event.key != 'execution_status':
    return None

# Log ALL terminal states for monitoring (finished, error, stuck)
_logger.info('[GitHub V1] Callback agent state was %s', event)

# Only request summary when execution has finished successfully
if event.value != 'finished':
    return None

Impact

After this change, Datadog will receive logs for all terminal states:

  • [GitHub V1] Callback agent state was ConversationStateUpdate(key=execution_status, value=finished)
  • [GitHub V1] Callback agent state was ConversationStateUpdate(key=execution_status, value=error)
  • [GitHub V1] Callback agent state was ConversationStateUpdate(key=execution_status, value=stuck)

This enables accurate conversation failure rate monitoring using the formula:

Failure Rate = (ERROR + STUCK) / (ERROR + STUCK + FINISHED) * 100
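As a rough illustration of the formula above, the sketch below counts terminal states from log lines in the format shown in this PR and computes the percentage. The helper names (`count_terminal_states`, `failure_rate`) and the regular expression are assumptions made for this example; they are not part of the PR or of any Datadog API, and a real dashboard would more likely express this as a log query.

```python
import re
from collections import Counter

# Hypothetical pattern matching the status in log lines such as
# "... ConversationStateUpdate(key=execution_status, value=error)".
_STATUS_RE = re.compile(r'key=execution_status, value=(\w+)')

# The three terminal states named in this PR.
TERMINAL_STATES = ('finished', 'error', 'stuck')


def count_terminal_states(log_lines):
    """Count finished/error/stuck occurrences; other lines are skipped."""
    counts = Counter()
    for line in log_lines:
        match = _STATUS_RE.search(line)
        if match and match.group(1) in TERMINAL_STATES:
            counts[match.group(1)] += 1
    return counts


def failure_rate(counts):
    """(ERROR + STUCK) / (ERROR + STUCK + FINISHED) * 100, or 0.0 if empty."""
    failed = counts['error'] + counts['stuck']
    total = failed + counts['finished']
    if total == 0:
        return 0.0
    return 100 * failed / total
```

Filtering on the three terminal values keeps any other execution_status value that might appear in the logs out of the denominator.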

Files Changed

  • enterprise/integrations/github/github_v1_callback_processor.py
  • enterprise/integrations/slack/slack_v1_callback_processor.py
  • enterprise/integrations/gitlab/gitlab_v1_callback_processor.py

Testing

The change is minimal and preserves existing behavior apart from the added logging:

  • ✅ All execution_status events are logged (new behavior)
  • ✅ Summary requests only happen on finished (unchanged behavior)
  • ✅ Non-execution_status events are still ignored (unchanged behavior)
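The three behaviors listed above could be checked with a small self-contained sketch like the following. It does not import the real processors: `StateUpdate`, `process`, and `run_checks` are hypothetical stand-ins that mirror only the gating logic shown in the Solution section, and the `'summary requested'` return value is a placeholder for the real summary request.

```python
import logging
from dataclasses import dataclass

_logger = logging.getLogger('v1_callback_sketch')


@dataclass
class StateUpdate:
    """Hypothetical stand-in for ConversationStateUpdate."""
    key: str
    value: str


def process(event):
    """Mirror of the gating logic from the Solution section."""
    if event.key != 'execution_status':
        return None
    # Logged for every execution_status event, including error and stuck.
    _logger.info('[GitHub V1] Callback agent state was %s', event)
    if event.value != 'finished':
        return None
    return 'summary requested'  # placeholder for the real summary request


class _ListHandler(logging.Handler):
    """Collects formatted log messages in memory."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def run_checks():
    """Run the gate over sample events; return results and captured logs."""
    handler = _ListHandler()
    _logger.addHandler(handler)
    _logger.setLevel(logging.INFO)
    try:
        results = {
            value: process(StateUpdate('execution_status', value))
            for value in ('finished', 'error', 'stuck')
        }
        ignored = process(StateUpdate('other_key', 'finished'))
    finally:
        _logger.removeHandler(handler)
    return results, ignored, handler.messages
```

Because the coverage report in this thread lists the changed lines as uncovered, a test along these lines written against the actual processor classes would be a natural follow-up.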

To run this PR locally, use the following command:

GUI with Docker:

docker run -it --rm \
  -p 3000:3000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.openhands.dev/openhands/runtime:71c337e-nikolaik \
  --name openhands-app-71c337e \
  docker.openhands.dev/openhands/openhands:71c337e

Previously, V1 callback processors (GitHub, Slack, GitLab) only logged when
conversations reached the 'finished' state. This made it impossible to track
conversation failure rates since 'error' and 'stuck' terminal states were
never logged.

This change moves the logging to happen for ALL execution_status events,
while still only requesting summaries when the status is 'finished'.

After this change, Datadog will receive logs like:
- [GitHub V1] Callback agent state was ConversationStateUpdate(key=execution_status, value=error)
- [GitHub V1] Callback agent state was ConversationStateUpdate(key=execution_status, value=stuck)

This enables accurate conversation failure rate monitoring in the engineering
KPIs dashboard.

Co-authored-by: openhands <openhands@all-hands.dev>
github-actions Bot commented Mar 23, 2026

Coverage report

Lines missing coverage, by file:

  • enterprise/integrations/github/github_v1_callback_processor.py: 47-60
  • enterprise/integrations/gitlab/gitlab_v1_callback_processor.py: 45-58
  • enterprise/integrations/slack/slack_v1_callback_processor.py: 44-59

This report was generated by python-coverage-comment-action

@erisfully erisfully marked this pull request as ready for review March 23, 2026 19:07
@all-hands-bot all-hands-bot left a comment

🟢 Good taste - Simple, pragmatic solution.

Key Insight: The refactored code is actually MORE readable than the original. Replacing if not (event.key == 'execution_status' and event.value == 'finished') with two sequential checks eliminates the complex negative condition. The logging now correctly captures all terminal states (error, stuck, finished) for proper failure rate tracking.

Verdict: ✅ Worth merging. This solves a real production monitoring gap without adding complexity.

@raymyers raymyers merged commit 19da63a into main Mar 24, 2026
26 of 27 checks passed
@raymyers raymyers deleted the log-all-terminal-states branch March 24, 2026 18:04
@mamoodi mamoodi added the release:cloud-1.19.0 Included in release 1.19.0 label Mar 30, 2026 — with OpenHands AI