Skip to content

fix: show historic execution context in iteration detail view#117

Merged
subsy merged 8 commits intomainfrom
fix/iteration-detail-historic-context
Jan 17, 2026
Merged

fix: show historic execution context in iteration detail view#117
subsy merged 8 commits intomainfrom
fix/iteration-detail-historic-context

Conversation

@subsy
Copy link
Copy Markdown
Owner

@subsy subsy commented Jan 17, 2026

Summary

When viewing completed iterations, the TUI now shows the agent, model, and sandbox configuration that was actually used during execution rather than the current settings.

Problem: Previously, if you ran iterations with agent A and model M1, then quit and reopened ralph-tui with agent B and model M2, viewing the completed iteration details would incorrectly show B and M2 instead of A and M1.

Solution: Store execution context (agent, model, sandbox config) in iteration log metadata and display historic values when viewing completed iterations.

Changes

  • Add sandbox fields to IterationLogMetadata (sandboxMode, resolvedSandboxMode, sandboxNetwork)
  • Update persistence layer to store and parse sandbox config
  • Add HistoricExecutionContext interface and historicContext prop to IterationDetailView
  • Load historic metadata from persisted logs in RunApp
  • Pass sandboxConfig when saving iteration logs from engine
  • Load and display historic agent/model in RightPanel for completed iterations
  • Add displayAgentInfo memo to compute correct agent/model based on context

Test plan

  • Run bun run typecheck - passes
  • Run bun run build - passes
  • Run bun test tests/logs/persistence.test.ts - 14 tests pass
  • Manual test: Run ralph-tui with agent X, complete an iteration, quit, restart with agent Y, view completed iteration details - should show agent X

Summary by CodeRabbit

  • New Features

    • Sandbox configuration (mode/network) is now included in iteration logs and shown in iteration details.
    • Completed iterations surface historic execution context (agent, model, sandbox info) in the detail view.
  • Tests

    • Added comprehensive persistence tests covering metadata round-trips, sandbox scenarios, filename/dir handling and parsing.
    • Added tests for agent output parsing variations.
  • Chores

    • .gitignore updated to ignore .beads/daemon-error.

✏️ Tip: You can customize this high-level summary in your review settings.

When viewing completed iterations, the detail view now shows the agent,
model, and sandbox configuration that was actually used during execution
rather than the current settings.

Changes:
- Add sandbox fields to IterationLogMetadata (sandboxMode, resolvedSandboxMode, sandboxNetwork)
- Update persistence layer to store and parse sandbox config
- Add HistoricExecutionContext interface and prop to IterationDetailView
- Load historic metadata from persisted logs in RunApp
- Pass sandboxConfig when saving iteration logs from engine
- Load and display historic agent/model in RightPanel for completed iterations
- Add displayAgentInfo memo to compute correct agent/model based on context

Also adds tests for sandbox config persistence in logs/persistence.test.ts
@vercel
Copy link
Copy Markdown

vercel bot commented Jan 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Review Updated (UTC)
ralph-tui Ignored Ignored Jan 17, 2026 8:49am

Request Review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Jan 17, 2026

Walkthrough

The PR records sandbox configuration (mode, resolved mode, network) into iteration log metadata, threads sandboxConfig from the engine into persistence, surfaces historic execution context (agent/model/sandbox) in the TUI, and adds tests validating persistence, filenames and output parsing.

Changes

Cohort / File(s) Summary
Logging core
src/logs/persistence.ts, src/logs/types.ts
Added sandbox fields (sandboxConfig, resolvedSandboxMode) to options; extended IterationLogMetadata with sandboxMode, resolvedSandboxMode, sandboxNetwork; updated header render/parse, metadata build/load, and exported test helpers.
Engine integration
src/engine/index.ts
Thread this.config.sandbox into saveIterationLog payload as sandboxConfig (end‑of‑iteration logging).
TUI / Historic context
src/tui/components/IterationDetailView.tsx, src/tui/components/RunApp.tsx
Added HistoricExecutionContext and historicContext prop; propagate historic agent/model/sandbox from logs into UI; extended historical cache to store agentPlugin/model; unified selection via effectiveTaskId.
Tests
tests/logs/persistence.test.ts, tests/tui/output-parser.test.ts
New comprehensive persistence tests for sandbox fields, auto/resolved modes, round‑trip save/load, filename/dir logic; added output-parser tests for agentPlugin variants.
Misc
.gitignore
Added .beads/daemon-error ignore rule.

Sequence Diagram(s)

sequenceDiagram
    participant Engine as Engine
    participant Persistence as LogPersistence
    participant Disk as Disk
    participant TUI as TUI

    Engine->>Persistence: saveIterationLog(result, { sandboxConfig, resolvedSandboxMode })
    Persistence->>Persistence: buildMetadata(sandboxConfig, resolvedSandboxMode)
    Persistence->>Persistence: formatMetadataHeader(sandboxMode, resolvedSandboxMode, sandboxNetwork)
    Persistence->>Disk: write(metadata + content)

    Disk-->>Persistence: file stored
    TUI->>Disk: loadIterationLog(filePath)
    Disk-->>Persistence: read(log file)
    Persistence->>Persistence: parseMetadataHeader() → IterationLogMetadata{agentPlugin, model, sandboxMode, resolvedSandboxMode, sandboxNetwork}
    Persistence-->>TUI: iteration result + metadata
    TUI->>TUI: set historicContext from metadata
    TUI->>TUI: render IterationDetailView(historicContext)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🐰
I hopped a log from engine deep,
Saved mode and net for dreams to keep,
Agent, model in the thread,
TUI reads what history said,
Tests applaud with tiny leaps. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: show historic execution context in iteration detail view' directly and clearly describes the main change: displaying historic execution context in the iteration detail view component.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/tui/components/RunApp.tsx (2)

1140-1199: Handle log-load failures to avoid unhandled rejections.
A rejected getIterationLogsByTask currently bubbles without a catch, which can leave the cache empty and retrigger disk reads. Add error handling and cache a sentinel value.

💡 Suggested error handling
-      getIterationLogsByTask(cwd, taskId).then((logs) => {
+      getIterationLogsByTask(cwd, taskId)
+        .then((logs) => {
           if (logs.length > 0) {
             // ...
           } else {
             setHistoricalOutputCache((prev) => {
               const next = new Map(prev);
               next.set(taskId, { output: '', timing: {} });
               return next;
             });
           }
-      });
+        })
+        .catch((err) => {
+          console.error('Failed to load iteration logs:', err);
+          setHistoricalOutputCache((prev) => {
+            const next = new Map(prev);
+            next.set(taskId, { output: '', timing: {} });
+            return next;
+          });
+        });

1201-1262: Guard against stale async updates when switching iterations.
The log fetch can resolve after a different iteration is selected, briefly showing the wrong historic context. Clear state up-front and ignore stale responses.

🧹 Suggested stale-response guard
   useEffect(() => {
     if (viewMode !== 'iteration-detail' || !detailIteration || !cwd) {
       // Clear data when not in detail view
       setIterationDetailSubagentTree(undefined);
       setIterationDetailSubagentStats(undefined);
       setIterationDetailSubagentLoading(false);
       setIterationDetailHistoricContext(undefined);
       return;
     }
 
+    let cancelled = false;
+    setIterationDetailHistoricContext(undefined);
+    setIterationDetailSubagentTree(undefined);
+    setIterationDetailSubagentStats(undefined);
+
     // Check if we already have the stats cached
     const cachedStats = subagentStatsCache.get(detailIteration.iteration);
     if (cachedStats) {
       setIterationDetailSubagentStats(cachedStats);
     }
 
     setIterationDetailSubagentLoading(true);
 
     getIterationLogsByTask(cwd, detailIteration.task.id).then(async (logs) => {
+      if (cancelled) return;
       // existing logic...
     }).catch(() => {
+      if (cancelled) return;
       setIterationDetailSubagentLoading(false);
       setIterationDetailSubagentTree(undefined);
       setIterationDetailSubagentStats(undefined);
       setIterationDetailHistoricContext(undefined);
     });
+    return () => {
+      cancelled = true;
+    };
   }, [viewMode, detailIteration, cwd, subagentStatsCache]);
🤖 Fix all issues with AI agents
In @.beads/daemon-error:
- Around line 12-13: Update the message that suggests running "rm -rf .beads &&
bd init" to explicitly instruct users to execute it from the repository root or
only after verifying the .beads path exists; modify the sentence in
.beads/daemon-error that currently shows the command so it reads something like
"From the repository root (or after verifying the .beads path), run: rm -rf
.beads && bd init" to avoid accidental destructive runs outside the repo.

In `@src/tui/components/RunApp.tsx`:
- Around line 1104-1138: The displayAgentInfo computation can become
inconsistent with the right-panel output because it uses
iterationSelectedIndex/selectedIter while the output elsewhere still reads from
selectedTask; unify the selection source so both components use the same
resolved item: when viewMode === 'iterations' derive a single selectedEntity
(e.g., selectedIter from iterations[iterationSelectedIndex]) and use that same
selectedEntity to read agent/model and the right-panel output instead of reading
selectedTask; update displayAgentInfo to use selectedIter (and
historicalOutputCache.get(selectedIter.task.id)) and ensure the right-panel code
reads the same selectedIter/selectedEntity so agent/model and output remain in
sync (referencing displayAgentInfo, iterationSelectedIndex, selectedTask,
iterations, selectedIter, viewMode, historicalOutputCache).

subsy added 2 commits January 17, 2026 07:57
- Remove accidentally committed .beads/daemon-error file
- Add .beads/daemon-error to .gitignore
- Unify selection source: create effectiveTaskId to ensure displayAgentInfo
  and RightPanel use the same task reference across both views
- Simplify displayAgentInfo to use effectiveTaskId consistently
- Skip file I/O tests in CI due to module mocking interference

The effectiveTaskId ensures that when viewing iterations, the agent/model
display uses the task ID from the selected iteration rather than potentially
mismatching with the tasks list selection.
selectedTaskIteration was using selectedTask.id to look up iteration
data, but displayAgentInfo was using effectiveTaskId. This caused a
mismatch in iterations view where the output could come from a
different task than the one selected.

Now both use effectiveTaskId which correctly handles:
- Tasks view: uses selected task's ID
- Iterations view: uses selected iteration's task ID

This ensures the output and agent info are always for the same task.
@codecov
Copy link
Copy Markdown

codecov bot commented Jan 17, 2026

Codecov Report

❌ Patch coverage is 73.13433% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.29%. Comparing base (f9829c5) to head (a016ea7).
⚠️ Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
src/logs/persistence.ts 72.72% 18 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #117      +/-   ##
==========================================
+ Coverage   41.95%   43.29%   +1.33%     
==========================================
  Files          59       59              
  Lines       12690    12728      +38     
==========================================
+ Hits         5324     5510     +186     
+ Misses       7366     7218     -148     
Files with missing lines Coverage Δ
src/engine/index.ts 64.92% <100.00%> (+0.03%) ⬆️
src/logs/types.ts 100.00% <ø> (ø)
src/logs/persistence.ts 37.22% <72.72%> (+33.50%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/tui/components/RunApp.tsx`:
- Around line 1075-1077: The memo incorrectly uses selectedTask (from
displayedTasks[selectedIndex]) to compute isActiveTask causing live output from
another task to be shown; update the logic to derive activity from the effective
task (use selectedIteration?.task?.status or lookup the task by effectiveTaskId)
so isActiveTask reflects the effectiveTask's status, then recompute isExecuting
(currentTaskId === effectiveTaskId || isActiveTask) to ensure the memo returns
the correct iteration output for the effectiveTaskId; adjust any null/undefined
checks around selectedIteration, effectiveTaskId, and currentTaskId accordingly
(symbols: selectedTask, selectedIteration, effectiveTaskId, isActiveTask,
isExecuting, currentTaskId, displayedTasks, selectedIndex).
🧹 Nitpick comments (2)
tests/logs/persistence.test.ts (2)

128-131: Skipping tests in CI may mask regressions.

Whilst the comment explains the module mocking interference issue, skipping file I/O tests in CI means these critical round-trip tests won't catch regressions in the CI pipeline. Consider:

  1. Isolating the mock scope in execution-engine.test.ts to prevent global interference
  2. Using separate test runs or test configurations to run persistence tests in isolation in CI
  3. Using bun test --preload to control module resolution order

This is acceptable as a temporary workaround, but tracking this as technical debt would be prudent.


165-172: Missing assertion for resolvedSandboxMode in round-trip test.

The test saves resolvedSandboxMode: 'bwrap' but doesn't verify it's correctly loaded. Consider adding an assertion for completeness:

Suggested addition
     expect(loaded!.metadata.sandboxMode).toBe('bwrap');
+    expect(loaded!.metadata.resolvedSandboxMode).toBe('bwrap');
     expect(loaded!.metadata.sandboxNetwork).toBe(false);

Tests verify that output parsing works correctly with:
- Explicit claude agentPlugin
- Undefined agentPlugin (fallback to generic parser)
- Droid agentPlugin (uses droid-specific parser)
- Short unparseable JSON handling

These tests are relevant to the fix for iteration output display, where
the correct agent name must be passed to parseAgentOutput.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/tui/output-parser.test.ts`:
- Around line 131-137: The test name "should return empty for short unparseable
JSON" contradicts the assertion which expects the raw short string; update the
test description to reflect the actual behavior (e.g., rename to "should return
raw output for short unparseable JSON") or, if the intended behavior is to
return empty, change the assertion to expect '' instead; locate the test using
the parseAgentOutput call in the test block and adjust only the test title or
the expect(...) accordingly so name and assertion match.

subsy added 3 commits January 17, 2026 08:29
RunApp.tsx:
- The isActiveTask check was incorrectly using selectedTask (from the tasks
  view) even when in iterations view. This could show live output from the
  wrong task.
- Now properly derives the effective task status:
  - Iterations view: uses selectedIteration.task.status
  - Tasks view: uses selectedTask.status
- Also handles 'in_progress' status in addition to 'active' for robustness.

output-parser.test.ts:
- Fixed test name to match actual behavior: "should return raw output for
  short unparseable JSON" (was incorrectly named "should return empty").
Tests cover:
- Old config-only signature for backward compatibility
- epicId from config
- error from result
- completionSummary option
- agentSwitches with fallback/primary reasons
- Empty agentSwitches array handling
- Task description from result
- All basic fields copied from result

These tests improve coverage of the persistence module's buildMetadata
function which is used to construct iteration log metadata.
Exports internal functions (formatMetadataHeader, parseMetadataHeader,
formatDuration) via __test__ object for unit testing.

Adds comprehensive tests for:
- formatDuration: seconds, minutes, hours formatting
- formatMetadataHeader: all optional fields including description
  truncation, error, epicId, sandbox config, completion summary,
  agent switches
- parseMetadataHeader: round-trip tests, sandbox config parsing,
  error field, epicId

This improves patch coverage to meet the 50% threshold by testing
the newly added sandbox configuration handling code paths.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant