Docs: add deterministic replay architecture and TDD plan#108

Merged
dgarson merged 1 commit into feat/deterministic-replay from codex/evaluate-efficiency-of-message-log-structure
Feb 24, 2026

@dgarson dgarson commented Feb 23, 2026

Motivation

  • Provide a concrete architecture and phased implementation strategy for deterministic replay of agent runs, so that runs can be replayed offline, debugged, and checked for divergence.
  • Reuse existing low-risk runtime seams (src/infra/agent-events.ts, and the runEmbeddedPiAgent and attempt flow) to keep changes non-intrusive and limit implementation complexity.
  • Define a clear fidelity contract, redaction policy, and rollout plan so capture/replay can be adopted incrementally without breaking current behavior.

Description

  • Add a comprehensive design document at src/agents/deterministic-replay-design.md that defines core components: CaptureEvent, CaptureSink, CaptureContext, ReplaySource, and ReplayEngine.
  • Specify a minimal, incremental implementation approach including an event-first FileCaptureSink (JSONL) and a CaptureManifest format, with later adapters for Date.now()/Math.random() and network/tool wrappers (bash-tools.exec-runtime.ts, web-fetch.ts, web-search.ts).
  • Provide a phased plan (capture-only, replay mode, deeper sealing) and a small API surface for runtime injection via a CaptureContext with recordEvent, recordNondeterministic, and nextNondeterministic helpers.
  • Deliver a TDD matrix listing the unit, integration, adapter, and compatibility tests to be added under src/agents/replay/*, plus concrete integration test targets that reuse the existing reasoning-replay tests for fidelity checks.
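
To make the capture/replay contract above more concrete, here is a minimal sketch of how a CaptureContext might record a nondeterministic value during capture and hand it back during replay. Everything below is an assumption for illustration: the design document is not shown in this PR, so the field names, the `MemoryJsonlSink` stand-in for FileCaptureSink, the `createCaptureContext`/`createReplayContext` factories, and the `deterministicNow` adapter are hypothetical. Only the type names and the three helper names come from the description.

```typescript
// Hypothetical event shapes; the PR names CaptureEvent but does not list its fields.
type NondeterministicEvent = { kind: "nondeterministic"; seq: number; source: string; value: unknown };
type CaptureEvent =
  | { kind: "event"; seq: number; name: string; payload: unknown }
  | NondeterministicEvent;

interface CaptureSink {
  write(event: CaptureEvent): void;
}

// In-memory stand-in for a JSONL file sink: one JSON object per line.
class MemoryJsonlSink implements CaptureSink {
  readonly lines: string[] = [];
  write(event: CaptureEvent): void {
    this.lines.push(JSON.stringify(event));
  }
}

interface CaptureContext {
  recordEvent(name: string, payload: unknown): void;
  recordNondeterministic<T>(source: string, value: T): T;
  nextNondeterministic<T>(source: string): T;
}

// Capture mode: every event and nondeterministic value is appended to the sink.
function createCaptureContext(sink: CaptureSink): CaptureContext {
  let seq = 0;
  return {
    recordEvent(name: string, payload: unknown): void {
      sink.write({ kind: "event", seq: seq++, name, payload });
    },
    recordNondeterministic<T>(source: string, value: T): T {
      sink.write({ kind: "nondeterministic", seq: seq++, source, value });
      return value;
    },
    nextNondeterministic<T>(_source: string): T {
      throw new Error("nextNondeterministic is only available in replay mode");
    },
  };
}

// Replay mode: recorded values are returned in order; a mismatch is a divergence.
function createReplayContext(jsonl: string[]): CaptureContext {
  const recorded = jsonl
    .map((line) => JSON.parse(line) as CaptureEvent)
    .filter((e): e is NondeterministicEvent => e.kind === "nondeterministic");
  let cursor = 0;
  return {
    recordEvent(): void {},
    recordNondeterministic<T>(_source: string, value: T): T {
      return value;
    },
    nextNondeterministic<T>(source: string): T {
      const next = recorded[cursor];
      if (!next || next.source !== source) {
        throw new Error(`replay divergence at index ${cursor}: expected ${source}`);
      }
      cursor += 1;
      return next.value as T;
    },
  };
}

// Hypothetical Date.now() adapter: reads the real clock in capture, the log in replay.
function deterministicNow(ctx: CaptureContext, mode: "capture" | "replay"): number {
  return mode === "capture"
    ? ctx.recordNondeterministic("Date.now", Date.now())
    : ctx.nextNondeterministic<number>("Date.now");
}

// Usage: capture one timestamp, then replay it from the recorded lines.
const sink = new MemoryJsonlSink();
const capture = createCaptureContext(sink);
const capturedAt = deterministicNow(capture, "capture");
capture.recordEvent("agent.start", { runId: "run-1" });
const replay = createReplayContext(sink.lines);
const replayedAt = deterministicNow(replay, "replay"); // same value as capturedAt
```

Checking the `source` label on each replayed value is one simple way to surface divergence early. A real implementation would likely also compare sequence numbers and apply the redaction policy before writing.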

Testing

  • Ran pnpm format on the new document; the command completed successfully.
  • No automated unit or integration tests were added or executed as part of this change; the design file includes a prioritized TDD plan for the follow-up implementation tests.

@dgarson dgarson changed the base branch from main to dgarson/fork February 23, 2026 18:22
@dgarson dgarson changed the base branch from dgarson/fork to feat/deterministic-replay February 24, 2026 01:04
@dgarson dgarson merged commit 37c2424 into feat/deterministic-replay Feb 24, 2026
2 of 10 checks passed
dgarson added a commit that referenced this pull request Feb 24, 2026
* feat: scaffold deterministic replay framework

* feat(replay): add manifest schema validation helpers

* feat(sessions): add replay session manifest types and serialization helpers

Add session-level replay manifest types distinct from the replay bundle
format in src/replay/types.ts (Nate's work in PR #92).

- ReplaySessionManifest: session metadata for replay-capable sessions
- ReplaySessionEvent: session-level events in the replay lifecycle
- ReplaySessionEventLog: collection of session events
- Serialization helpers: parse/serialize functions
- Factory functions: createReplaySessionId, createReplaySessionManifest, createReplaySessionEvent
- exportSessionManifest stub: placeholder for future storage integration

Add 43 focused tests covering:
- Schema validation for all types
- Serialization round-trips
- Factory function behavior
- Edge cases and error handling

* feat(replay): parse replay events and normalize recording categories

* feat(sessions): add replay session manifest types and serialization helpers (#98)

* docs: add deterministic replay architecture and TDD plan (#108)

* Replay: harden recorder and deterministic clock edge cases (#182)