Refactor runtime state into SQLite by steipete · Pull Request #78595 · openclaw/openclaw

steipete · 2026-05-06T19:12:30Z

Summary

This PR is the database-first runtime/state refactor for #78096. It moves OpenClaw away from scattered JSON, JSONL, sidecar SQLite, lock-file, pruning, and truncation-repair paths toward a typed SQLite storage model with one shared control-plane database and one data-plane database per agent.

Current rule of the road:

openclaw.json, plugin manifests, Git checkouts, explicit credential source files, and real user workspaces remain file-backed configuration or user content.
Runtime data, caches, ledgers, auth profiles, session rows, transcript events, plugin state, scheduler state, task state, and agent scratch/artifact state live in SQLite.
Legacy files are migration inputs, explicit debug/export output, or real workspace files. They are not compatibility stores that normal runtime code keeps reading and writing.
Runtime code must not write session JSON, transcript JSONL, auth-profile JSON, plugin-cache JSON, or file-backed ACPX/session stores anymore.

Refs #78096

Current State

Updated May 10, 2026.

The PR branch has the main architecture in place:

global state DB for gateway/control-plane data
per-agent DB for session/transcript/VFS/artifact data
generated Kysely types from SQL schemas
native node:sqlite runtime access
doctor/migrate as the legacy-file import boundary
SQLite session rows and transcript events
SQLite auth profiles and model catalog/cache state
SQLite plugin state/blob stores
SQLite ACPX session state
SQLite VFS, tool artifact, and run artifact stores
worker-prepared run plumbing with serializable storage boundaries

The active cleanup pass is now focused on deleting/refactoring stale file-era assumptions from tests and runtime seams. Recent cleanup removed or rewired tests that still pretended active sessions/transcripts/auth lived in JSON/JSONL files, and tightened docs so transcript JSONL is migration-only input, never a runtime locator or bridge.

This is not merge-ready until the latest cleanup is pushed and the current Crabbox/Testbox validation is green.

Reviewer Mental Model

Review this as a storage-boundary refactor, not as a single session bug fix.

Global database = control plane. ~/.openclaw/state/openclaw.sqlite owns gateway-wide registries, plugin state/blob rows, cron/task/flow state, queues, sandbox/browser/device/pairing data, migration bookkeeping, auth-profile KV rows, and shared caches.
Per-agent database = data plane. ~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite owns sessions, transcripts, ACP/subagent rows, VFS rows, tool/run artifacts, and agent-local cache data for that one workspace.
Configuration remains files. openclaw.json, plugin manifests, Git checkouts, and real workspace files stay file-backed by design. Credential source files remain only where the user explicitly configured a file-backed secret source; auth profile runtime state itself is SQLite-backed.
Doctor/migrate is the compatibility boundary. Legacy JSON/JSONL/sidecar SQLite files are imported once, verified, recorded, and removed after successful migration. Runtime code should not silently import, repair, prune, or rewrite legacy session/cache/auth files while handling normal gateway traffic.
Kysely is the typed query layer. SQL files are the schema source of truth, generated Kysely types come from a real temporary SQLite database, and runtime uses OpenClaw's native node:sqlite dialect rather than a second SQLite runtime driver.
Worker/VFS readiness is part of the shape. Agent workers get serializable storage boundaries and open their own global/per-agent DB handles; gateway-owned streaming, cancellation, and delivery stay in the parent process.

Database Layout

Global State Database

~/.openclaw/state/openclaw.sqlite

Owns data that must be visible across the gateway and agents:

agent registry and per-agent database discovery
auth-profile secrets/state KV rows and SQLite-backed refresh coordination
plugin runtime state, plugin blobs, setup/migration bookkeeping, and installed-plugin indexes
device, node, pairing, sandbox, browser, and delivery queue state
cron job definitions, cron runtime state, and cron run history
task and Task Flow registries
commitments and small scoped key-value state
migration/backup metadata and shared cache rows

The global DB is intentionally not where large agent-local transcripts, VFS rows, or artifacts accumulate.

Per-Agent Database

~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite

Owns state that belongs to one agent/workspace:

session entries and route metadata
transcript events, snapshots, and checkpoint metadata
ACP/subagent run metadata
SQLite-backed VFS rows for agent scratch/workspace data
tool artifacts and run artifacts
agent-local runtime/cache rows that can later become searchable or indexable

The global database registers where each agent database lives. The agent database owns the write-heavy per-agent lane, so transcripts and artifacts do not become a global gateway bottleneck.
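
As a rough illustration of the two-database layout, the path shapes above can be derived from a single state root. This is a minimal sketch: only the two relative path shapes come from this PR description, while `joinPath`, `resolveGlobalDbPath`, `resolveAgentDbPath`, and the agent-ID character rule are hypothetical and assume POSIX-style separators.

```typescript
// Hypothetical helpers; only the two path shapes come from the PR description.
const GLOBAL_DB_RELATIVE = "state/openclaw.sqlite";

// Assumption: agent IDs use a conservative character set so they cannot
// escape the agents/ directory (for example via "../").
const AGENT_ID_PATTERN = /^[A-Za-z0-9_-]+$/;

// Join path segments with "/" and drop redundant slashes at the seams.
function joinPath(...parts: string[]): string {
  return parts
    .map((p, i) => (i === 0 ? p.replace(/\/+$/, "") : p.replace(/^\/+|\/+$/g, "")))
    .join("/");
}

// Control plane: one shared database under the state root.
function resolveGlobalDbPath(stateRoot: string): string {
  return joinPath(stateRoot, GLOBAL_DB_RELATIVE);
}

// Data plane: one database per agent.
function resolveAgentDbPath(stateRoot: string, agentId: string): string {
  if (!AGENT_ID_PATTERN.test(agentId)) {
    throw new Error(`invalid agent id: ${agentId}`);
  }
  return joinPath(stateRoot, "agents", agentId, "agent", "openclaw-agent.sqlite");
}
```

In this sketch the global database would store the result of `resolveAgentDbPath` in its agent registry, so discovery does not depend on scanning the filesystem.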

Schema And Access

The database schemas are SQL-first and generated into TypeScript from real temporary SQLite databases:

src/state/openclaw-state-schema.sql -> src/state/openclaw-state-db.generated.d.ts and src/state/openclaw-state-schema.generated.ts
src/state/openclaw-agent-schema.sql -> src/state/openclaw-agent-db.generated.d.ts and src/state/openclaw-agent-schema.generated.ts
owner-local schemas, such as proxy capture, keep their own SQL and generated files near the owning store

Runtime code uses Kysely over OpenClaw's native node:sqlite dialect. better-sqlite3 is used only by kysely-codegen for dev-time introspection; it is not a runtime driver. Runtime stores use typed Kysely queries, transactions, unique keys, conflict handling, and explicit row patches instead of whole-file mutation.
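
To make "explicit row patches instead of whole-file mutation" concrete, here is a minimal sketch of an upsert that touches only the supplied columns. A `Map` stands in for a SQLite table with a unique key so the example runs without a database; `SessionRow`, `SessionPatch`, `SessionTable`, and the column names are hypothetical.

```typescript
// In-memory stand-in for a table with a unique key; column names are hypothetical.
interface SessionRow {
  sessionKey: string;
  title: string;
  updatedAt: number;
  route: string | null;
}

// A patch may set any column except the key itself.
type SessionPatch = Partial<Omit<SessionRow, "sessionKey">>;

class SessionTable {
  private rows = new Map<string, SessionRow>();

  // Insert a new row, or on key conflict patch only the supplied columns.
  // Columns absent from the patch keep their current values.
  upsert(sessionKey: string, patch: SessionPatch, now: number): SessionRow {
    const existing = this.rows.get(sessionKey);
    const base: SessionRow = existing ?? { sessionKey, title: "", updatedAt: now, route: null };
    const next: SessionRow = { ...base, ...patch, updatedAt: patch.updatedAt ?? now };
    this.rows.set(sessionKey, next);
    return next;
  }

  get(sessionKey: string): SessionRow | undefined {
    return this.rows.get(sessionKey);
  }
}
```

The point of the pattern is that two writers updating different columns of the same row do not clobber each other, which a whole-file rewrite cannot guarantee. With Kysely this would typically be an insert with an `onConflict` clause; the exact queries used in the PR are not shown here.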

Refactor Scope

This branch moves or removes the file-era runtime paths across the codebase:

Session indexes move from sessions.json to per-agent session_entries rows.
Transcript runtime reads/writes move from JSONL tail/read/append paths to SQLite transcript tables.
Runtime session identity becomes { agentId, sessionKey }, not a legacy storePath or transcript locator.
Auth profiles move from auth-profiles.json / auth-state.json into SQLite-backed state, with doctor-owned import for legacy files.
Gateway session history, reset, compaction, route updates, ACP metadata, heartbeat-isolated runs, fast abort, subagent spawning, TUI session state, and auto-reply session updates use SQLite row helpers.
Channel/plugin runtime APIs carry session identity and SQLite-backed metadata instead of requiring callers to pass session-store file paths.
Plugin SDK surfaces expose database-backed session and plugin-state helpers; legacy path helpers are narrowed to migration/export/debug boundaries.
Memory search session indexing now uses transcript terminology end to end; host exports list/build session transcript helpers, targeted sync queues sessionTranscripts, and QMD/builtin indexers no longer expose session-file helper names.
ACPX service state uses a SQLite/plugin-state-backed ACP session store instead of the upstream file-backed runtime store.
Plugin/runtime ledgers, installed-plugin indexes, task/flow state, cron state, commitments, delivery queues, sandbox/browser registries, pairing/device state, model catalog/cache state, and small plugin caches move into SQLite-backed stores.
Extension-owned caches and state have explicit migration hooks where ownership belongs to the plugin, including Discord, Matrix, Microsoft Teams, Telegram, QQBot, Feishu, Nostr, iMessage, and related channel/plugin surfaces.
SQLite VFS, tool artifact, and run artifact stores give agents a database-backed workspace/scratch option.
Worker-prepared agent runs use serializable storage boundaries so Node workers can open their own database-backed VFS/cache/artifact stores while gateway-owned streaming, cancellation, and reply operations stay in the parent process.
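
The `{ agentId, sessionKey }` identity from the list above can be sketched as a small type plus a lookup key. Only the two field names come from this PR description; `sessionIdentityKey` and `SessionRows` are hypothetical, and a `Map` stands in for per-agent SQLite rows.

```typescript
// Runtime identity from the PR description: no store path, no transcript locator.
interface SessionIdentity {
  agentId: string;
  sessionKey: string;
}

// Encode as a JSON tuple so "a:b" + "c" can never collide with "a" + "b:c",
// which a plain delimiter-joined string would allow.
function sessionIdentityKey(id: SessionIdentity): string {
  return JSON.stringify([id.agentId, id.sessionKey]);
}

// Hypothetical row store addressed by identity rather than by file path.
class SessionRows<Row> {
  private rows = new Map<string, Row>();

  put(id: SessionIdentity, row: Row): void {
    this.rows.set(sessionIdentityKey(id), row);
  }

  get(id: SessionIdentity): Row | undefined {
    return this.rows.get(sessionIdentityKey(id));
  }
}
```

Callers in this sketch never see a `storePath`, so a channel or plugin cannot accidentally address another agent's sessions by constructing a path.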

Removed File-Era Machinery

SQLite replaces these old runtime concepts:

session file locks and lock doctor lanes
whole-store sessions.json rewrite queues
runtime JSON/JSONL import fallbacks
startup session repair and pruning side effects
transcript truncation repair and JSONL backup writes
disk-budget cleanup based on transcript file existence
transcript locators as a runtime bridge
path-shaped session APIs in gateway/channel/plugin hot paths
test-only sessions.json row-store shims that let new runtime tests keep pretending session state was a file
upstream ACPX file session stores; ACPX runtime session records now go through SQLite-backed plugin state
auth-profile file reads/writes in normal runtime paths; legacy auth files are doctor migration inputs
ad hoc plugin/channel cache JSON files where the data is runtime state

Migration Model

openclaw doctor --fix and openclaw migrate are the migration boundary.

Migration builds the global and per-agent SQLite databases from legacy inputs, verifies imported rows, records migration runs, and removes old source files only after successful import. Runtime code should not silently import, repair, prune, or rewrite legacy session/cache/auth files while handling normal gateway traffic.
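
The import, verify, record, remove ordering described above can be sketched as one step function. This is an illustration under stated assumptions, not the PR's migration code: `MigrationDeps`, `MigrationResult`, and `migrateLegacySource` are hypothetical, and the dependencies are injected so the sketch runs without a filesystem or database.

```typescript
// Hypothetical dependencies, injected so the sketch runs without real I/O.
interface MigrationDeps<Row> {
  readLegacy(source: string): Row[]; // throws if the source is unreadable
  importRows(rows: Row[]): void; // writes rows into SQLite
  countImported(source: string): number; // verification query
  recordRun(source: string, rowCount: number): void; // migration bookkeeping
  removeSource(source: string): void; // deletes the legacy file
}

type MigrationResult = { ok: true; rows: number } | { ok: false; reason: string };

function migrateLegacySource<Row>(source: string, deps: MigrationDeps<Row>): MigrationResult {
  let rows: Row[];
  try {
    rows = deps.readLegacy(source);
  } catch (err) {
    // Unreadable input: report it and leave the source in place for a retry.
    return { ok: false, reason: `read failed: ${String(err)}` };
  }
  deps.importRows(rows);
  // Verify before touching the source file.
  const imported = deps.countImported(source);
  if (imported < rows.length) {
    return { ok: false, reason: `verified ${imported} of ${rows.length} rows` };
  }
  deps.recordRun(source, rows.length);
  // Removal is the last step and happens only on the success path.
  deps.removeSource(source);
  return { ok: true, rows: rows.length };
}
```

Every early return leaves the legacy file untouched, so a failed run can be repeated after the underlying problem is fixed.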

Migration inputs include:

legacy sessions.json -> per-agent session rows
legacy transcript *.jsonl -> per-agent transcript events/snapshots
legacy auth-profiles.json, auth-state.json, auth.json, and OAuth credential files -> SQLite auth-profile state where applicable
legacy cron jobs.json, jobs-state.json, and run JSONL files -> shared cron tables
legacy task and Task Flow sidecar stores -> shared task/flow tables
legacy plugin-state sidecars and plugin JSON caches -> shared plugin state/blob tables or plugin-owned migration hooks
legacy sandbox/browser registry JSON -> shared registry tables
legacy channel/plugin caches -> plugin-owned SQLite state through setup/doctor migration hooks

Future state changes should add schema migrations and typed stores instead of adding new sidecar files.

Backup, Export, And Vacuum

Backups should be database-first archives:

include a compact snapshot of state/openclaw.sqlite
include every agents/<agentId>/agent/openclaw-agent.sqlite
include configuration, explicit credential source files, plugin manifests, and requested workspace/artifact exports
vacuum/compact database snapshots into one archive so restore has one obvious database-first path

Session export remains a support/debug/workspace-export feature, not a second canonical runtime state system.

VFS, Workers, And PI Boundary

Agent workspace state is designed for disk, hybrid, or SQLite-backed VFS operation. The per-agent database owns VFS rows, tool artifacts, run artifacts, and scoped caches so worker-prepared runs can receive a serializable storage boundary.

Worker execution stays experimental behind settings while the storage boundary settles. The intended shape is one worker per active run first; pooling can come later after lifecycle and database contention are proven.
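
A serializable storage boundary, in the sense used above, is plain data that a worker can use to open its own database handles. The sketch below is an assumption about its shape, not the PR's type: `WorkerStorageBoundary` and `assertSerializable` are hypothetical, and only the disk/hybrid/SQLite modes come from the text.

```typescript
// Hypothetical shape: plain data only, so it can cross a worker boundary.
interface WorkerStorageBoundary {
  agentId: string;
  sessionKey: string;
  globalDbPath: string;
  agentDbPath: string;
  vfsMode: "disk" | "hybrid" | "sqlite";
}

// Round-trip through JSON to confirm nothing non-serializable (open handles,
// functions) leaked into the boundary before it is posted to a worker.
function assertSerializable(boundary: WorkerStorageBoundary): WorkerStorageBoundary {
  const copy = JSON.parse(JSON.stringify(boundary)) as WorkerStorageBoundary;
  for (const key of Object.keys(boundary) as (keyof WorkerStorageBoundary)[]) {
    if (copy[key] !== boundary[key]) {
      throw new Error(`field ${key} is not serializable`);
    }
  }
  return copy;
}
```

The worker would open its own global and per-agent handles from the two paths, while streaming, cancellation, and delivery stay in the parent process as described.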

This also continues internalizing PI behind OpenClaw-owned facades. Session state, transcript storage, filesystem behavior, worker preparation, runtime accounting, and provider/runtime contracts are represented through OpenClaw-owned stores and contracts instead of letting PI define the durable layout.

Review Guide

For each former file-backed state owner, ask: where does this data live now?

Acceptable homes are:

global SQLite database
per-agent SQLite database
explicit configuration/secret file
real user workspace file
migration-only legacy input
debug/export-only output
temporary scratch selected by the agent filesystem mode

Anything else is probably leftover file-era state and should be deleted, migrated, or renamed until the boundary is clear.

Latest Validation

Current Crabbox/Blacksmith Testbox run after the latest rebase/cleanup:

https://github.com/openclaw/openclaw/actions/runs/25626062293
status at this update: in progress

Recent local/remote validation during the cleanup pass:

pnpm check:database-first-legacy-stores
pnpm db:kysely:check
pnpm lint:kysely
pnpm tsgo:core
pnpm tsgo:extensions
git diff --check
pnpm test src/channels/plugins/bundled.shape-guard.test.ts
pnpm test extensions/matrix/src/migration-config.test.ts extensions/matrix/src/matrix/sdk/idb-persistence.test.ts
focused session/transcript/gateway/auth/OAuth/doctor/model-auth/secrets/Matrix/WhatsApp/reset Vitest shards

Recent Testbox findings were stale assumptions rather than new product design issues: iMessage setup metadata missing a legacy migration discovery hint, Matrix tests leaking SQLite plugin state across cases, Matrix IndexedDB snapshot tests using async env-scoped state around SQLite, and channel/gateway fixtures that still asserted pre-SQLite file/session paths. Those are being cleaned up as part of the current pass before this PR is considered ready.

clawsweeper · 2026-05-06T19:15:41Z

Codex review: needs real behavior proof before merge.

Summary
The PR moves runtime/session/auth/plugin state from JSON, JSONL, sidecar files, and related repair paths into global and per-agent SQLite stores across gateway, CLI, macOS, SDK, docs, and tests.

Reproducibility: yes, for the blocking classes, by source inspection: legacy transcript import replaces all session rows, and several doctor importers overwrite live SQLite state without merge or recency checks. I did not run a live populated-install upgrade in this read-only review.

Real behavior proof
Needs stronger real behavior proof before merge. Older discussion includes populated-install evidence for parts of the migration, but the latest head still lacks sufficient redacted terminal, log, or live output (or screenshot/video proof) of the after-fix real upgrade path. After the PR body is updated, ClawSweeper should re-review automatically, or a maintainer can comment @clawsweeper re-review.

Next step before merge
Protected labels, security-sensitive migration findings, and missing latest-head real behavior proof make this a maintainer/security sequencing item rather than a narrow automated repair lane.

Security
Needs attention: The diff moves durable auth, pairing, approvals, and plugin state into SQLite while legacy import paths can still overwrite live authorization or runtime state.

Review findings

[P1] Preserve Node 22 compatibility or split the runtime bump — package.json:1823
[P1] Merge legacy transcript imports instead of replacing rows — src/commands/doctor/state-migrations.ts:304-309
[P1] Skip stale exec approvals when SQLite state exists — src/commands/doctor/legacy/exec-approvals.ts:44-50

Review details

Best possible solution:

Keep the database-first storage direction under maintainer/security review, make legacy imports merge-safe and source-preserving, settle the Node support decision, and require latest-head real upgrade proof before merge.

Do we have a high-confidence way to reproduce the issue?

Yes, for the blocking classes by source inspection: legacy transcript import replaces all session rows, and several doctor importers overwrite live SQLite state without merge or recency checks. I did not run a live populated-install upgrade in this read-only review.

Is this the best way to solve the issue?

No, not as-is. The architectural direction is plausible, but the current patch needs merge-safe migration semantics, a deliberate Node support decision, and latest-head real behavior proof before it is the maintainable solution.

Full review comments:

[P1] Preserve Node 22 compatibility or split the runtime bump — package.json:1823
Current main declares Node >=22.16.0, but this PR raises the package engine to Node >=24.0.0. Existing supported installs can fail before they can run the SQLite migration unless maintainers explicitly approve and sequence a release-breaking runtime bump.
Confidence: 0.88
[P1] Merge legacy transcript imports instead of replacing rows — src/commands/doctor/state-migrations.ts:304-309
Each legacy transcript file is imported through replaceSqliteSessionTranscriptEvents, which deletes all existing rows for that session before inserting that file. Real installs can have multiple legacy files for one session, so earlier files or newer SQLite rows can be lost before the source JSONL is removed.
Confidence: 0.9
[P1] Skip stale exec approvals when SQLite state exists — src/commands/doctor/legacy/exec-approvals.ts:44-50
This importer writes the legacy approvals snapshot into the live SQLite KV row and removes the file without checking for existing state. A delayed doctor run can roll back newer approval policy already written through runtime paths.
Confidence: 0.88
[P1] Preserve current device auth during legacy import — src/commands/doctor/legacy/device-auth-store.ts:39-41
The legacy device-auth importer replaces the live SQLite auth snapshot and then removes the source file. If tokens were created, rotated, or revoked after upgrade but before doctor runs, this can restore stale tokens and make the rollback durable.
Confidence: 0.86
[P1] Preserve existing plugin state on migration conflict — src/plugin-sdk/migration-runtime.ts:55-60
The migration helper resolves plugin-state conflicts by replacing the existing row with the imported value. Because plugin_state_entries is live runtime state, delayed sidecar migration can overwrite newer plugin data instead of preserving or safely merging it.
Confidence: 0.84
[P1] Merge channel pairing imports with live state — src/commands/doctor/legacy/channel-pairing.ts:231-247
The channel-pairing importer replaces request and allowlist entries from legacy files and then removes the files. If SQLite already contains newer approvals or revocations, delayed doctor migration can re-add revoked senders or drop newly approved ones.
Confidence: 0.82
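
For the transcript finding, a merge-safe import can be sketched as "keep every existing row, add only events whose ID is new". This assumes each transcript event carries a stable unique `id`; `TranscriptEvent` and `mergeLegacyTranscript` are hypothetical names, and arrays stand in for SQLite rows.

```typescript
// Assumption: every event has a stable unique id. Field names are hypothetical.
interface TranscriptEvent {
  id: string;
  parentId: string | null;
  payload: string;
}

// Merge-safe import: keep every existing row and add only legacy events
// whose id is not already present. Existing rows always win on conflict.
function mergeLegacyTranscript(
  existing: TranscriptEvent[],
  legacy: TranscriptEvent[],
): { merged: TranscriptEvent[]; added: number } {
  const seen = new Set(existing.map((e) => e.id));
  const merged = [...existing];
  let added = 0;
  for (const event of legacy) {
    if (seen.has(event.id)) continue; // never overwrite a live row
    seen.add(event.id);
    merged.push(event);
    added += 1;
  }
  return { merged, added };
}
```

Applied once per legacy file, this keeps rows from earlier files and from runtime writes. Ordering is not addressed here; a real store would presumably order by a sequence or timestamp column.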

Overall correctness: patch is incorrect
Overall confidence: 0.88

Security concerns:

[high] Legacy exec approvals can restore stale policy — src/commands/doctor/legacy/exec-approvals.ts:44
importLegacyExecApprovalsFileToSqlite writes a legacy approvals snapshot over exec.approvals/current without checking for newer SQLite state, so delayed doctor runs can reintroduce stale exec decisions.
Confidence: 0.88
[high] Legacy device auth can restore revoked tokens — src/commands/doctor/legacy/device-auth-store.ts:39
importLegacyDeviceAuthFileToSqlite replaces the SQLite device-auth snapshot and removes the source file, which can roll token creation, rotation, or revocation state backward after runtime has already updated it.
Confidence: 0.86
[high] Channel pairing import can roll back access state — src/commands/doctor/legacy/channel-pairing.ts:231
The channel pairing importer replaces request and allowlist arrays from legacy files, so delayed doctor runs can re-add revoked senders or drop newly approved ones from the live SQLite pairing store.
Confidence: 0.82
[medium] Plugin migration conflicts clobber live plugin state — src/plugin-sdk/migration-runtime.ts:55
The plugin-state migration upsert uses imported values on conflict instead of preserving existing live rows, allowing stale sidecar data to overwrite current plugin state.
Confidence: 0.84

What I checked:

Protected labels: The provided GitHub context shows this open PR carries both security and maintainer labels, so cleanup automation should not close or auto-merge it without explicit maintainer handling. (eb5d4f6b0bc6)
Current main supports Node 22: Current main still declares Node >=22.16.0, while the PR raises the package engine to Node >=24.0.0. (package.json:1804, 115049753d59)
PR raises runtime engine: At PR head, package.json requires Node >=24.0.0, which is a release/support decision rather than an incidental storage refactor detail. (package.json:1823, eb5d4f6b0bc6)
Transcript import replaces rows: importLegacyTranscriptFileToSqlite calls replaceSqliteSessionTranscriptEvents, so multiple legacy files for one session or delayed import after SQLite writes can delete prior rows for that session. (src/commands/doctor/state-migrations.ts:304, eb5d4f6b0bc6)
Exec approvals import overwrites live KV: The legacy exec approvals importer writes the file snapshot into exec.approvals/current and removes the file without checking for existing SQLite state. (src/commands/doctor/legacy/exec-approvals.ts:44, eb5d4f6b0bc6)
Device auth import overwrites live auth snapshot: The legacy device-auth importer writes the parsed legacy token store into SQLite and then removes the source file, with no merge or recency check. (src/commands/doctor/legacy/device-auth-store.ts:39, eb5d4f6b0bc6)

Likely related people:

Peter Steinberger: The available current-main history and blame for sampled session/plugin migration files point to Peter Steinberger, and the provided PR commit list shows the SQLite refactor branch commits are authored by steipete. (role: recent area contributor and PR branch author; confidence: medium; commits: 0636bbb12455, 9444b2ad9b54, eb5d4f6b0bc6; files: src/config/sessions/transcript-append.ts, src/plugin-sdk/migration-runtime.ts, src/commands/doctor/state-migrations.ts)
jalehman: The PR discussion includes a detailed hardening prompt and compatibility review for the SQLite runtime boundary, including transaction, backup, migration, and companion transcript reader concerns. (role: reviewer / adjacent storage-hardening contributor; confidence: medium; files: src/state/openclaw-state-db.ts, src/config/sessions/transcript-store.sqlite.ts, src/commands/doctor/state-migrations.ts)
100yenadmin: The PR discussion maps companion-seam and correctness follow-up work to separate OpenClaw PRs and issues, making this person useful for routing the downstream compatibility lane rather than the security blockers. (role: follow-up implementer / routing contributor; confidence: medium; files: src/gateway/session-transcript-readers.ts, src/wizard/setup.migration-import.ts, src/auto-reply/reply/export-html/template.js)

Remaining risk / open question:

Protected security and maintainer labels require explicit maintainer handling before merge or close.
Delayed migration paths can overwrite newer SQLite authorization, pairing, plugin, and transcript state with stale legacy files.
The Node 24 engine bump may be intentional, but it needs explicit release/support sequencing because current main still supports Node 22.
The PR is very large, so the sampled blockers may not cover every remaining stale file-era migration path.
Latest-head real upgrade proof is still missing or insufficient for a branch that migrates durable runtime state.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 115049753d59.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c536838794

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-06T19:18:37Z

+  if (!isSqliteSessionStoreBackendEnabled(env)) {
+    return null;
+  }
+  const agentId = resolveAgentIdFromSessionStorePath(storePath) ?? DEFAULT_AGENT_ID;


Derive SQLite agent scope from configured store path

When session.store uses a non-canonical path (for example a custom template not shaped like .../agents/<id>/sessions/sessions.json), this code silently falls back to DEFAULT_AGENT_ID. After this commit moved runtime session reads/writes to SQLite, that fallback causes all such stores to be loaded/saved under the main SQLite partition, so non-main agent data can be mixed into the wrong store scope and agent-specific queries can return incorrect sessions.
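
One way to avoid the silent fallback is to make "could not derive an agent" an explicit result that the caller must handle. This is a sketch, not the PR's resolver: `AgentScope`, `resolveAgentScope`, and the `configuredAgentId` parameter are hypothetical, and the pattern assumes POSIX-style separators.

```typescript
type AgentScope =
  | { kind: "resolved"; agentId: string }
  | { kind: "unresolved"; storePath: string };

// Canonical shape from the comment above: .../agents/<id>/sessions/sessions.json
const CANONICAL_STORE_PATH = /(?:^|\/)agents\/([^/]+)\/sessions\/sessions\.json$/;

// Derive the agent from the path when it is canonical, otherwise from an
// explicitly configured agent id. Never fall back to a default silently.
function resolveAgentScope(storePath: string, configuredAgentId?: string): AgentScope {
  const fromPath = CANONICAL_STORE_PATH.exec(storePath)?.[1];
  if (fromPath) return { kind: "resolved", agentId: fromPath };
  if (configuredAgentId) return { kind: "resolved", agentId: configuredAgentId };
  return { kind: "unresolved", storePath };
}
```

A caller that receives `unresolved` can log or fail loudly instead of writing another agent's sessions into the default partition.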


chatgpt-codex-connector · 2026-05-06T19:18:37Z

  const sessionsTargetDir = path.join(stateDir, "agents", targetAgentId, "sessions");
  const sessionsTargetStorePath = path.join(sessionsTargetDir, "sessions.json");
-  const legacySessionEntries = safeReadDir(sessionsLegacyDir);
+  const hasTargetJsonSessionStore = includeSessions && fileExists(sessionsTargetStorePath);


Detect legacy session JSON using session.store override

Doctor legacy migration detection is hardcoded to $STATE_DIR/agents/<defaultAgent>/sessions/sessions.json, so installs that previously persisted session metadata at a custom session.store path are skipped during openclaw doctor --fix. Because this commit removed runtime JSON session-store fallback, those users can lose visible session history after upgrading until they manually import from the custom path.


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c567352a42


chatgpt-codex-connector · 2026-05-06T21:27:11Z

+async function withTranscriptAppendQueue<T>(queueKey: string, fn: () => Promise<T>): Promise<T> {
+  const previous = transcriptAppendQueues.get(queueKey) ?? Promise.resolve();
+  let releaseCurrent!: () => void;
+  const current = new Promise<void>((resolve) => {
+    releaseCurrent = resolve;
+  });
+  const tail = previous.catch(() => undefined).then(() => current);
+  transcriptAppendQueues.set(queueKey, tail);
+  await previous.catch(() => undefined);


Restore cross-process locking for transcript appends

appendSessionTranscriptMessage now serializes writes only through the in-memory transcriptAppendQueues map, so concurrent writers from different Node processes are no longer coordinated. When a gateway process and another process (e.g., CLI/worker) append to the same agentId/sessionId at the same time, both can read the same pre-append event list and then insert independently, which can duplicate idempotent messages and produce conflicting parentId chains in SQLite transcripts.
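
One possible alternative to a file lock is to move the duplicate check and the parent lookup into the same write step as the insert, so a writer that read stale state cannot insert a duplicate. The sketch below models that idea in memory and is not the PR's code: `AppendRequest`, `StoredEvent`, and `TranscriptTable` are hypothetical, and in SQLite the equivalent would presumably be a unique constraint plus a single write transaction.

```typescript
interface AppendRequest {
  sessionId: string;
  idempotencyKey: string;
  payload: string;
}

interface StoredEvent extends AppendRequest {
  seq: number;
  parentSeq: number | null;
}

// Stand-in for a table with a unique (sessionId, idempotencyKey) constraint.
// In SQLite the check and the insert would run inside one write transaction.
class TranscriptTable {
  private events: StoredEvent[] = [];

  appendIfAbsent(req: AppendRequest): { inserted: boolean; event: StoredEvent } {
    const duplicate = this.events.find(
      (e) => e.sessionId === req.sessionId && e.idempotencyKey === req.idempotencyKey,
    );
    if (duplicate) return { inserted: false, event: duplicate };
    // Resolve the parent at write time, not from the caller's earlier read.
    const sessionEvents = this.events.filter((e) => e.sessionId === req.sessionId);
    const parent = sessionEvents[sessionEvents.length - 1];
    const event: StoredEvent = {
      ...req,
      seq: this.events.length + 1,
      parentSeq: parent ? parent.seq : null,
    };
    this.events.push(event);
    return { inserted: true, event };
  }
}
```

Because the parent is chosen inside the store, two writers appending concurrently end up with one linear chain rather than two events that both point at the same stale parent.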


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb5d4f6b0b


chatgpt-codex-connector · 2026-05-10T05:19:13Z

+  writePairingStateRecord({
+    baseDir,
+    subdir: "devices",
+    key: "bootstrap",
+    value: parsed as DeviceBootstrapState,


Skip stale device-bootstrap import when SQLite state exists

This migration always writes the legacy devices/bootstrap.json snapshot into pairing.devices/bootstrap and then removes the source file. Bootstrap tokens are live runtime state (issued/revoked/cleared via persistState in src/infra/device-bootstrap.ts), so a delayed openclaw doctor --fix can overwrite newer token/profile data with stale file-era content and make that rollback durable after deletion.
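
The "skip when SQLite state exists" rule can be sketched as a guard in front of the write. A `Map` stands in for the SQLite KV table; `ImportOutcome` and `importLegacySnapshot` are hypothetical names, and the guard assumes that any existing row is at least as new as the legacy file.

```typescript
type ImportOutcome = "imported" | "skipped-live-state-exists";

// A Map stands in for a SQLite KV table. Assumption: if a row already exists,
// runtime has written it since the upgrade, so the legacy snapshot is stale.
function importLegacySnapshot<T>(kv: Map<string, T>, key: string, legacy: T): ImportOutcome {
  if (kv.has(key)) return "skipped-live-state-exists";
  kv.set(key, legacy);
  return "imported";
}
```

On a skip, the caller in this sketch could archive the legacy file rather than delete it, so no token state is lost in either direction.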


chatgpt-codex-connector · 2026-05-10T05:19:13Z

+  writeSubagentRegistryRunsSnapshot(runs, env);
+  try {
+    fs.unlinkSync(pathname);


Avoid overwriting live subagent runs from legacy snapshot

The importer unconditionally writes legacy subagents/runs.json records into SQLite and then unlinks the file. Subagent rows are updated during normal lifecycle persistence (persistSubagentRunsToState/saveSubagentRegistryToState), so if doctor runs later, matching run_id rows can be rolled back to stale metadata (cleanup/outcome/announce fields) and the rollback becomes permanent once the legacy file is deleted.
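
Where rows carry a modification time, a recency check is one way to import legacy runs without rolling live rows back. This sketch assumes an `updatedAt` field that the actual schema may not have; `SubagentRun` and `mergeRunsPreferNewest` are hypothetical names.

```typescript
// Assumption: each run row carries an updatedAt timestamp. Field names are hypothetical.
interface SubagentRun {
  runId: string;
  updatedAt: number;
  outcome: string | null;
}

// Merge legacy runs into live runs, keeping whichever row is newer per runId.
function mergeRunsPreferNewest(live: SubagentRun[], legacy: SubagentRun[]): SubagentRun[] {
  const byId = new Map<string, SubagentRun>();
  for (const run of live) byId.set(run.runId, run);
  for (const run of legacy) {
    const current = byId.get(run.runId);
    // Ties go to the live row so a delayed import never rolls state back.
    if (!current || run.updatedAt > current.updatedAt) byId.set(run.runId, run);
  }
  return Array.from(byId.values());
}
```

If the schema has no reliable timestamp, the simpler rule of never overwriting an existing `run_id` gives the same safety with less precision.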


chatgpt-codex-connector · 2026-05-10T05:19:13Z

+    if (record && (await importLegacyManagedImageRecord(record, stateDir))) {
+      records += 1;
+    }
+    await fs.rm(recordPath, { force: true }).catch(() => {});


Keep managed-image record files when a row fails import

Each legacy managed-image record file is removed even when parsing fails or importLegacyManagedImageRecord returns false (for example, malformed JSON or missing source image). That permanently discards skipped record metadata, so operators cannot repair the bad rows and rerun migration to recover those attachments.


chatgpt-codex-connector · 2026-05-10T05:19:13Z

+    .map((entry) => entry.name);
+  let imported = 0;
+  for (const fileName of files) {
+    const raw = await runsRoot.readText(fileName).catch(() => "");


Do not delete cron log files after a read failure

When reading a legacy cron run-log file fails, the code coerces content to "", imports zero entries, and still deletes the file. In transient I/O/permission error scenarios this silently loses the only recoverable log data, preventing a retry after fixing the underlying read issue.
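
The fix suggested here amounts to keeping "read failed" distinct from "file was empty". Below is a synchronous sketch of that loop with injected dependencies; the real importer is async, and `ReadResult`, `CronLogDeps`, and `migrateCronLogs` are hypothetical names.

```typescript
// A read either succeeds with text (possibly empty) or fails with an error.
type ReadResult = { ok: true; text: string } | { ok: false; error: string };

// Hypothetical dependencies, injected so the sketch runs without real I/O.
interface CronLogDeps {
  read(fileName: string): ReadResult;
  importEntries(lines: string[]): number; // returns the number of rows imported
  remove(fileName: string): void;
}

function migrateCronLogs(
  files: string[],
  deps: CronLogDeps,
): { imported: number; kept: string[] } {
  let imported = 0;
  const kept: string[] = [];
  for (const fileName of files) {
    const result = deps.read(fileName);
    if (!result.ok) {
      // Read failure is not the same as an empty file: keep it for a retry.
      kept.push(fileName);
      continue;
    }
    const lines = result.text.split("\n").filter((line) => line.trim() !== "");
    imported += deps.importEntries(lines);
    deps.remove(fileName); // only reached after a successful read and import
  }
  return { imported, kept };
}
```

Returning the `kept` list lets doctor report which files need attention instead of silently finishing with zero imported entries.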


100yenadmin · 2026-05-10T05:54:12Z

@steipete @jalehman One maintainer-readable, agent-runnable map for the companion-fit work around #78595.

Architecture decision

The right long-term shape is:

OpenClaw SQLite remains the canonical operational runtime store
Lossless Claw / LCM remains a separate downstream companion DB
OpenClaw exposes a few small generic read/discovery/projection seams
LCM adapts to those seams instead of pushing its own summary/context/GC tables into OpenClaw core

That keeps the boundary clean:

OpenClaw owns runtime truth, session continuity policy, and transcript replay semantics
LCM owns richer downstream identity, message_parts, summaries, recall, and maintenance behavior

Why this is better for OpenClaw, not just for LCM

future first-party memory/search/export/audit features get stable read seams instead of private one-off glue
reset/rotation/multi-entity continuity stays core-owned and consistent
advanced consumers do not have to scan blobs or reimplement private active-branch logic
the database-first runtime stays canonical instead of pulling JSONL/file identity back into core

Current stack map

flowchart LR
    A["#78595 database-first runtime"] --> B["#79971 runtime-truth correctness"]
    A --> C["#79970 selection seam"]
    A --> D["#79972 replay seam"]
    B --> E["#79905 typed projection seam"]
    C --> E
    D --> E
    D --> F["LCM #646 SQLite frontier"]
    C --> G["LCM #642 identity mapping"]
    D --> H["LCM #643 message_parts reconstruction"]
    H --> I["LCM #644 rotate/checkpoint/GC remap"]

What is already implemented and why it exists

1. `#79971` — runtime-truth correctness follow-up

PR: #79971

Why it exists:

OpenClaw runtime truth needed to be correct before any companion layering mattered
stale JSONL-era assumptions were still leaking into setup, doctor, export, and lightweight transcript readers

What it does:

SQLite-aware onboarding freshness
SQLite-as-truth doctor behavior
canonical session header export
safer rotated checkpoint cleanup
active-branch-safe recent reader behavior

Direct review follow-up/proof:

PR comment

2. `#79970` — canonical session selection seam

PR: #79970

Why it exists:

sessionId selection policy and ambiguity handling should not be reimplemented by every caller
OpenClaw should own canonical selection semantics without absorbing LCM’s richer lineage storage model

What it does:

exposes SessionIdMatchSelection
adds richer run-key selection
preserves active run-context semantics

Direct review follow-up/proof:

PR comment

3. `#79972` — canonical replay seam

PR: #79972

Why it exists:

file offsets / mtime are not a durable replay contract in a database-first runtime
consumers need a typed frontier/delta seam over canonical SQLite transcript ownership

What it does:

exposes transcript frontier/delta helpers
keeps the seam additive
restores compatibility exports on the public subpath
keeps raw DB handles off the public seam
forces reset on same-millisecond rewrites via a monotonic write floor
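The frontier/delta idea and the same-millisecond reset can be sketched with an in-memory stand-in. This is an assumption-laden illustration, not the code in #79972: `InMemoryTranscript`, `Frontier`, `Delta`, and `rewriteStamp` are hypothetical names, and the "monotonic write floor" is modeled here as `max(now, lastStamp + 1)`, which is one plausible reading of the phrase.

```typescript
// Hypothetical in-memory model; the real seam sits over SQLite transcript rows.
type TranscriptEvent = { seq: number; text: string };
type Frontier = { rewriteStamp: number; lastSeq: number };
type Delta = { kind: "append" | "reset"; events: TranscriptEvent[]; frontier: Frontier };

class InMemoryTranscript {
  private events: TranscriptEvent[] = [];
  private lastSeq = 0;
  private rewriteStamp = 0;
  private lastStamp = 0;

  // The clock is injectable so a frozen clock can simulate same-millisecond writes.
  constructor(private readonly now: () => number = () => Date.now()) {}

  // Monotonic write floor: each stamp is strictly greater than the previous one,
  // even when the wall clock has not advanced.
  private nextStamp(): number {
    this.lastStamp = Math.max(this.now(), this.lastStamp + 1);
    return this.lastStamp;
  }

  append(text: string): void {
    this.lastSeq += 1;
    this.events.push({ seq: this.lastSeq, text });
  }

  // A rewrite replaces history, so it must invalidate every earlier frontier.
  rewrite(texts: readonly string[]): void {
    this.events = texts.map((text, i) => ({ seq: i + 1, text }));
    this.lastSeq = this.events.length;
    this.rewriteStamp = this.nextStamp();
  }

  frontier(): Frontier {
    return { rewriteStamp: this.rewriteStamp, lastSeq: this.lastSeq };
  }

  // Returns only new events when history is unchanged, otherwise a full reset.
  deltaSince(prev: Frontier | undefined): Delta {
    const frontier = this.frontier();
    if (prev === undefined || prev.rewriteStamp !== this.rewriteStamp) {
      return { kind: "reset", events: [...this.events], frontier };
    }
    const events = this.events.filter((event) => event.seq > prev.lastSeq);
    return { kind: "append", events, frontier };
  }
}
```

A consumer stores the returned `frontier` and passes it back on the next call. Without the floor, two rewrites inside the same millisecond would carry the same stamp and the second one would look like "no change" to a resuming consumer.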

Direct review follow-up/proof:

PR comment

4. `lossless-claw#646` — first downstream proof consumer

PR: Martian-Engineering/lossless-claw#646

Why it exists:

it proves the architecture is executable, not just theoretical
LCM can now bootstrap/resume from a SQLite frontier while keeping JSONL fallback for older hosts

What remains

Remaining upstream OpenClaw seam

#79905 — typed transcript projections/helpers plus rebuild contract

Why this is still needed:

LCM should not have to duplicate active-branch traversal and raw event archaeology forever
OpenClaw already has the right internal raw materials, but the reusable typed seam is still missing

Remaining downstream LCM work

#642 — family + segment + runtime-session identity mapping
#643 — message_parts and tool/result reconstruction
#644 — rotate/checkpoint/GC remap

Confidence read

After the latest implementation + re-verification loop:

the live review findings on #79970, #79971, and #79972 are addressed and pushed
each PR body now includes direct runtime proof, not only unit tests
I’m above 95% confident that no additional hidden upstream OpenClaw core/schema parity gap remains beyond #79905

That is not the same as saying the whole stack is merged or review-clean yet.
It means the remaining work now looks like known work, not hidden architecture debt.

Agent-ready routing

If Peter or another agent is going to keep moving this refactor, this is the clean routing:

Keep separate

correctness/security review lane on this PR:
- agent-ready correctness comment on #78595
OpenClaw seam lane:
LCM lane:

Recommended order

Keep the direct correctness/security review lane on #78595 separate.
Carry forward and merge-harden #79971.
Carry forward and merge-harden #79970.
Carry forward and merge-harden #79972.
Implement #79905.
Continue downstream LCM work in order: #642 -> #643 -> #644.
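As a quick sanity check, the recommended order can be compared against the dependency edges in the stack map. The sketch below copies the edges and the order from this comment; the helper name `orderViolations` is hypothetical, and #78595 is placed first as the base of the stack.

```typescript
// Edges from the stack map above: [prerequisite, dependent].
const stackEdges: ReadonlyArray<readonly [string, string]> = [
  ["#78595", "#79971"],
  ["#78595", "#79970"],
  ["#78595", "#79972"],
  ["#79971", "#79905"],
  ["#79970", "#79905"],
  ["#79972", "#79905"],
  ["#79972", "LCM #646"],
  ["#79970", "LCM #642"],
  ["#79972", "LCM #643"],
  ["LCM #643", "LCM #644"],
];

// Recommended order from this comment. LCM #646 is already implemented,
// so it does not appear in the remaining order.
const recommendedOrder = [
  "#78595", "#79971", "#79970", "#79972", "#79905",
  "LCM #642", "LCM #643", "LCM #644",
];

// Reports every edge whose prerequisite does not come before its dependent.
function orderViolations(
  order: readonly string[],
  edges: ReadonlyArray<readonly [string, string]>,
): string[] {
  const position = new Map(order.map((id, index) => [id, index] as const));
  const violations: string[] = [];
  for (const [before, after] of edges) {
    const b = position.get(before);
    const a = position.get(after);
    if (b === undefined || a === undefined) continue; // item not in this ordering
    if (b >= a) violations.push(`${before} must precede ${after}`);
  }
  return violations;
}

console.log(orderViolations(recommendedOrder, stackEdges));
```

An empty result means the order respects every edge that involves two listed items.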

What not to do

do not pull JSONL/file-locator identity back into runtime APIs
do not absorb LCM summary/context/GC tables into OpenClaw core
do not blend the #78595 correctness/security lane into the seam-feature lane

If useful, I can turn this same map into one shorter “send this to Peter” checklist comment next, but this is the full maintainer/agent handoff version.

100yenadmin · 2026-05-10T06:34:07Z

Final implementation map for the companion-fit / Lossless Claw migration stack.

Architecture decision remains:

OpenClaw SQLite stays the canonical operational runtime store
Lossless Claw stays a downstream companion DB that ingests and projects from OpenClaw's canonical state

Implemented OpenClaw slices:

openclaw/openclaw#79971 — correctness fixes on the refactor lane
openclaw/openclaw#79970 — lineage / discovery seam
openclaw/openclaw#79972 — frontier / delta seam
100yenadmin/openclaw#1 — typed projection / rebuild seam stacked cleanly on the frontier seam

Implemented LCM slices:

Martian-Engineering/lossless-claw#646 — SQLite frontier
100yenadmin/lossless-claw#3 — family / segment / runtime-session identity
100yenadmin/lossless-claw#1 — bootstrap message_parts fidelity
100yenadmin/lossless-claw#2 — canonical SQLite maintenance scope and compaction-successor continuity

Why some of the later implementation PRs live in the forks:

GitHub allows cross-repo PRs into the upstream repos when the base is main, but the create API rejected stacked baseRefNames that exist only in the fork.
Publishing these as clean stacked fork PRs preserves the exact diff each agent should work from.
That is better than opening inflated upstream PRs that silently re-include predecessor slices.

Recommended execution/review order:

#79971
#79970
#79972
100yenadmin/openclaw#1
lossless-claw#646
100yenadmin/lossless-claw#3
100yenadmin/lossless-claw#1
100yenadmin/lossless-claw#2

For Peter / agent handoff:

use the PR bodies first, because they now contain the why, boundaries, diagrams, validation commands, and stack placement
use the upstream umbrella issues and #78595 for routing
do not collapse OpenClaw into an LCM-shaped core schema
do not treat LCM as the raw transcript owner once SQLite is canonical

steipete requested a review from a team as a code owner May 6, 2026 19:12

chatgpt-codex-connector Bot reviewed May 6, 2026


openclaw-barnacle Bot added the extensions: memory-core label May 6, 2026

chatgpt-codex-connector Bot reviewed May 6, 2026


steipete added 24 commits May 10, 2026 06:05

refactor: stop caching cli image files

7ea66fc

refactor: keep diagnostics in state sqlite

e731d04

refactor: rename transcript entries away from files

731040d

refactor: stop exporting session jsonl from html viewer

57ba49c

test: ban session jsonl export bridges

1779856

refactor: keep matrix legacy mutations out of runtime barrel

2ba3fc2

refactor: remove live backup volatile filter

a92a548

refactor: move legacy session aliases to doctor

6d5c2b5

refactor: route pi imports through contracts

5e0a4fe

refactor: rename model catalog write queue

49cb124

refactor: store command logger entries in sqlite

6b87e9b

refactor: remove pairing allowlist file paths

283bd7b

refactor: store macos exec approvals in sqlite

cc6388a

refactor: store macos port guardian state in sqlite

af368c4

test: remove exec approval file fixtures

afef7dc

refactor: rename exec approval document resolver

3605f9d

test: stop naming exec approvals json as active state

5368bb0

refactor: require node 24 for sqlite state runtime

4abf1af

docs: track node 24 sqlite runtime contract

84da79d

refactor: move provider caches into sqlite state

8a4023b

test: ban restored json credential stores

e269d19

fix: keep legacy device identity import in doctor

ff33111

test: ban retired auth and model json stores

ae1d3d9

refactor: drop legacy auth json secret audit

eb5d4f6

chatgpt-codex-connector Bot reviewed May 10, 2026


This was referenced May 10, 2026

feat: model LCM family and segment continuity 100yenadmin/lossless-claw#3

Open

feat: add typed SQLite transcript projections and rebuild helpers 100yenadmin/openclaw#1

Open

100yenadmin mentioned this pull request May 10, 2026

Next version: map Lossless Claw to OpenClaw database-first SQLite runtime Martian-Engineering/lossless-claw#641

Open
