fix: gate startup backfills with migration state#380
Conversation
Add versioned startup backfill state so the expensive summary and tool-call repairs only run once per algorithm version. Keep retry safety by wrapping each versioned backfill and its completion marker in a savepoint so a failed startup rolls back partial backfill writes and reruns cleanly on the next launch. Regeneration-Prompt: | Implement the startup backfill gating change in lossless-claw without using PRAGMA user_version or column-existence guesses as the completion signal. Add an additive SQLite table keyed by backfill step name and algorithm version, and only skip a backfill after that exact version completes. Preserve partial-upgrade safety by making the backfill work and state write succeed or roll back together, then cover first-run state creation, repeat startup skipping, and retry-after-failure behavior in migration tests. This runtime change affects package behavior, so include a patch changeset.
|
Confirming this fixes a real-world crash loop documented in #383. Environment: lossless-claw v0.8.0, OpenClaw 2026.4.9, macOS arm64, ~22K messages / 510 summaries / 176MB database. Symptoms: Every prompt triggered a 1–2 minute hang followed by Validation: A simpler version of this fix (early-return guards checking Looking forward to this shipping so I can drop the local patch. 🦞 |
What
This PR adds additive startup backfill state to the SQLite migration path so the expensive summary and tool-call backfills stop rerunning after a successful startup, while still retrying cleanly if startup fails before the backfill version is marked complete.
Why
Lossless-claw was rerunning heavy startup backfills on every launch. That added avoidable startup work and still lacked an explicit completion signal for partial-upgrade recovery.
Changes
lcm_migration_statefor versioned backfill completionTesting
npm test -- test/migration.test.tstest/migration.test.tsto pass