Skip to content

[migration] DB transaction hardening: withTransaction reentrancy + split bulk migration (C/8)#579

Open
100yenadmin wants to merge 3 commits into
Martian-Engineering:mainfrom
electricsheephq:pr9/migration-hardening-withtransaction
Open

[migration] DB transaction hardening: withTransaction reentrancy + split bulk migration (C/8)#579
100yenadmin wants to merge 3 commits into
Martian-Engineering:mainfrom
electricsheephq:pr9/migration-hardening-withtransaction

Conversation

@100yenadmin

Copy link
Copy Markdown
Collaborator

Summary

Closes #569. Two related DB-correctness fixes for the v0.9.3 stability sweep.

D2 — withTransaction reentrancy regression test (#474)

Issue #474 reports SqliteError: cannot start a transaction within a transaction from ConversationStore.withTransaction() 4-6×/hour under multi-bridge load. withTransaction already routes through the per-connection async mutex from src/transaction-mutex.ts (added in #261), but #474 shows the regression surface needs explicit coverage.

This PR adds a setImmediate-interleaved stress test that ticks the event loop between every step inside the transaction body, forcing aggressive concurrency. Under the current wiring all transactions complete cleanly — without the mutex the test would reproduce the SqliteError reliably.

D3 — split bulk migration exec (#569 + #482 root cause)

PR #482 added ensureMessagePartsTable as belt-and-suspenders against Node db.exec()'s silent-abort failure mode. The same exposure existed for every other table in the same bulk block (summaries, summary_messages, summary_parents, large_files, conversation_*, lcm_migration_state, plus their indexes).

This PR (option 2 from the audit) splits the bulk db.exec() into per-statement calls held in a LCM_INITIAL_SCHEMA_STATEMENTS array, iterated under the existing BEGIN EXCLUSIVE wrapper from #455. Each per-table SQL error throws on its own — silent abort is no longer possible.

A regression test simulates a SQL failure on summary_messages and asserts runLcmMigrations now throws. A second test verifies every table from the previous bulk block is still created independently.

Files changed

  • src/db/migration.ts — extracted bulk db.exec() block (lines ~815-1003) into LCM_INITIAL_SCHEMA_STATEMENTS array (~17 statements), iterated per-statement inside the existing BEGIN EXCLUSIVE wrapper.
  • test/migration.test.ts — updated existing message_parts belt-and-suspenders test to match per-statement interception, plus two new tests for split-bulk behavior.
  • test/transaction-mutex.test.ts — added setImmediate-interleaved regression test for withTransaction() reentrant callers throw "cannot start a transaction within a transaction" #474.
  • .changeset/pr9-migration-hardening.md — changeset (patch).

Test plan

Refs #474, #455, #261, #482.

…ulk migration

Two DB-correctness fixes:

* withTransaction reentrancy (Martian-Engineering#474): the async function awaited
  operation() between BEGIN IMMEDIATE and COMMIT, letting concurrent
  callers race into a mid-transaction BEGIN.  ConversationStore.withTransaction
  already routes through the per-connection async mutex from
  src/transaction-mutex.ts (PR Martian-Engineering#261), but Martian-Engineering#474 shows the regression
  surface needs explicit coverage.  Add a setImmediate-interleaved
  stress test that ticks the event loop between every step inside the
  transaction body so concurrent callers race aggressively — under the
  current wiring all transactions complete cleanly; without it the
  test reproduces the SqliteError from the issue.

* runLcmMigrations bulk-exec split (Martian-Engineering#569): PR Martian-Engineering#482 added
  ensureMessagePartsTable as belt-and-suspenders against Node
  db.exec()'s silent-abort failure mode.  The same exposure existed
  for every other table in the same bulk block (summaries,
  summary_messages, summary_links, lcm_migration_state, plus their
  indexes).  Split the bulk exec into per-statement calls (still
  inside the BEGIN EXCLUSIVE wrapper from Martian-Engineering#455) so each per-table SQL
  error throws on its own instead of being swallowed.  Add a test
  that simulates a SQL failure on summary_messages and verifies
  runLcmMigrations now throws.

Closes Martian-Engineering#569.  Refs Martian-Engineering#474, Martian-Engineering#455, Martian-Engineering#261, Martian-Engineering#482.
Copilot AI review requested due to automatic review settings May 3, 2026 14:24

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Hardens SQLite transaction/migration correctness by adding regression coverage for withTransaction reentrancy under aggressive async interleaving, and by refactoring initial schema creation to avoid multi-statement db.exec() silent-abort behavior during migrations.

Changes:

  • Add a setImmediate-interleaved stress test to guard against withTransaction() reentrancy errors under concurrent async load (#474).
  • Split the initial schema bulk db.exec() into per-statement db.exec() calls via LCM_INITIAL_SCHEMA_STATEMENTS to ensure failures surface immediately (#569).
  • Expand/adjust migration tests to validate split-bulk failure behavior and independent table creation.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
src/db/migration.ts Introduces LCM_INITIAL_SCHEMA_STATEMENTS and iterates per statement inside the existing BEGIN EXCLUSIVE migration wrapper.
test/migration.test.ts Updates the message_parts guard test for per-statement exec and adds split-bulk regression tests.
test/transaction-mutex.test.ts Adds an event-loop-interleaving stress test to detect nested-transaction errors under concurrency.
.changeset/pr9-migration-hardening.md Adds a patch changeset documenting the two hardening changes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/db/migration.ts Outdated
Comment thread src/db/migration.ts Outdated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

`messages.UNIQUE(conversation_id, seq)` and
`context_items.PRIMARY KEY (conversation_id, ordinal)` already create
implicit indexes on those exact columns; the explicit
`messages_conv_seq_idx` and `context_items_conv_idx` were pure
write/maintenance overhead.

Removed from `LCM_INITIAL_SCHEMA_STATEMENTS` so fresh installs no
longer create them, and added `DROP INDEX IF EXISTS` cleanup steps so
existing databases also shed the redundancy on next migration.

Addresses code review on Martian-Engineering#579.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/db/migration.ts
Comment thread test/migration.test.ts Outdated
@100yenadmin 100yenadmin changed the title [migration] DB transaction hardening: withTransaction reentrancy + split bulk migration [migration] DB transaction hardening: withTransaction reentrancy + split bulk migration (C/8) May 3, 2026
…c failure mode

* src/db/migration.ts: `ensureMessagePartsTable` JSDoc now explains
  the post-Martian-Engineering#569 split-bulk approach exec's statements individually, so
  the legacy silent-abort failure mode no longer applies on new
  installs.  The guard remains as defense-in-depth for legacy DBs
  upgraded across the silent-abort window.  Inline comment at the
  registration site rewritten to match.
* test/migration.test.ts: drop the `PR #9` reference (an internal
  agent-generated PR-numbering artifact); reword to PR-agnostic
  phrasing and reference the actual issue/PR numbers (Martian-Engineering#569, Martian-Engineering#482)
  for the migration change.

No code change.  Addresses review on Martian-Engineering#579.
@100yenadmin 100yenadmin added the linked-pr Has an identified PR or merge candidate label May 31, 2026
@100yenadmin

Copy link
Copy Markdown
Collaborator Author

Maintainer review: needs rebase before deeper review.

Summary
@100yenadmin, this PR is still potentially useful, but the current head a372ac0e4ca526232054e8bf31b2ebf61514f225 is CONFLICTING / DIRTY against main, so it cannot be reviewed for merge-readiness yet.

Next step
Please rebase or replace this branch against current main, then refresh CI. After it is clean, we can do the actual bug/priority review and decide whether it should stay in the P1/P2 lane.

Evidence checked
  • PR title: [migration] DB transaction hardening: withTransaction reentrancy + split bulk migration (C/8)
  • Live state: open, mergeable=CONFLICTING, mergeStateStatus=DIRTY.
  • Current labels: bug, priority:P2, linked-pr
  • No existing maintainer-mode rebase marker was present before this comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working linked-pr Has an identified PR or merge candidate priority:P2 Important bug, compatibility, performance, or reliability issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[migration] DB transaction hardening: withTransaction reentrancy + ensureXTable generalization

2 participants