fix(state): make fork migration parent links deterministic#2476
Conversation
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
| assert_eq!(messages[0].parent_entry_id, None); | ||
| assert_eq!(messages[1].parent_entry_id, Some(messages[0].id)); | ||
| assert_eq!(messages[2].parent_entry_id, Some(messages[1].id)); | ||
| assert_eq!(messages[3].parent_entry_id, Some(messages[2].id)); |
There was a problem hiding this comment.
Missing idempotency re-open check
The original init_schema_migration test ends with a second StateStore::open call to verify the migration is safe to run on an already-migrated database. The new same-second test omits this step, so a regression where re-opening a migrated store causes a double-ALTER TABLE or index conflict would not be caught here. Adding StateStore::open(Some(path.clone())).expect("open state store - idempotent"); after the assertions would close this gap.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Harvested into I cherry-picked the state migration patch and added one local follow-up test commit covering same-second migration idempotency on reopen, matching the review concern. I’m leaving this PR open until the triage branch itself is reviewed/merged so GitHub doesn’t lose the contributor thread. |
|
Thanks @cyq1017 — the fork parent-link migration fix is patch-equivalent to work already present on The integration map records this as harvested/present, with your contribution credited. Closing this PR as superseded by the already-landed patch-equivalent implementation. I am not closing issue #2082 here; that should only close after confirming the remaining |
Problem
#2082's main fork support is already in place, but the follow-up review noted two remaining blockers: legacy migration can miss parent links when messages share the same second-level
created_at, andtest_forkstill has a straydbg!(&thread).Change
parent_entry_idusing(created_at, id)ordering, so same-second legacy messages keep deterministic parent links.test_fork.Fixes #2082.
Verification
cargo fmt --all -- --checkcargo test -p codewhale-state --test parity_state --locked -- --nocapturecargo check -p codewhale-state --lockedgit diff --check HEAD~1..HEADGreptile Summary
This PR fixes a determinism bug in the legacy schema migration: when multiple messages share the same
created_attimestamp, the previousORDER BY m2.id DESCwithout a matchingWHEREfilter could assign any preceding message as the parent rather than the immediately preceding one. The fix adds a compound(created_at, id)ordering to both the filter predicate and theORDER BY, ensuring the parent link is always the closest lower-id message in the same second.crates/state/src/lib.rs: Updates the one-time migration UPDATE forparent_entry_idto use a composite(created_at ASC, id ASC)tiebreaker, fixing non-deterministic parent assignment for same-second messages.crates/state/tests/parity_state.rs: Addsinit_schema_migration_same_second_messagesto regression-test the four-message same-second scenario, and removes the straydbg!(&thread)fromtest_fork.Confidence Score: 4/5
Safe to merge; the migration SQL fix is logically correct and the new regression test validates the happy path.
The composite (created_at, id) predicate and ORDER BY correctly handle same-second messages, and the recursive list_messages CTE will walk the resulting parent chain in the right order. The only gap is that the new test skips the idempotency re-open check that the existing migration test performs, so a double-ALTER regression on an already-migrated database would not be caught by this test alone.
crates/state/tests/parity_state.rs — the new test would benefit from an idempotency re-open check mirroring the one in init_schema_migration.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[StateStore::open - user_version = 0] --> B[Run migration SQL] B --> C[ALTER TABLE messages\nADD COLUMN parent_entry_id] C --> D{For each message M\nin thread} D --> E["Find predecessor:\nm2.created_at < M.created_at\nOR\n(m2.created_at = M.created_at AND m2.id < M.id)"] E --> F["ORDER BY m2.created_at DESC, m2.id DESC\nLIMIT 1"] F --> G[SET parent_entry_id = m2.id or NULL if no predecessor] G --> D D --> H[CREATE INDEX idx_messages_parent_entry_id] H --> I[UPDATE threads SET current_leaf_id = highest message id per thread] I --> J[PRAGMA user_version = 1] J --> K[Migration complete]Reviews (1): Last reviewed commit: "fix(state): stabilize fork migration par..." | Re-trigger Greptile