fix(dolt): distinguish fsck open-failures from integrity failures #3465
Merged
maphew merged 1 commit on Apr 25, 2026
prePushFSCK previously wrapped any dolt fsck error as ErrDanglingReference with 'aborting push to prevent propagating corrupt chunks', including cases where fsck could not even open the database (environmental / tooling issues, not integrity problems). This misled users into thinking their healthy databases were corrupt.

Concrete example: dolthub/dolt#10915 (Windows url.Parse bug, pre-v1.86.4) caused fsck to construct a malformed file path and fail to open; users hitting this saw the misleading 'dangling chunk reference' error from bd.

Now detect the two known 'couldn't open' signatures from dolt and log a warning instead of aborting. Real integrity failures (dangling chunks in an openable db) still abort as before.

Fixes gastownhall#3464
c839ed5 to 8954939
maphew (Collaborator) approved these changes on Apr 25, 2026
Reviewed the fsck classification path and ran local targeted validation: `go test -tags gms_pure_go -run 'TestPrePushFSCK|TestFsckCouldNotOpen' ./internal/storage/dolt/`. This preserves the real integrity-failure abort path while avoiding a misleading corruption error when fsck cannot open the database at all.
Problem
`prePushFSCK` (added in #3447) wraps any `dolt fsck` error as `ErrDanglingReference` with the message `aborting push to prevent propagating corrupt chunks`. That phrasing implies the local database is corrupt, but fsck can fail for environmental reasons that have nothing to do with integrity, and wrapping those as corruption misleads users.
Root Cause
`prePushFSCK` at `internal/storage/dolt/store.go:1907-1925` treats all non-zero fsck exits identically. It does not distinguish a check that could not run from a check that ran and found integrity problems.
Concrete Trigger (resolved upstream, still worth hardening)
dolthub/dolt#10915 (fix shipped in dolt v1.86.4, 2026-04-22): `fsck.go` used `url.Parse` to parse the database file URL; on Windows this placed the drive letter into the URL Host field rather than Path, causing `dbfactory/file.go` to construct an invalid `\C:\...` path. Every Windows bd user on dolt <1.86.4 running post-#3447 bd hit this: `bd dolt push` returned a "dangling chunk reference" error on perfectly healthy databases.

The dolt bug is fixed. This PR hardens beads against this class of failure so future dolt/bd version mismatches don't produce misleading corruption warnings.
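To illustrate the Host-versus-Path behaviour described above, here is a minimal Go sketch using the standard library's `url.Parse`. The URL string and the `driveInHost` helper are hypothetical; the exact URL dolt constructed is not shown in this PR, so treat this as a demonstration of the parsing pitfall rather than a reproduction of dolt's code.

```go
package main

import (
	"fmt"
	"net/url"
)

// isLetter reports whether b is an ASCII letter.
func isLetter(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
}

// driveInHost is a hypothetical helper: it reports whether parsing rawURL
// leaves something shaped like a Windows drive letter ("C:") in the Host
// field instead of in the Path.
func driveInHost(rawURL string) (bool, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false, err
	}
	h := u.Host
	return len(h) == 2 && isLetter(h[0]) && h[1] == ':', nil
}

func main() {
	// Assumed example URLs, not taken from dolt.
	for _, raw := range []string{
		"file://C:/Users/me/project/.dolt",  // two slashes: "C:" is read as the authority
		"file:///C:/Users/me/project/.dolt", // three slashes: empty authority
	} {
		u, err := url.Parse(raw)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		inHost, _ := driveInHost(raw)
		fmt.Printf("%s -> host=%q path=%q driveInHost=%v\n", raw, u.Host, u.Path, inHost)
	}
}
```

Code that rebuilds a filesystem path from `Path` alone would lose the drive letter in the two-slash form, which is consistent with the malformed path described in this section.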
Fix
Distinguish "fsck couldn't run" from "fsck found problems" by matching the two known dolt phrasings that mean the check never executed:
- `"Could not open dolt database"`: the url.Parse bug symptom (and any other open failure)
- `"repository state is invalid"`: uninitialized or partial `.dolt` directory

For those cases, log a warning and proceed. For any other fsck failure, abort as before. Real dangling-reference errors still block propagation; only misleading wrappings of environmental failures are changed.
Test Plan
Updated tests:
- `TestPrePushFSCK_UnopenableDB` (renamed from `TestPrePushFSCK_CorruptNoms`): simulates the unopenable state (`.dolt/noms` present, no `dolt init`) and verifies `prePushFSCK` returns nil and logs a warning rather than wrapping as `ErrDanglingReference`.
- `TestFsckCouldNotOpen` (new): table test covering both known "couldn't open" phrasings, a real dangling-reference string (must not match), and an empty string.

All previously-passing fsck tests continue to pass.
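The four cases listed for the table test could look something like the sketch below. It is written as a standalone program rather than a `_test.go` file so it runs on its own; the `fsckCouldNotOpen` signature and the dangling-reference sample string are assumptions, not copied from the repository or from dolt output.

```go
package main

import (
	"fmt"
	"strings"
)

// fsckCouldNotOpen is an assumed implementation: a substring match on the
// two phrasings quoted in this PR.
func fsckCouldNotOpen(output string) bool {
	return strings.Contains(output, "Could not open dolt database") ||
		strings.Contains(output, "repository state is invalid")
}

// fsckCase is one row of the table.
type fsckCase struct {
	name   string
	output string
	want   bool
}

// fsckCases returns the four cases named in the test plan.
func fsckCases() []fsckCase {
	return []fsckCase{
		{"open failure", "Could not open dolt database", true},
		{"invalid repository state", "repository state is invalid", true},
		// Made-up sample text standing in for a real integrity failure.
		{"dangling reference", "dangling reference: chunk not found", false},
		{"empty output", "", false},
	}
}

// failingCases runs the table and returns the names of mismatching rows.
func failingCases() []string {
	var failed []string
	for _, c := range fsckCases() {
		if got := fsckCouldNotOpen(c.output); got != c.want {
			failed = append(failed, c.name)
		}
	}
	return failed
}

func main() {
	if failed := failingCases(); len(failed) > 0 {
		fmt.Println("FAIL:", failed)
		return
	}
	fmt.Println("ok")
}
```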
Context
Fixes #3464.
Root cause in dolt: dolthub/dolt#10915 (fixed in v1.86.4).
Full discovery thread: `#beads` channel on Dolt Discord, 2026-04-24. @macneale and @elianddb identified the dolt fix within minutes of the symptom being surfaced.
Scope Guards
`prePushFSCK`'s core intent or #3447's safety-check goal
`prePushFSCK`