Fix concurrent SetSecret calls silently clobbering each other #4345
Merged
Codecov Report❌ Patch coverage is
Additional details and impacted files

Coverage Diff
Metric      main      #4345     +/-
Coverage    68.45%    68.38%    -0.08%
Files       479       479
Lines       48642     48669     +27
Hits        33300     33281     -19
Misses      12373     12390     +17
Partials    2969      2998      +29
amirejaz (Contributor) reviewed on Mar 24, 2026 and left a comment:
Solid fix — the read-modify-write pattern inside the lock and the processLocks in-process mutex address the root cause cleanly. The Stat() removal is correct: readFileSecrets replaces stat.Size() > 0 with len(data) == 0 after os.ReadFile, and also handles the non-existent file case. A few gaps below.
EncryptedManager contained a TOCTOU race: it loaded the secrets file into an in-memory map at construction time and later wrote the stale map back to disk under the file lock, silently overwriting changes made by other processes (e.g. OAuth token refreshes running in background proxy processes).

The fix has two parts:

1. Read-modify-write inside the lock (pkg/secrets/encrypted.go): SetSecret, DeleteSecret, and Cleanup now re-read and decrypt the file from disk inside the critical section before applying their mutation and writing back. This eliminates the stale-snapshot problem across both processes and goroutines.
2. In-process mutex in WithFileLock (pkg/fileutils/lock.go): flock(2) does not provide mutual exclusion between different file descriptors within the same process. Added a per-path sync.Mutex that serializes goroutines before acquiring the flock, so the cross-process lock remains effective for its intended purpose.

Fixes #4339

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Evict stale cache entry in DeleteSecret when key is gone from disk
- Add design comments on cache-lag windows in SetSecret/DeleteSecret
- Document GetSecret/ListSecrets intentional stale-cache behavior
- Document processLocks unbounded growth trade-off
- Use Load-first pattern in getProcessLock to avoid steady-state allocs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
a3c8eb0 to
beb1198
amirejaz approved these changes on Mar 25, 2026
Summary
- EncryptedManager had a TOCTOU race: it loaded the secrets file at construction time and wrote that stale snapshot back under the file lock, silently overwriting changes from other processes. In practice, OAuth token refreshes from long-running proxy processes would clobber secrets set by concurrent thv secret set CLI invocations (and vice versa), causing ~8% secret loss under contention.
- Mutating operations (SetSecret, DeleteSecret, Cleanup) now re-read the file from disk inside the critical section before applying changes, eliminating the stale-snapshot problem.
- Added a per-path sync.Mutex to WithFileLock because flock(2) does not provide mutual exclusion between different file descriptors within the same process.

Fixes #4339
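To make the stale-snapshot failure concrete, here is a small single-goroutine simulation. Everything in it is hypothetical (`fakeDisk`, `staleManager`, `freshSet`); it models the secrets file as an in-memory map and leaves out locking and encryption, so it only illustrates the ordering problem, not the real code.

```go
package main

import "fmt"

// clone copies a map so callers never share state by accident.
func clone(m map[string]string) map[string]string {
	out := make(map[string]string, len(m))
	for k, v := range m {
		out[k] = v
	}
	return out
}

// fakeDisk stands in for the secrets file. Assumption: no real I/O.
type fakeDisk struct{ secrets map[string]string }

func (d *fakeDisk) read() map[string]string   { return clone(d.secrets) }
func (d *fakeDisk) write(m map[string]string) { d.secrets = clone(m) }

// staleManager snapshots the file once, at construction, like the old behaviour.
type staleManager struct {
	disk     *fakeDisk
	snapshot map[string]string
}

func newStaleManager(d *fakeDisk) *staleManager {
	return &staleManager{disk: d, snapshot: d.read()}
}

// set mutates the snapshot and writes the whole snapshot back, discarding
// anything other writers added since construction.
func (m *staleManager) set(key, value string) {
	m.snapshot[key] = value
	m.disk.write(m.snapshot)
}

// freshSet re-reads the current state before mutating, as the fix does.
func freshSet(d *fakeDisk, key, value string) {
	current := d.read()
	current[key] = value
	d.write(current)
}

func main() {
	d := &fakeDisk{secrets: map[string]string{}}
	proxy := newStaleManager(d) // long-running process, snapshot taken early
	cli := newStaleManager(d)   // one-shot CLI invocation
	cli.set("api-key", "abc")
	proxy.set("oauth-token", "t1") // writes a snapshot that lacks api-key
	_, survived := d.secrets["api-key"]
	fmt.Println("stale snapshot, api-key survived:", survived)

	d2 := &fakeDisk{secrets: map[string]string{}}
	freshSet(d2, "api-key", "abc")
	freshSet(d2, "oauth-token", "t1")
	_, survived = d2.secrets["api-key"]
	fmt.Println("re-read before write, api-key survived:", survived)
}
```

Note that the stale write happens while each writer would legitimately hold the lock in turn; the lock alone does not help when the data being written was read before the lock was taken.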
Type of change
Test plan
- task test: existing TestEncryptedManager_Concurrency now passes reliably (previously flaky); ran 50 iterations with zero failures
- go vet passes on both changed packages

Changes
- pkg/secrets/encrypted.go: SetSecret/DeleteSecret/Cleanup now read-modify-write inside the lock instead of using a stale in-memory snapshot; extracted readFileSecrets/writeFileSecrets helpers; simplified NewEncryptedManager to reuse readFileSecrets
- pkg/fileutils/lock.go: added a per-path sync.Mutex registry (processLocks) so WithFileLock serializes both across processes (flock) and within the same process (mutex)

Does this introduce a user-facing change?
Yes: thv secret set will no longer silently lose secrets when other processes (e.g. OAuth token refreshes) write to the secrets file concurrently. Users who previously needed retry-based workarounds (#4339) should no longer experience secret loss.

Special notes for reviewers
- The syncmap.Map cache is now updated with targeted Store/Delete calls (not a full cache replacement) to avoid a window where concurrent GetSecret reads see an empty cache. This means the cache may not reflect keys added by other processes, but that's acceptable: CLI one-shots create a fresh manager per invocation, and long-running proxies only need their own tokens.
- readFileSecrets handles empty files and missing files gracefully, returning an empty map rather than erroring.
- The DeleteSecret existence check moved inside the lock and now checks the on-disk state, not the potentially stale in-memory cache.

Generated with Claude Code
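The targeted cache update described in the reviewer notes can be sketched as below. This assumes the standard library's sync.Map in place of syncmap.Map, and the type and method names (`cachedManager`, `applySet`, `applyDelete`, `get`) are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// cachedManager holds only the in-memory cache. Assumption: stdlib sync.Map
// is used here; the PR itself mentions syncmap.Map.
type cachedManager struct{ cache sync.Map }

// applySet mirrors a successful on-disk write with a single Store, so other
// cached keys stay visible to concurrent readers throughout.
func (m *cachedManager) applySet(key, value string) { m.cache.Store(key, value) }

// applyDelete evicts one key instead of clearing and rebuilding the cache.
func (m *cachedManager) applyDelete(key string) { m.cache.Delete(key) }

// get reads from the cache only; it may lag behind other processes' writes.
func (m *cachedManager) get(key string) (string, bool) {
	v, ok := m.cache.Load(key)
	if !ok {
		return "", false
	}
	return v.(string), true
}

func main() {
	m := &cachedManager{}
	m.applySet("oauth-token", "t1")
	m.applySet("api-key", "abc")
	m.applyDelete("api-key")

	token, ok := m.get("oauth-token")
	fmt.Println(token, ok)
	_, ok = m.get("api-key")
	fmt.Println(ok)
}
```

A clear-and-repopulate approach would briefly expose an empty cache to readers; per-key Store and Delete calls never remove entries that were not part of the mutation.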