Skip to content

Fix concurrent SetSecret calls silently clobbering each other#4345

Merged
JAORMX merged 2 commits intomainfrom
fix-encrypted-secrets-toctou
Mar 25, 2026
Merged

Fix concurrent SetSecret calls silently clobbering each other#4345
JAORMX merged 2 commits intomainfrom
fix-encrypted-secrets-toctou

Conversation

@JAORMX
Copy link
Copy Markdown
Collaborator

@JAORMX JAORMX commented Mar 24, 2026

Summary

  • EncryptedManager had a TOCTOU race: it loaded the secrets file at construction time and wrote that stale snapshot back under the file lock, silently overwriting changes from other processes. In practice, OAuth token refreshes from long-running proxy processes would clobber secrets set by concurrent thv secret set CLI invocations (and vice versa), causing ~8% secret loss under contention.
  • Mutations (SetSecret, DeleteSecret, Cleanup) now re-read the file from disk inside the critical section before applying changes, eliminating the stale-snapshot problem.
  • Added a per-path in-process sync.Mutex to WithFileLock because flock(2) does not provide mutual exclusion between different file descriptors within the same process.

Fixes #4339

Type of change

  • Bug fix

Test plan

  • Unit tests (task test) — existing TestEncryptedManager_Concurrency now passes reliably (previously flaky); ran 50 iterations with zero failures
  • Manual testing — verified go vet passes on both changed packages

Changes

File Change
pkg/secrets/encrypted.go SetSecret/DeleteSecret/Cleanup now read-modify-write inside the lock instead of using a stale in-memory snapshot; extracted readFileSecrets/writeFileSecrets helpers; simplified NewEncryptedManager to reuse readFileSecrets
pkg/fileutils/lock.go Added per-path sync.Mutex registry (processLocks) so WithFileLock serializes both across processes (flock) and within the same process (mutex)

Does this introduce a user-facing change?

Yes — thv secret set will no longer silently lose secrets when other processes (e.g. OAuth token refreshes) write to the secrets file concurrently. Users who previously needed retry-based workarounds (#4339) should no longer experience secret loss.

Special notes for reviewers

  • The in-memory syncmap.Map cache is now updated with targeted Store/Delete calls (not a full cache replacement) to avoid a window where concurrent GetSecret reads see an empty cache. This means the cache may not reflect keys added by other processes, but that's acceptable: CLI one-shots create a fresh manager per invocation, and long-running proxies only need their own tokens.
  • readFileSecrets handles empty files and missing files gracefully, returning an empty map rather than erroring.
  • The DeleteSecret existence check moved inside the lock and now checks the on-disk state, not the potentially-stale in-memory cache.

Generated with Claude Code

@github-actions github-actions bot added the size/S Small PR: 100-299 lines changed label Mar 24, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 24, 2026

Codecov Report

❌ Patch coverage is 46.96970% with 35 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.38%. Comparing base (074326e) to head (beb1198).
⚠️ Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
pkg/secrets/encrypted.go 42.10% 11 Missing and 22 partials ⚠️
pkg/fileutils/lock.go 77.77% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4345      +/-   ##
==========================================
- Coverage   68.45%   68.38%   -0.08%     
==========================================
  Files         479      479              
  Lines       48642    48669      +27     
==========================================
- Hits        33300    33281      -19     
- Misses      12373    12390      +17     
- Partials     2969     2998      +29     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JAORMX JAORMX requested a review from amirejaz March 24, 2026 14:48
Copy link
Copy Markdown
Contributor

@amirejaz amirejaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid fix — the read-modify-write pattern inside the lock and the processLocks in-process mutex address the root cause cleanly. The Stat() removal is correct: readFileSecrets replaces stat.Size() > 0 with len(data) == 0 after os.ReadFile, and also handles the non-existent file case. A few gaps below.

JAORMX and others added 2 commits March 25, 2026 05:49
EncryptedManager contained a TOCTOU race: it loaded the secrets file
into an in-memory map at construction time and later wrote the stale
map back to disk under the file lock, silently overwriting changes made
by other processes (e.g. OAuth token refreshes running in background
proxy processes).

The fix has two parts:

1. Read-modify-write inside the lock (pkg/secrets/encrypted.go):
   SetSecret, DeleteSecret, and Cleanup now re-read and decrypt the
   file from disk inside the critical section before applying their
   mutation and writing back. This eliminates the stale-snapshot
   problem across both processes and goroutines.

2. In-process mutex in WithFileLock (pkg/fileutils/lock.go):
   flock(2) does not provide mutual exclusion between different file
   descriptors within the same process. Added a per-path sync.Mutex
   that serializes goroutines before acquiring the flock, so the
   cross-process lock remains effective for its intended purpose.

Fixes #4339

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Evict stale cache entry in DeleteSecret when key is gone from disk
- Add design comments on cache-lag windows in SetSecret/DeleteSecret
- Document GetSecret/ListSecrets intentional stale-cache behavior
- Document processLocks unbounded growth trade-off
- Use Load-first pattern in getProcessLock to avoid steady-state allocs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@JAORMX JAORMX force-pushed the fix-encrypted-secrets-toctou branch from a3c8eb0 to beb1198 Compare March 25, 2026 05:50
@github-actions github-actions bot added size/S Small PR: 100-299 lines changed and removed size/S Small PR: 100-299 lines changed labels Mar 25, 2026
@JAORMX JAORMX merged commit 2a67719 into main Mar 25, 2026
52 checks passed
@JAORMX JAORMX deleted the fix-encrypted-secrets-toctou branch March 25, 2026 11:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/S Small PR: 100-299 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Concurrent SetSecret calls silently clobber each other (TOCTOU)

2 participants