-
Notifications
You must be signed in to change notification settings - Fork 594
feat(general): reduce number of open file descriptors for indexes on Unix #4866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
trying to solve my own issue #4831 . works ok on my machine locally so I figured I'd push it up if it would help anyone else. |
|
please rebase and fix merge conflict, we should test this on Mac and Linux and then we can merge |
8a2f7bd to
f682b7d
Compare
julio-lopez
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for proposing the changes, they are pointing in the right direction.
Please take a look at the inline comments.
Thanks again,
| // Do not run in parallel to avoid fd count noise from other tests. | ||
| ctx := testlogging.Context(t) | ||
| ft := faketime.NewClockTimeWithOffset(0) | ||
| cache := &diskCommittedContentIndexCache{testutil.TempDirectory(t), ft.NowFunc(), func() int { return 3 }, testlogging.Printf(t.Logf, ""), DefaultIndexCacheSweepAge} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samm81 This failed to compile in a previous CI job and it is going to fail again due to a type mismatch for the "log" type.
See https://github.com/kopia/kopia/actions/runs/18611420883/job/53293937511?pr=4866
| if delta := after - before; delta > maxDelta { | ||
| t.Fatalf("fd count grew too much after opening %d indexes: before=%d after=%d delta=%d (max allowed %d)", N, before, after, delta, maxDelta) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be simplified as:
| if delta := after - before; delta > maxDelta { | |
| t.Fatalf("fd count grew too much after opening %d indexes: before=%d after=%d delta=%d (max allowed %d)", N, before, after, delta, maxDelta) | |
| } | |
| require.LessOrEqualf(t, after - before, maxDelta "fd count grew too much after opening %d indexes:", N) |
The delta (that is after - before) and maxDelta will be already included in the message when the assertion fails.
| ft := faketime.NewClockTimeWithOffset(0) | ||
| cache := &diskCommittedContentIndexCache{testutil.TempDirectory(t), ft.NowFunc(), func() int { return 3 }, testlogging.Printf(t.Logf, ""), DefaultIndexCacheSweepAge} | ||
|
|
||
| const N = 200 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| const N = 200 | |
| const indexCount = 200 |
On Unix-like platforms, close the file descriptor immediately after a successful
mmap.Map of .sndx index cache files. This keeps the mapping valid until Unmap
(per POSIX semantics) and significantly reduces steady-state FD usage when many
indexes are open, helping avoid EMFILE ("too many open files").
- Split mmapOpenWithRetry into platform-specific implementations:
- committed_content_index_disk_cache_unix.go (!windows):
- Map RDONLY, close FD immediately.
- Return closer that only unmaps.
- committed_content_index_disk_cache_windows.go (windows):
- Keep FD open until Unmap.
- Return closer that unmaps and closes FD.
- Remove old mmapOpenWithRetry and mmap import from
repo/content/committed_content_index_disk_cache.go.
- Add Linux-only unit test verifying FD count does not grow proportionally:
repo/content/committed_content_index_fd_linux_test.go
- Creates N small indexes, opens them all, checks /proc/self/fd delta stays low.
Notes:
- Behavior unchanged on Windows due to OS semantics.
- Mapping failures close the FD to avoid leaks.
- Unlink semantics remain correct; mappings stay valid until Unmap.
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
4999af2 to
6d5ae8b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR optimizes file descriptor usage on Unix platforms by closing FDs immediately after memory-mapping index files, while preserving the existing behavior on Windows due to OS-specific requirements.
Key changes:
- Splits
mmapOpenWithRetryinto platform-specific implementations with Unix closing FDs post-mmap and Windows retaining them - Removes the original cross-platform implementation and mmap import from the base file
- Adds a Linux-specific test verifying FD count remains low when many indexes are opened
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| repo/content/committed_content_index_disk_cache.go | Removes original mmapOpenWithRetry implementation and mmap-go import |
| repo/content/committed_content_index_disk_cache_unix.go | New Unix implementation that closes FD immediately after successful mmap |
| repo/content/committed_content_index_disk_cache_windows.go | New Windows implementation preserving retry logic and keeping FD open |
| repo/content/committed_content_index_fd_linux_test.go | Adds test verifying FD count doesn't grow proportionally when opening many indexes |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| func (c *diskCommittedContentIndexCache) mmapOpenWithRetry(_ context.Context, path string) (mmap.MMap, func() error, error) { | ||
| f, err := os.Open(path) //nolint:gosec | ||
| if err != nil { | ||
| return nil, nil, errors.Wrap(err, "unable to open file despite retries") |
Copilot
AI
Oct 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message 'unable to open file despite retries' is misleading on Unix since no retry logic exists in this implementation. Consider changing to a simpler message like 'unable to open file' or 'error opening file for mmap'.
| return nil, nil, errors.Wrap(err, "unable to open file despite retries") | |
| return nil, nil, errors.Wrap(err, "unable to open file") |
| retryCount := 0 | ||
| for err != nil && retryCount < maxRetries { | ||
| retryCount++ | ||
| contentlog.Log2(ctx, c.log, "retry unable to mmap.Open()", |
Copilot
AI
Oct 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Corrected spacing in log message from 'retry unable to mmap.Open()' to 'retry: unable to mmap.Open()' for better readability.
| contentlog.Log2(ctx, c.log, "retry unable to mmap.Open()", | |
| contentlog.Log2(ctx, c.log, "retry: unable to mmap.Open()", |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #4866 +/- ##
==========================================
+ Coverage 75.86% 76.71% +0.84%
==========================================
Files 470 537 +67
Lines 37301 40968 +3667
==========================================
+ Hits 28299 31427 +3128
- Misses 7071 7494 +423
- Partials 1931 2047 +116 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@samm81 Thanks for your digging into this issue and for the corresponding improvement. |
Close file descriptors after
mmapon Unix to reduce the total number of open file descriptors.On Unix-like platforms, close the file descriptor immediately after a successful
mmap.Mapof .sndx index cache files. This keeps the mapping valid until Unmap (per POSIX semantics) and significantly reduces steady-state FD usage when many indexes are open, helping avoidEMFILE("too many open files").mmapOpenWithRetryinto platform-specific implementations:committed_content_index_disk_cache_unix.go(!windows):RDONLY, close FD immediately.committed_content_index_disk_cache_windows.go(windows):Unmap.mmapOpenWithRetryandmmapimport fromrepo/content/committed_content_index_disk_cache.gorepo/content/committed_content_index_fd_linux_test.go/proc/self/fddelta stays low.Notes: