feat(config, model, scanner): add scan progress file limit #10221
Conversation
syncthing#9927) (syncthing#10179) I'll let Audrius' words from the ticket explain this :) > I'm a bit lost, time.Duration is an int64, yet the watcher delay is a float; anything sub-1s gets rounded down to 0, so you just end up going into an infinite loop. syncthing#9927 (comment)
…ixes syncthing#9879) (syncthing#10176) Only require either matching UID & GID OR matching names. If the two devices have a different Name => UID mapping, they can never be totally equal. Therefore, when syncing we try matching the Name and fall back to the UID. However, when scanning for changes we currently require both the Name & UID to match. This leads to files being forever out of sync back and forth, or to local additions on receive-only folders. This patch does not change the sending behavior; it only changes what we consider equal for existing files with a mismapped Name => UID. The added test cases show the change: Tests 1, 5 and 6 are the same as the current behavior. Tests 2 and 3 are what change with this patch (from false to true). Test 4 is a subset of test 2 that is currently special-cased as true, which does not change. Co-authored-by: Jakob Borg <jakob@kastelo.net>
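The relaxed equality rule described in this commit message could be sketched roughly like this; the type and function names below are illustrative assumptions, not Syncthing's actual code:

```go
package main

import "fmt"

// Hypothetical sketch of the relaxed ownership comparison described in
// the commit above: two files are considered equal if EITHER the
// owner/group names match OR the numeric UID/GID pair matches, rather
// than requiring both. Field names and layout are assumptions.
type ownership struct {
	UID, GID             int
	OwnerName, GroupName string
}

func ownersEqual(a, b ownership) bool {
	namesMatch := a.OwnerName != "" && a.OwnerName == b.OwnerName &&
		a.GroupName != "" && a.GroupName == b.GroupName
	idsMatch := a.UID == b.UID && a.GID == b.GID
	return namesMatch || idsMatch
}

func main() {
	// Same names, different numeric IDs (a mismapped Name => UID):
	a := ownership{UID: 1000, GID: 1000, OwnerName: "alice", GroupName: "staff"}
	b := ownership{UID: 1001, GID: 1001, OwnerName: "alice", GroupName: "staff"}
	fmt.Println(ownersEqual(a, b)) // true: names match even though IDs differ

	// Different owner name and different IDs:
	c := ownership{UID: 1002, GID: 1002, OwnerName: "bob", GroupName: "staff"}
	fmt.Println(ownersEqual(a, c)) // false: neither rule matches
}
```

With the previous behavior (names AND IDs required to match), the first pair would compare unequal on every scan, producing the perpetual out-of-sync churn the commit describes.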
The name is super long, how about something like… Also, just a comment, but if a large number of files is always a problem here, maybe enable this by default, e.g. for 1,000,000 or more files? Are you able to say from which number of files this actually starts to become an issue?
https://forum.syncthing.net/t/main-is-now-v2/24456?u=marbens:
We already have a knob to disable this, for approximately this reason. Would it make sense to just use that instead?
That's true, but it may not be that obvious. The docs say it can be set to…
All fixed up, including in the docs. This setting is just a way to automatically provide sane defaults and avoid manual tweaking of the configuration on larger setups. |
Another option could be to spill the queue to the database when it exceeds a trivial size, and remove all the knobs.
Co-authored-by: tomasz1986 <twilczynski@mailbox.org>
Actually, now that I think of it, offloading to the DB might be a better idea, because we have started encountering another performance issue in our 1.x HDD-based setup with scanProgressIntervalS=-1 and hasher=1, caused by concurrent filesystem access (the hasher hashes a file while the walker keeps scanning the directory entries). I've considered the following options:
However, offloading to the database could solve this issue in a simpler way, without increasing memory usage (it might introduce some concurrent disk usage if the database is stored on the same HDD, but in our setup it doesn't have to be). Though I'm not too sure how to approach DB offloading; any pointers? :)
Purpose
When syncing folders with a huge number of files (millions) and the `scanProgressIntervalS` setting is >= 0, the entire file list will be buffered in RAM. This slows Syncthing to a crawl in our setup due to GC thrashing, caused by the excessive memory pressure from the continuous reallocation of the `filesToHash` array while gathering the file list.

This PR adds a new setting to disable scan progress events if more than `scanProgressEnableIfFilesLessThan` files are encountered.

Testing
Added a new test to cover the config.
Documentation
Will submit a PR to the docs repo with the same changes contained here once approved.
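As a rough illustration of the behaviour described in the Purpose section, a file-count threshold gating progress tracking might look like this; the option name mirrors the PR, but everything else below is an assumption, not the actual implementation:

```go
package main

import "fmt"

// collectForProgress sketches the idea of the new setting: stop
// accumulating the scan-progress file list once a configured
// file-count threshold is reached, so the walker no longer buffers
// every path in RAM. A threshold of 0 is taken to mean "disabled".
func collectForProgress(paths []string, enableIfLessThan int) []string {
	var filesToHash []string
	for i, p := range paths {
		if enableIfLessThan > 0 && i+1 >= enableIfLessThan {
			// Threshold reached: give up on progress tracking
			// and release what was buffered so far.
			return nil
		}
		filesToHash = append(filesToHash, p)
	}
	return filesToHash
}

func main() {
	paths := []string{"a", "b", "c", "d", "e"}
	fmt.Println(len(collectForProgress(paths, 10))) // under the limit: all 5 buffered
	fmt.Println(len(collectForProgress(paths, 3)))  // limit hit: tracking disabled, 0 buffered
}
```

The point of the cutoff is that memory cost stays bounded on huge folders while small folders keep full scan-progress reporting without any manual tuning.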