
Conversation

@danog (Contributor) commented Jul 29, 2025

Purpose

When syncing folders with a huge number of files (millions) and the scanProgressIntervalS setting is >= 0, the entire file list is buffered in RAM. In our setup this slows Syncthing to a crawl due to GC thrashing: the continuous reallocation of the filesToHash slice while gathering the file list creates excessive memory pressure.

This PR adds a new setting to disable scan progress events if more than scanProgressEnableIfFilesLessThan files are encountered.
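
For illustration, here is a minimal, self-contained sketch of the idea (the names and structure are illustrative, not Syncthing's actual scanner internals): once the number of scanned files reaches the configured limit, the per-file progress buffer is dropped and no longer grown.

```go
package main

import "fmt"

type scanQueue struct {
	limit    int      // analogous to the proposed scanProgressEnableIfFilesLessThan
	files    []string // buffered only while progress reporting is still enabled
	disabled bool
	total    int
}

func (q *scanQueue) add(name string) {
	q.total++
	if q.disabled {
		return
	}
	if q.limit > 0 && q.total >= q.limit {
		// Too many files: drop the buffer and stop tracking progress,
		// so memory usage no longer grows with the file count.
		q.files = nil
		q.disabled = true
		return
	}
	q.files = append(q.files, name)
}

func main() {
	q := &scanQueue{limit: 5}
	for i := 0; i < 10; i++ {
		q.add(fmt.Sprintf("file-%d", i))
	}
	fmt.Println("buffered:", len(q.files), "progress disabled:", q.disabled, "total files seen:", q.total)
}
```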

Testing

Added a new test to cover the config.

Documentation

I will submit a PR to the docs repo with the same changes once this one is approved.

imsodin and others added 8 commits June 15, 2025 10:29
syncthing#9927) (syncthing#10179)

I'll let Audrius' words from the ticket explain this :)

> I'm a bit lost, time.Duration is an int64, yet watcher delay is float,
> anything sub 1s gets rounded down to 0, so you just end up going into an
> infinite loop.


syncthing#9927 (comment)
…ixes syncthing#9879) (syncthing#10176)

Only require either matching UID & GID OR matching Names.

If the two devices have a different Name => UID mapping, they can never
be totally equal. Therefore, when syncing we try matching the Name and
fall back to the UID. However, when scanning for changes we currently
require both the Name & UID to match. This leads to files forever going
out of sync back and forth, or to local additions when the folder is
receive only.

This patch does not change the sending behavior. It only changes what
we decide is equal for existing files with a mismapped Name => UID.

The added test cases show the change: tests 1, 5 and 6 are the same as
before. Tests 2 and 3 are what changes with this patch (from false to
true). Test 4 is a subset of test 2 that is currently special-cased as
true, which does not change.

Co-authored-by: Jakob Borg <jakob@kastelo.net>
@danog danog marked this pull request as ready for review July 29, 2025 13:26
@tomasz1986 (Member) commented:

> scanProgressEnableIfFilesLessThan

The name is super long; how about something like scanProgressFileLimit?

Also, just a comment, but if a large number of files is always a problem here, maybe enable this by default, e.g. for 1,000,000 or more files? Are you able to say from which number of files this actually starts to become an issue?

@tomasz1986 tomasz1986 changed the title from "Scan progress file limit" to "feat(config, model, scanner): add scan progress file limit" Jul 29, 2025
@github-actions github-actions bot added the enhancement label (New features or improvements of some kind, as opposed to a problem (bug)) Jul 29, 2025
@marbens-arch marbens-arch changed the base branch from v1 to main July 29, 2025 14:14
@marbens-arch (Member) commented:

https://forum.syncthing.net/t/main-is-now-v2/24456?u=marbens:

> Do not target any PRs at v1.

@calmh (Member) commented Jul 29, 2025

We already have a knob to disable this, for approximately this reason. Would it make sense to just use that instead?

@rasa (Member) commented Jul 29, 2025

> We already have a knob to disable this, for approximately this reason. Would it make sense to just use that instead?

That's true, but it may not be that obvious. The docs say scanProgressIntervalS can be set to -1 to disable it, but that fact is only mentioned in the "Performance tuning" section, not in the section documenting the setting.

@danog (Contributor, Author) commented Jul 30, 2025

All fixed up, including in the docs.

This setting is just a way to automatically provide sane defaults and avoid manual tweaking of the configuration on larger setups.

@calmh (Member) commented Jul 30, 2025

Another option could be to spill the queue to the database when it exceeds a trivial size, and remove all the knobs.
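
As a rough illustration of that suggestion (not Syncthing's actual database layer; all names here are hypothetical, and a temporary file stands in for whatever on-disk store would actually be used), the queue could keep a bounded in-memory buffer and append everything beyond it to disk:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

type spillQueue struct {
	memLimit int       // how many entries to keep in memory
	mem      []string  // small in-memory buffer
	overflow *os.File  // on-disk spill target (stand-in for the database)
	w        *bufio.Writer
	spilled  int
}

func (q *spillQueue) push(name string) error {
	if len(q.mem) < q.memLimit {
		q.mem = append(q.mem, name)
		return nil
	}
	if q.overflow == nil {
		f, err := os.CreateTemp("", "scan-queue-*")
		if err != nil {
			return err
		}
		q.overflow = f
		q.w = bufio.NewWriter(f)
	}
	q.spilled++
	_, err := fmt.Fprintln(q.w, name)
	return err
}

func (q *spillQueue) close() {
	if q.overflow != nil {
		q.w.Flush()
		q.overflow.Close()
		os.Remove(q.overflow.Name())
	}
}

func main() {
	q := &spillQueue{memLimit: 3}
	defer q.close()
	for i := 0; i < 10; i++ {
		if err := q.push(fmt.Sprintf("file-%d", i)); err != nil {
			panic(err)
		}
	}
	fmt.Println("in memory:", len(q.mem), "spilled to disk:", q.spilled)
}
```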

Co-authored-by: tomasz1986 <twilczynski@mailbox.org>
@danog (Contributor, Author) commented Aug 1, 2025

Actually, now that I think of it, offloading to the db might be a better idea, because we started encountering another performance issue in our 1.x HDD-based setup with scanProgressIntervalS=-1 and hasher=1, caused by concurrent filesystem access (the hasher hashes a file while the walker keeps scanning the directory entries). I've considered the following options:

  • Backpressure, optionally pausing scanning on HDD setups (behind a config option) until the hasher is done working on the current file (a rough sketch of this follows at the end of this comment)
  • A global filesystem lock (again, optional with a config) to ensure the filesystem is never accessed concurrently on HDD setups

However, offloading to the database could solve this issue in a simpler way, without increasing memory usage (it might introduce some concurrent disk access if the database is stored on the same HDD, but in our setup it doesn't have to be). That said, I'm not too sure how to approach db offloading, any pointers? :)
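
For the backpressure option mentioned in the list above, a minimal sketch (hypothetical walker/hasher split, not Syncthing's actual scanner): a bounded channel between the directory walker and the hasher makes the walker block whenever the hasher is still busy, so the two never hit the disk far apart.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Capacity 1: at most one file waits in the queue while the hasher works.
	toHash := make(chan string, 1)

	go func() { // walker: enumerates directory entries
		for i := 0; i < 5; i++ {
			name := fmt.Sprintf("file-%d", i)
			toHash <- name // blocks while the hasher is behind
			fmt.Println("walker queued", name)
		}
		close(toHash)
	}()

	for name := range toHash { // hasher: consumes the queue
		time.Sleep(10 * time.Millisecond) // stand-in for hashing I/O
		fmt.Println("hasher done with", name)
	}
}
```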

