
Conversation

@danog (Contributor) commented Jul 29, 2025

Purpose

When syncing folders with a huge number of files (millions) and the scanProgressIntervalS setting is >= 0, the entire file list is buffered in RAM. In our setup this slows Syncthing to a crawl due to GC thrashing: the continuous reallocation of the filesToHash slice while gathering the file list creates excessive memory pressure.

This PR adds a new setting to disable scan progress events if more than scanProgressEnableIfFilesLessThan files are encountered.
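
For illustration, here is a minimal, self-contained sketch of the idea (the names and structure are illustrative, not Syncthing's actual scanner internals): once the number of scanned files reaches the configured limit, the per-file progress buffer is dropped and no longer grown.

```go
package main

import "fmt"

type scanQueue struct {
	limit    int      // analogous to the proposed scanProgressEnableIfFilesLessThan
	files    []string // buffered only while progress reporting is still enabled
	disabled bool
	total    int
}

func (q *scanQueue) add(name string) {
	q.total++
	if q.disabled {
		return
	}
	if q.limit > 0 && q.total >= q.limit {
		// Too many files: drop the buffer and stop tracking progress,
		// so memory usage no longer grows with the file count.
		q.files = nil
		q.disabled = true
		return
	}
	q.files = append(q.files, name)
}

func main() {
	q := &scanQueue{limit: 5}
	for i := 0; i < 10; i++ {
		q.add(fmt.Sprintf("file-%d", i))
	}
	fmt.Println("buffered:", len(q.files), "progress disabled:", q.disabled, "total files seen:", q.total)
}
```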

Testing

Added a new test to cover the config.

Documentation

I will submit a PR to the docs repo with the same changes once this one is approved.

imsodin and others added 8 commits June 15, 2025 10:29
syncthing#9927) (syncthing#10179)

I'll let Audrius' words from the ticket explain this :)

> I'm a bit lost, time.Duration is an int64, yet watcher delay is float,
> anything sub 1s gets rounded down to 0, so you just end up going into an
> infinite loop.


syncthing#9927 (comment)
…ixes syncthing#9879) (syncthing#10176)

Only require either matching UID & GID OR matching Names.

If the two devices have a different Name => UID mapping, they can never
be totally equal. Therefore, when syncing we try matching the Name and
fall back to the UID. However, when scanning for changes we currently
require both the Name & UID to match. This leads to files forever going
out of sync back and forth, or to local additions when the folder is
receive only.

This patch does not change the sending behavior. It only changes what
we decide is equal for existing files with a mismapped Name => UID.

The added test cases show the change: tests 1, 5 and 6 are the same as
before. Tests 2 and 3 are what changes with this patch (from false to
true). Test 4 is a subset of test 2 that is currently special-cased as
true, which does not change.

Co-authored-by: Jakob Borg <jakob@kastelo.net>
@danog danog marked this pull request as ready for review July 29, 2025 13:26
@tomasz1986 (Member) commented:

> scanProgressEnableIfFilesLessThan

The name is super long; how about something like scanProgressFileLimit?

Also, just a comment, but if a large number of files is always a problem here, maybe enable this by default, e.g. for 1,000,000 or more files? Are you able to say from which number of files this actually starts to become an issue?

@tomasz1986 tomasz1986 changed the title from "Scan progress file limit" to "feat(config, model, scanner): add scan progress file limit" Jul 29, 2025
@github-actions github-actions bot added the enhancement label (New features or improvements of some kind, as opposed to a problem (bug)) Jul 29, 2025
@marbens-arch marbens-arch changed the base branch from v1 to main July 29, 2025 14:14
@marbens-arch (Member) commented:

https://forum.syncthing.net/t/main-is-now-v2/24456?u=marbens:

> Do not target any PRs at v1.

@calmh (Member) commented Jul 29, 2025

We already have a knob to disable this, for approximately this reason. Would it make sense to just use that instead?

@rasa (Member) commented Jul 29, 2025

> We already have a knob to disable this, for approximately this reason. Would it make sense to just use that instead?

That's true, but it may not be that obvious. The docs say scanProgressIntervalS can be set to -1 to disable it, but that fact is only mentioned in the "Performance tuning" section, not in the section documenting the setting.

@danog (Contributor, Author) commented Jul 30, 2025

All fixed up, including in the docs.

This setting is just a way to automatically provide sane defaults and avoid manual tweaking of the configuration on larger setups.

@calmh (Member) commented Jul 30, 2025

Another option could be to spill the queue to the database when it exceeds a trivial size, and remove all the knobs.
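
As a rough illustration of that suggestion (not Syncthing's actual database layer; all names here are hypothetical, and a temporary file stands in for whatever on-disk store would actually be used), the queue could keep a bounded in-memory buffer and append everything beyond it to disk:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

type spillQueue struct {
	memLimit int       // how many entries to keep in memory
	mem      []string  // small in-memory buffer
	overflow *os.File  // on-disk spill target (stand-in for the database)
	w        *bufio.Writer
	spilled  int
}

func (q *spillQueue) push(name string) error {
	if len(q.mem) < q.memLimit {
		q.mem = append(q.mem, name)
		return nil
	}
	if q.overflow == nil {
		f, err := os.CreateTemp("", "scan-queue-*")
		if err != nil {
			return err
		}
		q.overflow = f
		q.w = bufio.NewWriter(f)
	}
	q.spilled++
	_, err := fmt.Fprintln(q.w, name)
	return err
}

func (q *spillQueue) close() {
	if q.overflow != nil {
		q.w.Flush()
		q.overflow.Close()
		os.Remove(q.overflow.Name())
	}
}

func main() {
	q := &spillQueue{memLimit: 3}
	defer q.close()
	for i := 0; i < 10; i++ {
		if err := q.push(fmt.Sprintf("file-%d", i)); err != nil {
			panic(err)
		}
	}
	fmt.Println("in memory:", len(q.mem), "spilled to disk:", q.spilled)
}
```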

Co-authored-by: tomasz1986 <twilczynski@mailbox.org>
@danog (Contributor, Author) commented Aug 1, 2025

Actually, now that I think of it, offloading to the db might be a better idea, because we started encountering another performance issue in our 1.x HDD-based setup with scanProgressIntervalS=-1 and hasher=1, caused by concurrent filesystem access (the hasher hashes a file while the walker keeps scanning the directory entries). I've considered the following options:

  • Backpressure, optionally pausing scanning on HDD setups (behind a config option) until the hasher is done working on the current file (a rough sketch of this follows at the end of this comment)
  • A global filesystem lock (again, optional with a config) to ensure the filesystem is never accessed concurrently on HDD setups

However, offloading to the database could solve this issue in a simpler way, without increasing memory usage (it might introduce some concurrent disk access if the database is stored on the same HDD, but in our setup it doesn't have to be). That said, I'm not too sure how to approach db offloading, any pointers? :)
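
For the backpressure option mentioned in the list above, a minimal sketch (hypothetical walker/hasher split, not Syncthing's actual scanner): a bounded channel between the directory walker and the hasher makes the walker block whenever the hasher is still busy, so the two never hit the disk far apart.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Capacity 1: at most one file waits in the queue while the hasher works.
	toHash := make(chan string, 1)

	go func() { // walker: enumerates directory entries
		for i := 0; i < 5; i++ {
			name := fmt.Sprintf("file-%d", i)
			toHash <- name // blocks while the hasher is behind
			fmt.Println("walker queued", name)
		}
		close(toHash)
	}()

	for name := range toHash { // hasher: consumes the queue
		time.Sleep(10 * time.Millisecond) // stand-in for hashing I/O
		fmt.Println("hasher done with", name)
	}
}
```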

