Write cache improvements #3647

Merged
roman-khimov merged 10 commits into master from writecache-improvements
Oct 23, 2025
Conversation

@roman-khimov
Member

No description provided.

0.46.0 was released in May 2025; half a year is enough for this transition.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
Once an object gets into the cache, c.objCounters.Size() stays non-zero until
it leaves the cache, even if no new objects are coming in. This means we keep
looping here without any pause while there are objects in the cache, so many
cycles can be made even under a very small ops/s load (multiplied by the
number of shards). This loop also makes a copy of the object list on each
iteration, which leads to _a lot_ of allocations and associated GC pressure.

As a minimal fix, we can iterate over the list once per 100ms: 20 threads can
put enough pressure on real HDDs to keep them busy all this time, and we
won't be wasting cycles for nothing.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
1. Make it a separate active (timer!) entity.
2. Simplify in-transit checks by maintaining a single map of items, don't
   allocate additional maps.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
Signed-off-by: Roman Khimov <roman@nspcc.ru>
Signed-off-by: Roman Khimov <roman@nspcc.ru>
…3414

This allows multiple batches to be flushed at the same time, similar to what
is done for multiple objects.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
…Objs

Adding into flushObjs _after_ submitting the object to a worker thread is
inherently wrong: that thread can flush the object (and attempt to delete it
from flushObjs) before it's added into flushObjs.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
Doing it per-address is excessive: we will do it anyway for big objects
(since we're waiting on send anyway), and for small ones we can just manage
the address slice (quickly).

Signed-off-by: Roman Khimov <roman@nspcc.ru>
It's not needed.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
1. Sleep more, long enough for the new batch to appear on its own (wrt the
   time limit); a ticker works perfectly for this.
2. Once the time has passed, we always take the set of current objects and
   sort them by size.
3. This means we already have natural batches and we don't even need to
   allocate slices for them. Timer management becomes irrelevant as well.
4. Push batches into threads and be happy, just don't forget about big
   objects (a subslice of one) and flushing the last object from the set.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
@roman-khimov force-pushed the writecache-improvements branch from 10976b8 to 4e2fb40 on October 22, 2025 15:26
@codecov

codecov bot commented Oct 22, 2025

Codecov Report

❌ Patch coverage is 72.34043% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 26.91%. Comparing base (86161a2) to head (4e2fb40).
⚠️ Report is 11 commits behind head on master.

Files with missing lines Patch % Lines
pkg/local_object_storage/writecache/flush.go 75.00% 19 Missing and 3 partials ⚠️
internal/testutil/log.go 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3647      +/-   ##
==========================================
- Coverage   26.97%   26.91%   -0.06%     
==========================================
  Files         656      656              
  Lines       51273    51180      -93     
==========================================
- Hits        13831    13777      -54     
+ Misses      36380    36350      -30     
+ Partials     1062     1053       -9     

// addObject adds an object to the batch if the batch is active. It returns
// whether the object was added and whether the batch is complete (true if
// this batch no longer accepts objects).
func (b *batch) addObject(a oid.Address, s uint64) (bool, bool) {
@carpawell (Member) commented Oct 22, 2025:

s uint64 does not explain much, nor does the func comment

@roman-khimov (author) replied:

Doesn't matter much, it's gone. But address/size pair is very much natural for things we do here.

Member replied:

it's gone

was not clear to me when i reviewed one by one. mb, that should not be changed at all then?

@roman-khimov (author) replied:

It was an important step in untangling the code, so I left it as is.

b.addrs = append(b.addrs, a)
b.size += s
if b.size >= b.c.maxFlushBatchSize || len(b.addrs) > b.c.maxFlushBatchCount {
go b.flush()
Member commented:

simple uncontrollable go?

@roman-khimov (author) replied:

Yes. It's gone as well.

- Move `fee.main_chain` into `mainnet.extra_fee` in IR config (#3619)
- Move `contracts` into `mainnet.contracts` in IR config (#3619)
- Move `persistent_sessions` data to `persistent_state` in SN (#3630)
- More efficient write cache batching for small objects (#3414)
Contributor commented:

Suggested change
- More efficient write cache batching for small objects (#3414)
- More efficient write cache batching for small objects (#3647)

I'm a little confused. Are we inserting the pr where these changes were made?

@roman-khimov (author) replied:

I doubt it's consistent, we have a mix of PRs and issues.

@roman-khimov roman-khimov merged commit 3404e20 into master Oct 23, 2025
22 checks passed
@roman-khimov roman-khimov deleted the writecache-improvements branch October 23, 2025 06:00