Skip to content

Metabase fixes for tombstone handling#3741

Merged
roman-khimov merged 14 commits intomasterfrom
meta-ts-fixes
Dec 18, 2025
Merged

Metabase fixes for tombstone handling#3741
roman-khimov merged 14 commits intomasterfrom
meta-ts-fixes

Conversation

@roman-khimov
Copy link
Member

A part of #3476.

Delete() method is unused since b57a23d and
other methods don't require any wrapping.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
Nothing uses it now.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
@roman-khimov roman-khimov force-pushed the meta-ts-fixes branch 2 times, most recently from 087b459 to 67dc536 Compare December 17, 2025 19:59
@roman-khimov
Copy link
Member Author

This can break expirations/deletions, just about anything. Usually it's not
a problem because children are only deleted with parents, but if some forced
deletions are done this can be a problem. It's always safer to not delete
anyway.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
For the scenario to happen properly we need LINK object to be present
in metabase, but missing from real FSTree. Only this way we get a LINK
in SplitInfo, but then fail to fetch it. Tombstones or garbage marks
leave physical LINK in place and all of its metadata as well, while
dropping with force removes both physical object and its meta, so
the scenario becomes the same as in TestSplitObjectExpirationWithoutLink.

Fixes #3669.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
Signed-off-by: Roman Khimov <roman@nspcc.ru>
ErrParentObject should be a nice error and shouldn't panic like this:

    panic: runtime error: invalid memory address or nil pointer dereference [recovered, repanicked]
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xa1877e]

    goroutine 2534 [running]:
    testing.tRunner.func1.2({0xbf1ca0, 0x1453eb0})
            /usr/lib64/go/1.25/src/testing/testing.go:1872 +0x237
    testing.tRunner.func1()
            /usr/lib64/go/1.25/src/testing/testing.go:1875 +0x35b
    panic({0xbf1ca0?, 0x1453eb0?})
            /usr/lib64/go/1.25/src/runtime/panic.go:783 +0x132
    github.com/nspcc-dev/neofs-node/internal/errors.(*ParentObjectError).Error(0xebf260?)
            <autogenerated>:1 +0x1e
    github.com/stretchr/testify/assert.ErrorIs({0x7f05962a5620, 0xc0003f16c0}, {0xebf260, 0xc0000f03c0}, {0xebedc0, 0xc000155030}, {0x0, 0x0, 0x0})
            /home/rik/go/pkg/mod/github.com/stretchr/testify@v1.11.1/assert/assertions.go:2182 +0xf6
    github.com/stretchr/testify/require.ErrorIs({0xec3408, 0xc0003f16c0}, {0xebf260, 0xc0000f03c0}, {0xebedc0, 0xc000155030}, {0x0, 0x0, 0x0})
            /home/rik/go/pkg/mod/github.com/stretchr/testify@v1.11.1/require/require.go:372 +0xe5
    github.com/nspcc-dev/neofs-node/pkg/local_object_storage/metabase_test.assertECPartsError(0xc0003f16c0, {0xebf260, 0xc0000f03c0}, {0xc0001ad200, 0xa, 0x10})
            /home/rik/dev/neofs-node/pkg/local_object_storage/metabase/ec_test.go:717 +0x7a
    github.com/nspcc-dev/neofs-node/pkg/local_object_storage/metabase_test.testExistsEC(0xc0003f16c0)
            /home/rik/dev/neofs-node/pkg/local_object_storage/metabase/ec_test.go:486 +0x8d3
    testing.tRunner(0xc0003f16c0, 0xde6758)
            /usr/lib64/go/1.25/src/testing/testing.go:1934 +0xea
    created by testing.(*T).Run in goroutine 2491
            /usr/lib64/go/1.25/src/testing/testing.go:1997 +0x465

Signed-off-by: Roman Khimov <roman@nspcc.ru>
root is checked 20 lines above, it's always true here.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
In case LINK is missing we're using search to find objects that need to be
deleted, but this can work incorrectly for expired children since search
never returns outdated objects. I believe this fixes #3715, that's the
scenario (and log) that can be seen in UTs and it leads to the last object
part being stuck in the GC loop: parent object should be deleted, but
searching doesn't return an outdated object.

The new API is also somewhat more efficient.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
Parent locked -> child locked. Parent deleted -> child deleted. Parent
expired -> child expired. This makes sense overall and this is especially
important for future updates to tombstone handling, we need some way to
derive child status without graveyard marks for all children, just using
the meta index that does have this data anyway (since TS are broadcasted)

Signed-off-by: Roman Khimov <roman@nspcc.ru>
Pre-2.18 tombstones are not supported after 45150c9,
so this code is no longer relevant.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
It's never false in production code now (the last user was meta resync code)
and tests don't care much.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
All production code uses this call for forced removals. This leaves
Engine.inhume for tombstones only.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
It's FUBAR. We better hide it till #3740 makes something more meaningful
out of it.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
The way it's intended to work is to collect all related objects, like it
does for EC. But for regular splits in only catched last/first/link,
completely missing middle children.

Signed-off-by: Roman Khimov <roman@nspcc.ru>
@codecov
Copy link

codecov bot commented Dec 17, 2025

Codecov Report

❌ Patch coverage is 71.87500% with 63 lines in your changes missing coverage. Please review.
✅ Project coverage is 26.66%. Comparing base (2d2b436) to head (78dd4af).
⚠️ Report is 18 commits behind head on master.

Files with missing lines Patch % Lines
pkg/local_object_storage/metabase/select.go 0.00% 31 Missing ⚠️
pkg/local_object_storage/engine/inhume.go 58.33% 8 Missing and 2 partials ⚠️
pkg/local_object_storage/metabase/metadata.go 76.19% 8 Missing and 2 partials ⚠️
pkg/local_object_storage/shard/select.go 0.00% 6 Missing ⚠️
pkg/local_object_storage/engine/select.go 95.74% 1 Missing and 1 partial ⚠️
pkg/local_object_storage/metabase/exists.go 95.91% 1 Missing and 1 partial ⚠️
cmd/neofs-node/object.go 0.00% 1 Missing ⚠️
pkg/local_object_storage/metabase/inhume.go 85.71% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3741      +/-   ##
==========================================
+ Coverage   26.60%   26.66%   +0.06%     
==========================================
  Files         656      656              
  Lines       41552    41611      +59     
==========================================
+ Hits        11053    11095      +42     
- Misses      29482    29503      +21     
+ Partials     1017     1013       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.


_, err = e.Get(parentAddr)
require.ErrorAs(t, err, &splitErr)
require.Equal(t, linkObj.GetID(), splitErr.SplitInfo().GetLink()) // Pretends to be here, but not in fact.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tbh, do not get what this test tests, but it tests smth for sure

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow the log line it checks for, it's a very specific and abnormal case, but still it can happen.

fs.AddFirstSplitObjectFilter(objectSDK.MatchStringEqual, firstID)
tmpAddr.SetObject(firstID)
children = append(children, tmpAddr)
res, err := e.collectRawWithAttribute(addr.Container(), objectSDK.FilterFirstSplitObject, firstID[:])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling addr.Container() duplicated too many times for me

res = append(res, oid.NewAddress(addr.Container(), firstID))
return res
}
e.log.Warn("failed to collect objects with first ID", zap.Stringer("addrBeingInhumed", addr), zap.Error(err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

firstID can be added to log too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Object address should be sufficient to get other things from it.

if err == nil {
return res
}
e.log.Warn("failed to collect objects with split ID", zap.Stringer("addrBeingInhumed", addr), zap.Error(err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and splitID here


var haveUnhandledID = func() bool {
for i := range ids {
if idx[i] < len(ids[i]) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idx and ids are too similar var names for me, was confused a couple of times

// OK, we're not comparing containers, but it's all
// the same and current SDK doesn't make it easy to
// make any comparisons.
return bytes.Compare(a.Object().Marshal(), b.Object().Marshal())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even Marshal? dont mind in fact

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a lot of questions for SDK. It's absolutely ridiculous how hard it is to do some simple things.

if err != nil {
return nil, err
}
return res, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return res, nil? or even always return res, err without if

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

res can be non-nil

@roman-khimov roman-khimov merged commit c9ac523 into master Dec 18, 2025
22 checks passed
@roman-khimov roman-khimov deleted the meta-ts-fixes branch December 18, 2025 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants