Skip to content

Conversation

@l0rinc
Copy link
Contributor

@l0rinc l0rinc commented Dec 20, 2025

Problem

The current coins database read-error handling adds unnecessary complexity to the critical path:

  • CCoinsViewErrorCatcher sits between the cache and database views, wrapping every read operation with a generic error handler
  • ExecuteBackedWrapper adds another function call layer with lambda captures on each database read
  • Uses std::vector<std::function<void()>> for callbacks when there's only ever one callback registered
  • The error callback is set after database initialization rather than during construction (requiring a cs_main lock)

This adds unnecessary stack depth and bookkeeping overhead on every coins database read operation (GetCoin). Also note that CCoinsViewDB::HaveCoin isn't even called from production code as far as I can tell.

Context

This is part of ongoing coins caching cleanup efforts like #34124, #33866, #33018, and #33512.

Fix

This PR collapses the CCoinsViewErrorCatcher layer entirely and moves error handling directly into CCoinsViewDB:

  • Remove ExecuteBackedWrapper and CCoinsViewErrorCatcher, placing try-catch blocks directly in CCoinsViewDB::GetCoin and CCoinsViewDB::HaveCoin
  • Replace std::vector<std::function<void()>> with a single std::function<void()> member, initialized in the constructor
  • The read error callback is now passed to Chainstate::InitCoinsDB and forwarded to CCoinsViewDB constructor

A new test is also added to reproduce the behavior in case of database corruption. It's added as a first commit to demonstrate that the behavior stays unchanged.

@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 20, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34132.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK sedited
Stale ACK andrewtoth

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34276 (Remove empty caption from user interface (noui, gui) by maflcko)
  • #34207 (coins/refactor: enforce GetCoin() returns only unspent coins by l0rinc)
  • #34164 (validation: add reusable coins view for ConnectBlock by andrewtoth)
  • #34124 (refactor: make CCoinsView a purely virtual abstract base class by l0rinc)
  • #31382 (kernel: Flush in ChainstateManager destructor by sedited)
  • #31132 (validation: fetch block inputs on parallel threads 3x faster IBD by andrewtoth)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@l0rinc
Copy link
Contributor Author

l0rinc commented Dec 23, 2025

close/open dance to retrigger CI - likely caused by GitHub outage

@l0rinc l0rinc closed this Dec 23, 2025
@l0rinc l0rinc reopened this Dec 23, 2025
@l0rinc l0rinc marked this pull request as ready for review December 23, 2025 09:27
Copy link
Contributor

@andrewtoth andrewtoth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

I think getting rid of this indirection is great. It reduces a lot of complexity when thinking about the stack of views.

When reviewing, I think the last 3 commits should be combined into 1. There doesn't seem to be much value in creating and then removing functions in multiple commits.

I would utACK but I wanted clarity on why we have to skip a test when no IPC.

@sedited
Copy link
Contributor

sedited commented Jan 5, 2026

Concept ACK

@l0rinc
Copy link
Contributor Author

l0rinc commented Jan 11, 2026

Rebased and updated the added test to work on 32 bit systems as well. Ready for review again!

@l0rinc l0rinc force-pushed the l0rinc/move-errorcatcher branch from dceb671 to 4ae1f79 Compare January 11, 2026 17:07
Copy link
Contributor

@andrewtoth andrewtoth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 4ae1f79

@DrahtBot DrahtBot requested a review from sedited January 11, 2026 19:40
@l0rinc l0rinc force-pushed the l0rinc/move-errorcatcher branch 2 times, most recently from d306374 to 36a01d0 Compare January 12, 2026 17:19
l0rinc and others added 5 commits January 12, 2026 19:47
The error was also reworded to avoid the leading repetition (see expected stderr).

### Reproducers
The test failure can be triggered manually by:
```patch
std::optional<Coin> CCoinsViewErrorCatcher::GetCoin(const COutPoint& outpoint) const
{
-    return ExecuteBackedWrapper<std::optional<Coin>>([&]() { return CCoinsViewBacked::GetCoin(outpoint); }, m_err_callbacks);
+    return CCoinsViewBacked::GetCoin(outpoint);
}
```

### Notes
Note that CCoinsViewErrorCatcher::HaveCoin failure isn't tested since it doesn't seem to be used in production code, all tests pass with:
```patch
bool CCoinsViewErrorCatcher::HaveCoin(const COutPoint& outpoint) const
{
-    return ExecuteBackedWrapper<bool>([&]() { return CCoinsViewBacked::HaveCoin(outpoint); }, m_err_callbacks);
+    throw "CCoinsViewBacked::HaveCoin";
}
```

### Context:

On 64-bit systems, LevelDB uses mmap which reflects file changes immediately.
On 32-bit systems (i686, armhf), mmap is disabled (kDefaultMmapLimit=0 in leveldb/util/env_posix.cc), so LevelDB uses block cache which serves stale data.
The node must be restarted after corruption to clear the cache.

Corruption is placed at the middle to avoid LevelDB's paranoid_checks verification during database open, and -checkblocks=0 -checklevel=0 skips Bitcoin's block verification.

Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
Inlined `ExecuteBackedWrapper`, removing a lambda indirection and specializing the error logs per usage.

The error handling will be pushed down to the call sites in follow ups.
There's only ever a single callback at most, let's assign it instead of pretending we have a list of callbacks. This will be moved to the constructor in the next commit.
@l0rinc l0rinc force-pushed the l0rinc/move-errorcatcher branch from 36a01d0 to 0d0743d Compare January 12, 2026 18:48
Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left two more comments.

int64_t check_blocks{DEFAULT_CHECKBLOCKS};
int64_t check_level{DEFAULT_CHECKLEVEL};
std::function<void()> coins_error_cb;
std::function<void()> read_error_cb{[]{}};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e77ef5c: I don't think this is a rename. You change nullptr to an object?! When changing behavior, you should mention why the behavior is changed, instead of omitting the change.

Comment on lines -375 to -378
// Starting the shutdown sequence and returning false to the caller would be
// interpreted as 'entry not found' (as opposed to unable to read data), and
// could lead to invalid interpretation. Just exit immediately, as we can't
// continue anyway, and all writes should be atomic.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

029224b: I don't think this is an "inline". You are removing this comment without even mentioning it. When removing code, you should mention why it is not needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the ExecuteBackedWrapper can be kept. It seems applicable if it is moved to txdb.cpp as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants