[RFC] CChain Concurrency Improvement (Base + Tail Architecture) by alexanderwiederin · Pull Request #34424 · bitcoin/bitcoin

alexanderwiederin · 2026-01-27T20:44:43Z

RFC: CChain Concurrency Improvement

This PR implements a copy-on-write based CChain design to enable concurrent access.

Motivation

Current CChain uses a single vector protected by cs_main, creating bottlenecks for multi-threaded access. This PR lays the groundwork for removing cs_main locks by enabling consistent snapshots without holding locks.

Design

See full design document: CChain Concurrency (Out of date after mutex removal - will be updated soon)

TL;DR: Split chain into base (bulk of history) + tail (recent ~1,000 blocks). Use nested stlab::copy_on_write so updates copy only tail, keeping base shared.

Thanks to @purpleKarrot for pointing me to stlab::copy_on_write and making design suggestions.

Changes

Implements copy-on-write based CChain with value semantics
Removes cs_main locks from RPC methods (where possible)
All validation locks remain in place (cs_main still held during validation)
API unchanged - backward compatible

Request for Comments

Is the nested copy-on-write architecture sound?
Is MAX_TAIL_SIZE = 1000 reasonable?
What are the concerns for future lock removal?
Is the added complexity worth the benefit?

DrahtBot · 2026-01-27T20:44:53Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34424.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	ismaelsadeeq

If your review is incorrectly listed, please copy-paste  into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#34440 (Refactor CChain methods to use references, tests by optout21)
#32427 ((RFC) kernel: Replace leveldb-based BlockTreeDB with flat-file based store by sedited)
#29700 (kernel, refactor: return error status on all fatal errors by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

required -> requires [Subject-verb agreement: "Concurrent access ... required external synchronization." should read "Concurrent access ... requires external synchronization." to be grammatically correct.]

^{2026-01-30 18:32:27}

ismaelsadeeq

Concept ACK on this.

Looks simpler than I thought.

A few comments after a quick skim through the PR

src/validation.h

ismaelsadeeq · 2026-01-28T13:52:31Z

src/util/copy_on_write.h

@@ -0,0 +1,260 @@
+// Copyright (c) 2013 Adobe
+// Copyright (c) 2025-present The Bitcoin Core developers


In "util: add copy-on-write smart pointer" 490324d

Is this a direct copy? Or has modifications, perhaps add a link in the commit message to the reference implementation.

I also think we should have a basic unit test for this utility.

It's practically a direct copy. I removed the comments, documentation and deprecated fields and methods to keep it minimal. I also left a link to https://github.com/stlab/stlab-copy-on-write on the file header and just added Original source: https://github.com/stlab/stlab-copy-on-write/tree/abb4445 to the commit message.

Agree on the unit tests for this!

src/chain.cpp

ismaelsadeeq · 2026-01-28T14:00:49Z

src/validation.h

    //! @{
    Chainstate& ActiveChainstate() const;
    CChain& ActiveChain() const EXCLUSIVE_LOCKS_REQUIRED(GetMutex()) { return ActiveChainstate().m_chain; }
+    CChain ActiveChainSnapshot() const {


In "validation: add ActiveChainSnapshot() method" 6b4db96

It reduces lock contention, indeed, but one downside is that it can be inconvenient for someone who just wants to know, e.g., the chain tip to get height or hash in a concurrent way. They have to copy the whole chain.

We should also support copying the chain tip, and returning the latest block index in the chain, not copying the whole chain.

Of course, they can use the old ActiveChain, lock and make a copy, but in my opinion, we should strive to make the locking of cs_main internal to ChainstateManager and simply return copies of these values. Future code should only handle locks directly in performance-critical paths where making a copy is too expensive, or in synchronous contexts where the caller already holds the lock.

For this RFC we chose to do a limited feature set. I agree that the end goal is something like you are describing.

@ismaelsadeeq let me think about the best approach here. Thanks for the feedback.

@ismaelsadeeq In the last push I refactored ActiveTip and ActiveHeight to use the snapshots and deduce the values from there. To reiterate, taking a copy of the chain is cheap due to copy-on-write - the underlying data is reference-counted. ActiveTip and ActiveHeight no longer require holding cs_main.

I've also removed the write_mutex on the CChain class in my last push to lean more into value semantics. The idea is that we treat the chain similar to how we would treat a std::vector. It's not internally thread-safe, and we expect the owner to synchronise.

I've also added documentation to CChain describing the performance characteristics and copy-on-write behaviour.

darosior · 2026-01-28T15:35:51Z

Design

See full design document: CChain Concurrency

In the design section you describe the initial state with a single block in the tail, but a merge operation empties it. It seems nicer for those to be consistent?

Is MAX_TAIL_SIZE = 1000 reasonable?

I guess you could go an order of magnitude lower and match the coinbase maturity, if we need to rewrite the chain this deep we have bigger problems than a large amount of copies?

Introduce copy-on-write wrapper from stlab library to enable efficient value semantics with shared ownership. Uses atomic reference counting to share instances and only copies on modification when non-unique. Based on Adobe's stlab library, distributed under Boost Software License 1.0 Original source: https://github.com/stlab/stlab-copy-on-write/tree/abb4445 Co-authored-by: janb84 <608446+janb84@users.noreply.github.com>

alexanderwiederin · 2026-01-29T11:18:45Z

Appreciate the feedback @darosior!

Is MAX_TAIL_SIZE = 1000 reasonable?

I guess you could go an order of magnitude lower and match the coinbase maturity, if we need to rewrite the chain this deep we have bigger problems than a large amount of copies?

I think the trade-off for MAX_TAIL_SIZE is primarily about performance during normal sequential operation rather than rewrites.

Assuming the tail is copied on every AppendToTail and the base is copied on every MergeTailIntoBase (the conservative case with copy-on-write), the question becomes: should we optimise for cheap AppendToTail operations (smaller tail) or infrequent MergeTailIntoBase operations (larger tail)?

The key insight is that MergeTailIntoBase copies the entire base, so its cost scales with chain length. Therefore:

Longer chains benefit from larger MAX_TAIL_SIZE (fewer expensive base copies)
Shorter chains benefit from smaller MAX_TAIL_SIZE (cheaper tail copies)

So far, the change has only resulted in a negligible IBD performance reduction, but I think we should pay attention to this parameter.

sipa · 2026-01-30T16:11:31Z

Is there actually a need for stable CChain objects that live for non-trivial amounts of time?

If not, it seems a simpler alternative might be to just give CChain its own internal non-exposed mutex - allowing access to it without holding cs_main?

Refactor CChain into a regular type with copy-on-write semantics, using a split base/tail architecture to enable snapshots. Key changes: - Make CChain a regular type supporting copy construction/assignment - Introduce COW Impl struct holding base (COW vector) and tail (mutable vector) - Optimize sequential appends by accumulating in tail until MAX_TAIL_SIZE - Merge tail into base only when size threshold is reached - Handle reorgs by rebuilding base and clearing tail - Update all methods (FindFork, FindEarliestAtLeast, etx.) for split storage This allows creating chain snapshots via simple copy while sharing the bulk of the data. Copies only occur on modification. Co-authored-by: janb84 <608446+janb84@users.noreply.github.com>

Add ActiveChainSnapshot() to ChainstateManager to enable lock-free access to the active chain. Unlike ActiveChain() which requires holding cs_main, this methods acquires the lock internally and returns a copy-on-write snapshot of the chain. Refactor ActiveTip() and ActiveHeight() to use ActiveChainSnapshot() instead of ActiveChain(), removing the EXCLUSIVE_LOCKS_REQUIRED annotation. This allows these methods to be called without holding cs_main, as they now manage locking internally. This reduces lock contention by allowing RPC methods and other consumers to work with a consistent view of the chain without holding cs_main for extended periods. Co-authored-by: janb84 <608446+janb84@users.noreply.github.com>

Replace cs_main-locked chain access with ActiveChainSnapshot() in getnetworkhashps to reduce lock contention.

Replace cs_main-locked chain access with ActiveChainSnapshot(), ActiveTip() and ActiveHeight() in blockchain RPC methods to reduce lock contention. These methods acquire cs_main internally and provide a consistent view of the chain without requiring the caller to hold cs_main throughout. For RPCs that also need BlockIndex lookups, cs_main is still acquired but only for the BlockIndex access, not for chain queries. Updated RPCs: - getblock, getblockheader: ActiveTip() for tip, cs_main for block lookup - getchaintips: ActiveChainSnapshot() for active chain, cs_main for BlockIndex iteration - getblockcount, getbestblockhash: ActiveHeight() only no cs_main needed - geblockhash, getdifficulty: ActiveChainSnapshot() via ParseHashOrHeight - pruneblockchain: ActiveHeight() for height calculation, cs_main for pruning - getchaintxstats: ActiveChainSnapshot() for tip and Contains check

alexanderwiederin · 2026-02-02T13:00:36Z

Is there actually a need for stable CChain objects that live for non-trivial amounts of time?

Based on anecdotal experience, I believe yes. From what I understand this could be interesting for: a) RPC users in the context of mining and b) kernel API consumers for use cases like indexing, analytics and monitoring.

But I acknowledge this needs a more concrete answer.

In the meantime, @sedited and/or @Sjors, do you want to chime in?

sipa · 2026-02-02T13:21:57Z

I believe that due to the fact that CBlockIndex::GetAncestor() is fast and usable without cs_main (or any locking), most if not all uses of CChain that need a stable view can be rewritten to just query the highest-height block they need, and then use CBlockIndex::GetAncestor() from there to get to earlier blocks.

CBlockIndex::GetAncestor() is less efficient than CChain::operator[], but I don't think the difference is sufficient to really matter for any uses.

I haven't tried this, or dug into what callers would need to change for this, so feel free to find counterexamples. But absent that, my thinking at a high level is that the CBlockIndex pprev/pskip structure effectively already allows lock-free access, and CChain is mostly an optimization that can be used when one does have access to cs_main (or whatever lock protects the chain structures). If some of those call sites need reduction in their locking, the first thing to try would be to switch them over to CBlockIndex::GetAncestor().

sedited · 2026-02-02T13:38:39Z

Re #34424 (comment)

Is there actually a need for stable CChain objects that live for non-trivial amounts of time?

There are a number of rpc calls where a stable CChain object is held for some amount of time, .e.g. getchaintips might do multiple deep lookups and traversals down to various fork points in the chain. Granted, none of these currently benefit from this change, because they currently also require a lock to access the block index, though that might be easier to workaround after this.

I see this change as a first step towards moving some other data structures into a similar pattern and cleaning up our locking annotations for the chain and block index data structures (for example making the pprev pointer const).

What might be less compelling for developers in this codebase is that the approach here arguably provides an API to the chain that is fairly easy to reason about for external, or rather kernel library, users.

sipa · 2026-02-02T13:43:25Z

I certainly have no objections to improving locking annotations, and making locking cleaner. But in this case, I'm really not convinced: if you need lock-free access, just don't use CChain. It's a precomputed index-by-height for O(1) access rather than the O(log^2 n) access that CBlockIndex::GetAncestor() provides, but at the cost of needing synchronization. If that synchronization is burdensome, just don't use it.

Specifically for getchaintips, you can get the current active tip once at the beginning, and then instead of active_chain.Contains(block), use active_tip.GetAncestor(block->nHeight) == block.

sedited · 2026-02-02T14:25:11Z

re #34424 (comment)

It's a precomputed index-by-height for O(1) access rather than the O(log^2 n) access that CBlockIndex::GetAncestor() provides, but at the cost of needing synchronization

Mmh, going by its prevalence in the current code base it seems like developers would rather have random O(1) access than forego synchronization :P

To provide a little more background, this topic was repeatedly brought up during review of the kernel api, where it was repeatedly requested to provide optimized, indexed, random access to entries in the block map without the need for synchronization. If there is no greater appetite for that here, then I guess that should be abandoned. If developers really require repeated, fast, random height queries, they can also build such an index themselves.

I do agree though that in most cases in our code a single query to the tip (that could be protected by its own lock as per your suggestion), and then a lookup to GetAncestor is probably fast enough. I'm curious how you would weigh between needing synchronization and doing a quick lookup?

sipa · 2026-02-02T14:41:55Z

Mmh, going by its prevalence in the current code base it seems like developers would rather have random O(1) access than forego synchronization :P

Maybe, or that place in the code held a lock already, so there was no reason not to use CChain.

Maybe we should investigate what would be required to drop CChain entirely. Maybe it's just fine. Maybe it needs a few tweaks to some algorithms (like sorting certain queries by height to minimize the distances fed to GetAncestor). But if that were possible, without unacceptable performance degradations, then I think it'd be useful evidence that there is no need to expose an unsynchronized CChain - or even to actually remove it internally.

sipa · 2026-02-02T15:27:35Z

There do seem to be a lot of places where CChain::Next() is called, and I'm not sure that for all of those the overhead of a tip.GetAncestor(block->nHeight + 1) would be acceptable.

Still, maybe it's possible to add an internal mutex to CChain, which is only held during its own function calls. This would allow exposing "get at given height in active chain", "get successor in main chain", "find fork point with active chain", "check if in main chain" functions, which don't need external synchronization (so no cs_main needed), but also lack stability. Anything that needs stability can use the GetAncestor alternative. It wouldn't surprise me that this is sufficient for our own use cases.

alexanderwiederin · 2026-02-03T21:57:23Z

Your suggestions make sense - since CChain usage is mostly incidental to other reasons for holding cs_main, optimising CChain operations is misplaced. Simpler alternatives like GetAncestor or an internal mutex should suffice for chain access.

I'll reconsider the approach with this in mind - thanks for taking the time!

alexanderwiederin changed the title ~~[RFC] Chain Concurrency Improvement~~ [RFC] Chain Concurrency Improvement (Base + Tail Architecture) Jan 27, 2026

alexanderwiederin changed the title ~~[RFC] Chain Concurrency Improvement (Base + Tail Architecture)~~ [RFC] CChain Concurrency Improvement (Base + Tail Architecture) Jan 27, 2026

alexanderwiederin force-pushed the chain-concurrency-enhancement branch from 94e79ea to a26ce37 Compare January 28, 2026 02:00

ismaelsadeeq reviewed Jan 28, 2026

View reviewed changes

alexanderwiederin force-pushed the chain-concurrency-enhancement branch from a26ce37 to bbbed41 Compare January 28, 2026 16:25

DrahtBot mentioned this pull request Jan 29, 2026

Refactor CChain methods to use references, tests #34440

Draft

alexanderwiederin force-pushed the chain-concurrency-enhancement branch from bbbed41 to 84beaa9 Compare January 30, 2026 16:08

alexanderwiederin and others added 4 commits January 30, 2026 19:31

rpc: use chain snapshot in mining RPC

94c2139

Replace cs_main-locked chain access with ActiveChainSnapshot() in getnetworkhashps to reduce lock contention.

alexanderwiederin force-pushed the chain-concurrency-enhancement branch from 84beaa9 to 711b3f4 Compare January 30, 2026 18:31

DrahtBot mentioned this pull request Jan 31, 2026

(RFC) kernel: Replace leveldb-based BlockTreeDB with flat-file based store #32427

Open

hodlinator mentioned this pull request Feb 2, 2026

Add test to illustrate thread safety issue if used wrongly stlab/stlab-copy-on-write#9

Open

alexanderwiederin closed this Feb 5, 2026

DrahtBot mentioned this pull request Feb 6, 2026

kernel, refactor: return error status on all fatal errors #29700

Draft

alexanderwiederin mentioned this pull request Feb 23, 2026

[RFC] BlockMap and CChain Concurrency Improvement #34654

Draft

		@@ -0,0 +1,260 @@
		// Copyright (c) 2013 Adobe
		// Copyright (c) 2025-present The Bitcoin Core developers

Conversation

alexanderwiederin commented Jan 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!