
fix(scriptengine): cache CheckSig within one EvalScript run (hot-fix on top of module/gobdk/v1.2.3)#38

Closed
oskarszoon wants to merge 2 commits into bitcoin-sv:release/1.2.3 from oskarszoon:hotfix/checker-cache-v1.2.3


@oskarszoon

Contributor

Summary

Adds bsv::CachingScriptChecker — a per-instance subclass of TransactionSignatureChecker that memoises CheckSig results on the (sig, pubkey, scriptCode-identity) tuple for the duration of one verify call.

core/scriptengine.cpp:473 now constructs bsv::CachingScriptChecker instead of the plain TransactionSignatureChecker for the BSV verify path.

This is a hot-fix to unblock validators that are stalled on a specific testnet transaction (see Background). It is branched off module/gobdk/v1.2.3 rather than master so that it can be released as v1.2.3.1 (or merged forward) without entangling with the in-flight v1.2.4 upgrade in bsv-blockchain/teranode#839.

Background

Post-Genesis BSV permits arbitrarily large locking scripts and removes the per-script-op limit. A pathological-but-valid pattern is a long chain of identical (OP_2DUP OP_CHECKSIGVERIFY) pairs against a single sig+pubkey pushed by the scriptSig. The stack stays at [sig, pubkey] after each pair, so the script performs N+1 identical signature verifications.

Observed on testnet: block 1,451,505, tx 7bc9a3408dd0c87b835c887a0bce22c20788fc3c4b953929d4367656d80acab5 whose input spends a 490,001-byte locking script with N=245,000. The plain TransactionSignatureChecker path runs:

  • 245,001 full ECDSA verifications via libsecp256k1, and
  • 245,001 full SignatureHash computations, each of which SHA256-streams the entire 490 KB scriptCode buffer.

That drives validator nodes into multi-hour CPU pegs that present to operators as a hang in the cgo _Cfunc_ScriptEngine_VerifyScript call. Teranode reproduced the same hang against gobdk v1.2.2, v1.2.3, and v1.2.4, and against the pure-Go gobt verifier — confirming it is a property of the verifier behaviour, not a library version regression.
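To make the scale concrete, here is a minimal sketch that rebuilds the script shape described in this PR, (OP_2DUP OP_CHECKSIGVERIFY) * N followed by one OP_CHECKSIG, and estimates how many scriptCode bytes the uncached path hashes. The helper names are hypothetical and not part of BDK. The byte count is a lower bound that ignores the rest of the sighash preimage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcode byte values from the standard Bitcoin script opcode table.
constexpr std::uint8_t OP_2DUP = 0x6e;
constexpr std::uint8_t OP_CHECKSIG = 0xac;
constexpr std::uint8_t OP_CHECKSIGVERIFY = 0xad;

// Hypothetical helper: builds (OP_2DUP OP_CHECKSIGVERIFY) * n + OP_CHECKSIG.
std::vector<std::uint8_t> BuildRepeatedCheckSigScript(std::size_t n) {
    std::vector<std::uint8_t> script;
    script.reserve(2 * n + 1);
    for (std::size_t i = 0; i < n; ++i) {
        script.push_back(OP_2DUP);            // duplicates [sig, pubkey]
        script.push_back(OP_CHECKSIGVERIFY);  // consumes the duplicated pair
    }
    script.push_back(OP_CHECKSIG);            // final check consumes the originals
    assert(script.size() == 2 * n + 1);
    return script;
}

// n OP_CHECKSIGVERIFY operations plus the trailing OP_CHECKSIG.
constexpr std::uint64_t SigChecks(std::uint64_t n) { return n + 1; }

// Lower bound on scriptCode bytes hashed when every check recomputes the
// sighash over the whole (2n + 1)-byte script.
constexpr std::uint64_t UncachedScriptCodeBytes(std::uint64_t n) {
    return SigChecks(n) * (2 * n + 1);
}
```

For N = 245,000 this gives a 490,001-byte script, 245,001 signature checks, and about 120 GB of scriptCode pushed through SHA256.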

With this cache the first CheckSig performs one full sighash plus one ECDSA verify; the remaining ~245,000 calls hit a per-instance hashmap and return in O(ns) each. Verify time collapses from hours to a few milliseconds.

Strategy

  • scriptCode identity is tracked via its data() pointer and size(). Within a single EvalScript run on a script that does not contain OP_CODESEPARATOR (effectively all modern BSV scripts) the scriptCode reference is stable, so pointer+length equality is sufficient and avoids the cost of hashing the scriptCode for the cache key.
  • sig and pubkey are reduced to 64-bit FNV-1a hashes. A collision can only affect a tuple whose first occurrence was already verified successfully against the same scriptCode and is therefore txn-scoped and benign.
  • On OP_CODESEPARATOR (scriptCode pointer/length changes) the cache is cleared and the new scriptCode is recorded — correctness preserved, cache thrown away.
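The strategy above can be sketched as follows. This is an illustration only, not the actual bsv::CachingScriptChecker: SlowChecker is a hypothetical stand-in for TransactionSignatureChecker, its CheckSig signature is simplified, and caching failed results as well as successful ones is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// 64-bit FNV-1a with the standard offset basis and prime.
inline std::uint64_t Fnv1a64(const Bytes& data) {
    std::uint64_t h = 14695981039346656037ULL;
    for (std::uint8_t b : data) {
        h ^= b;
        h *= 1099511628211ULL;
    }
    return h;
}

// Hypothetical stand-in for the expensive base checker; it only counts calls.
class SlowChecker {
public:
    virtual ~SlowChecker() = default;
    virtual bool CheckSig(const Bytes& sig, const Bytes& pubkey,
                          const Bytes& scriptCode) const {
        ++fullChecks;  // a real checker would compute the sighash and run ECDSA here
        return !sig.empty() && !pubkey.empty() && !scriptCode.empty();
    }
    mutable std::size_t fullChecks = 0;
};

class CachingCheckerSketch : public SlowChecker {
    struct Key {
        std::uint64_t sigHash;
        std::uint64_t pubkeyHash;
        bool operator==(const Key& o) const {
            return sigHash == o.sigHash && pubkeyHash == o.pubkeyHash;
        }
    };
    struct KeyHasher {
        std::size_t operator()(const Key& k) const {
            return static_cast<std::size_t>(
                k.sigHash ^ (k.pubkeyHash * 0x9e3779b97f4a7c15ULL));
        }
    };
    mutable std::unordered_map<Key, bool, KeyHasher> cache_;
    mutable const std::uint8_t* codePtr_ = nullptr;
    mutable std::size_t codeLen_ = 0;

public:
    bool CheckSig(const Bytes& sig, const Bytes& pubkey,
                  const Bytes& scriptCode) const override {
        // scriptCode identity is pointer + length; any change drops the cache.
        if (scriptCode.data() != codePtr_ || scriptCode.size() != codeLen_) {
            cache_.clear();
            codePtr_ = scriptCode.data();
            codeLen_ = scriptCode.size();
        }
        const Key key{Fnv1a64(sig), Fnv1a64(pubkey)};
        const auto it = cache_.find(key);
        if (it != cache_.end()) return it->second;  // hit: no sighash, no ECDSA
        const bool ok = SlowChecker::CheckSig(sig, pubkey, scriptCode);
        cache_.emplace(key, ok);
        return ok;
    }
};

// Runs n identical checks and reports how many reached the slow path.
std::size_t CountFullChecks(std::size_t n) {
    const Bytes sig{0x30, 0x01}, pubkey{0x02, 0x03}, code{0x6e, 0xad, 0xac};
    CachingCheckerSketch checker;
    for (std::size_t i = 0; i < n; ++i) {
        const bool ok = checker.CheckSig(sig, pubkey, code);
        assert(ok);
        (void)ok;
    }
    return checker.fullChecks;
}
```

With one sig, one pubkey and a stable scriptCode buffer, only the first call reaches the base checker, which is the behaviour the PR relies on for the 245,001-check script. Pointer identity is only safe because the cache lives no longer than one verify call, as the PR states.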

Why not wire CachingTransactionSignatureChecker from script/sigcache.cpp?

Investigated; rejected for three reasons:

  1. Wrong cache layer. SV Node's CachingTransactionSignatureChecker::VerifySignature is called by TransactionSignatureChecker::CheckSig after the full SignatureHash has already been computed (interpreter.cpp:2151). The 245,001 × 490 KB SHA256 still happens. Net saving for this tx would be ~12 s (the ECDSA work), still leaving an 85–170 s grind.
  2. Build-list blast radius. sigcache.cpp is excluded from cmake/modules/FindBSVSourceHelper.cmake's _minimal_src_files. Wiring it pulls in gArgs from util.cpp, which is in the application-only list. Adding util.cpp to the minimal set cascades into logging, fs, support/* and others.
  3. Global state. script/sigcache.cpp:33 defines signatureCache as a file-static singleton initialised by InitSignatureCache() that reads -maxsigcachesize via gArgs. Cannot be initialised from outside the translation unit without an upstream BSV source patch.

The per-instance CachingScriptChecker here has none of those costs — no global state, no extra translation units, no gArgs, no init function, and intercepts at the level (CheckSig) that actually saves the dominant work (sighash compute over 490 KB).
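The "wrong cache layer" point can be illustrated with a toy model. This is a sketch under the assumption that every call carries the same sig, pubkey and scriptCode; the enum and function names are hypothetical and do not appear in SV Node or BDK. It counts how often each expensive step runs depending on where the cache intercepts.

```cpp
#include <cstddef>

struct WorkCounters {
    std::size_t sighash = 0;  // full SignatureHash computations
    std::size_t ecdsa = 0;    // full ECDSA verifications
};

enum class CacheLayer { None, VerifySignature, CheckSig };

// Simulates `checks` identical CheckSig calls. Each uncached call first
// computes the sighash and then verifies the signature, in that order.
WorkCounters Simulate(CacheLayer layer, std::size_t checks) {
    WorkCounters work;
    bool seen = false;  // becomes true once the first call has fully run
    for (std::size_t i = 0; i < checks; ++i) {
        // A CheckSig-level cache answers before any work is done.
        if (layer == CacheLayer::CheckSig && seen) continue;
        ++work.sighash;
        // A VerifySignature-level cache answers only after the sighash exists.
        if (layer == CacheLayer::VerifySignature && seen) continue;
        ++work.ecdsa;
        seen = true;
    }
    return work;
}
```

In this model a VerifySignature-level cache leaves all 245,001 sighash computations in place and removes only the repeated ECDSA work, while a CheckSig-level cache removes both.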

Scope

  • Per-instance cache (one input, one verify call).
  • No thread-safety concerns (instance-local).
  • Base TransactionSignatureChecker behaviour is otherwise untouched. Fast path is purely additive.

Test plan

  • CI build produces fresh libGoBDK_{darwin_arm64,darwin_x86_64,linux_aarch64,linux_x86_64}.a archives
  • Repro test in Teranode against the resulting commit hash returns in <1 s where current gobdk v1.2.3 takes >2 h (test fixture: testnet tx 7bc9a340… with 490,001-byte parent locking script, available on request)
  • Existing BDK test suite green
  • No regression in normal-sized script verify perf (overhead per CheckSig: one pointer compare + one ~70-byte FNV-1a + one map lookup, all O(ns) on the hot path)

Future work

A more general sighash-result cache covering the case where scriptCode changes within an EvalScript run (i.e. OP_CODESEPARATOR is present) remains TODO. This patch is targeted at the immediate production stall.

Adds bsv::CachingScriptChecker — a per-instance subclass of
TransactionSignatureChecker that memoises CheckSig results on the
(sig, pubkey, scriptCode-identity) tuple for the duration of one verify
call. Strategy:

- scriptCode identity is tracked via its data pointer and length. Within
  a single EvalScript run on a script that does not contain
  OP_CODESEPARATOR (effectively all modern BSV scripts) the scriptCode
  reference is stable, so pointer+length equality is sufficient and
  avoids the cost of hashing the scriptCode for the cache key.
- sig and pubkey are reduced to 64-bit FNV-1a hashes. A collision can
  only affect a tuple whose first occurrence was already verified
  successfully against the same scriptCode and is therefore txn-scoped
  and benign.
- On OP_CODESEPARATOR (scriptCode changes) the cache is cleared and the
  new scriptCode pointer is recorded — correctness preserved, cache
  thrown away.

scriptengine.cpp:473 now constructs bsv::CachingScriptChecker instead of
the plain TransactionSignatureChecker for the BSV verify path.

Why
---

Post-Genesis BSV permits arbitrarily large locking scripts and removes
the per-script-op limit (MAX_OPS_PER_SCRIPT_AFTER_GENESIS = UINT32_MAX,
MAX_SCRIPT_SIZE_AFTER_GENESIS = UINT32_MAX). A pathological-but-valid
pattern is a long chain of identical (OP_2DUP OP_CHECKSIGVERIFY) pairs
against a single sig+pubkey pushed by the scriptSig. The stack stays at
[sig, pubkey] after each pair, so the script performs N+1 identical
signature verifications.

For an observed case on testnet (block 1,451,505, tx
7bc9a3408dd0c87b835c887a0bce22c20788fc3c4b953929d4367656d80acab5 whose
input spends a 490,001-byte locking script with N=245,000) the plain
TransactionSignatureChecker path runs:
  - 245,001 full ECDSA verifications via libsecp256k1, and
  - 245,001 full SignatureHash computations, each of which SHA256-streams
    the entire 490 KB scriptCode buffer.

That drives validator nodes into multi-hour CPU pegs that present to
operators as a hang in the cgo _Cfunc_ScriptEngine_VerifyScript call.

With this cache the first CheckSig performs one full sighash plus one
ECDSA verify; the remaining ~245,000 calls hit a per-instance hashmap
and return in O(ns) each. Verify time collapses from hours to a few
milliseconds.

Notes
-----

- Cache scope is per-CachingScriptChecker instance, which is one input
  in one verify call. No global mutable state, no init function, no
  thread safety concerns.
- The base TransactionSignatureChecker behaviour is otherwise untouched;
  the fast path is purely additive.
- This is a hot-fix on top of module/gobdk/v1.2.3 to unblock validators
  that are stalled on the above tx. A more general sighash-result cache
  (covering the case where scriptCode changes within an EvalScript run)
  remains future work.
oskarszoon changed the base branch from master to release/1.2.3 on May 11, 2026 10:47
@oskarszoon

Contributor Author

Reopening with head branch on upstream (workflow's actions/checkout cannot resolve head_ref for cross-fork PRs — see also #39 which failed for the same reason). Pushed hotfix/checker-cache-v1.2.3 to upstream and will open a new PR from there.

oskarszoon closed this on May 11, 2026
oskarszoon added a commit that referenced this pull request May 11, 2026
Master-adapted port of the v1.2.3 hot-fix on the release/1.2.3 branch
(see #38 for the release/1.2.3 PR with full background and rationale).

In master the verify path was renamed from CScriptEngine in
core/scriptengine.cpp to CTxValidator in core/txvalidator.cpp as part of
the v1.2.4 reorganisation, so the swap site moves from
scriptengine.cpp:473 to txvalidator.cpp:505. The bsv::CachingScriptChecker
header (core/checker_cache.hpp) is identical between the two PRs.

The cache addresses a multi-hour validator hang observed on testnet block
1,451,505 / tx 7bc9a3408dd0c87b835c887a0bce22c20788fc3c4b953929d4367656d80acab5
whose input spends a 490,001-byte locking script of
(OP_2DUP OP_CHECKSIGVERIFY) * 245,000 + OP_CHECKSIG. The script is
consensus-valid (post-Genesis removes per-script-op and per-script-size
limits) and forces N+1=245,001 identical signature verifications. Without
the cache each iteration runs a full ECDSA verify plus a full
SignatureHash that SHA256-streams the 490 KB scriptCode buffer; the cache
collapses iterations 2..245,001 into per-instance hashmap lookups
(O(ns) each), turning hours of work into milliseconds.

See #38 for the design discussion,
the rejected alternatives (CachingTransactionSignatureChecker from
script/sigcache.cpp is wrong cache layer + pulls in gArgs dependencies),
and the test plan.
ctnguyen pushed a commit that referenced this pull request May 12, 2026
ctnguyen pushed a commit that referenced this pull request May 13, 2026