feat: add tx validation caching#608
Conversation
Implement separate caches for mempool and pre_dispatch validation: - Soft cache (tx_hash only): Used for mempool revalidation to avoid redundant ZK proof verification during periodic housekeeping - Strict cache (state_hash + tx_hash): Used for pre_dispatch to ensure state-dependent validation is performed correctly Add versioned host API (v2) with `strict` parameter while maintaining backwards compatibility with legacy callers that default to strict=true. Update pallet to pass strict=false for mempool validation and strict=true for pre_dispatch validation.
…d() calls Change the strict cache to store VerifiedTransaction (via type erasure) instead of just Result<(), Error>. This allows validate_guaranteed_execution to reuse the cached VerifiedTransaction from validate_unsigned(strict=true), avoiding a redundant call to well_formed() which performs expensive ZK proof verification. Flow: 1. validate_unsigned(strict=true) calls well_formed() and caches VerifiedTransaction 2. validate_guaranteed_execution retrieves cached VerifiedTransaction 3. apply() dry-run uses the cached value directly The strict cache uses Arc<dyn Any + Send + Sync> for type erasure since Bridge<S, D> is generic over multiple signature and database types.
…ied_transaction - Add get_verified_transaction() as single source of truth for cache logic - Simplify do_validate_transaction and do_validate_guaranteed_execution - Add apply_verified_transaction to Ledger for using pre-verified transactions - Update apply_transaction to use cached VerifiedTransaction when available - Pass runtime_version to apply_transaction for cache key generation
…e methods Remove dead code introduced by centralized caching: - Remove Ledger::apply_transaction (replaced by apply_verified_transaction) - Remove Ledger::validate_transaction - Remove Transaction::validate - Update tests to use well_formed() directly
Add info/warn logs at key points in transaction processing: - When a transaction is validated for the mempool (or rejected) - When a transaction is rejected at pre-dispatch - When a transaction is applied to the ledger
Add observability for the two-tier transaction validation caches: - ledger_tx_validation_cache_hits_total (labeled by cache_type: strict/soft) - ledger_tx_validation_cache_misses_total - ledger_tx_validation_cache_size (labeled by cache_type: strict/soft) Extend TransactionValidationWasCached enum to distinguish between SoftCache and StrictCache hits, enabling accurate metric reporting.
…atch Remove the redundant validate_transaction call from pre_dispatch since validate_guaranteed_execution already runs well_formed() via get_verified_transaction. This also removes the strict flag that was added to support two-tier caching for pre_dispatch vs mempool, as it's no longer needed. Changes: - Simplify pre_dispatch to only call validate_guaranteed_execution - Remove strict parameter from validate_transaction and do_validate_transaction - Remove v2 versioned APIs from LedgerBridge and LedgerBridgeHf traits - Move strict cache metrics to validate_guaranteed_execution - Remove unused TransactionValidationWasCached enum
gilescope
left a comment
There was a problem hiding this comment.
For the non-strick tx pool we need to be caching negatives.
|
@ozgb |
Code Review Recommendations for PR #608Author: Rust/Substrate Code Review Agent Overall: ✅ Approve - Well-designed two-tier caching mechanism with good documentation and observability. High Priority (Clarification needed)
Medium Priority (Follow-up PRs)
Low Priority (Nice to have)
|
Test Suite Review Recommendations for PR #608Author: Test Suite Review Agent Overall: Test Coverage Summary
What's Tested
High Priority (Follow-up PRs)
Medium Priority (Follow-up PRs)
Conclusion: PR can proceed to merge. E2E tests provide implicit coverage of the validation path. Dedicated unit tests for caching behavior would strengthen confidence but are not blocking. |
Documentation Review Recommendations for PR #608Author: Documentation Review Agent Overall: ✅ Good - Thorough inline documentation with minor gaps in operator-facing docs. Documentation Coverage Summary
What's Documented Well
Medium Priority (Follow-up PRs)
Low Priority (Nice to have)
Conclusion: Documentation is acceptable for merge. Inline Rust docs are thorough. Prometheus metrics documentation gap is a general project concern rather than specific to this PR. |
gilescope
left a comment
There was a problem hiding this comment.
Let's see how it performs in QANet. It's been a while since the negative caching was an issue.
This PR still uses the existing tx pool - it just adds caching for the expensive |
* Unified toml files parameters
* Unified toml files parameters Signed-off-by: Mike Clay <mike.clay@shielded.io>
* Unified toml files parameters Signed-off-by: Mike Clay <mike.clay@shielded.io>








Overview
Adds two separate transaction validation caches - one for use in the mempool, one for pre-dispatch & tx application
The cache for the mempool is keyed by tx hash only - this means we get cache hits even if the ledger state progresses.
For pre-dispatch, we use a more-strict
(ledger_state, tx_hash)keyed cache. This means if a transaction was valid in the mempool, but the ledger state has progressed such that it's no longer valid, we catch that error here.If a tx is added to the mempool and included in the same block, the node will now only run the transaction validation once!
TODO for future PRs:
BridgeandLedgerstructs in themindight-node-ledgercrate is unclearTicket: https://shielded.atlassian.net/browse/PM-21592
🗹 TODO before merging
📌 Submission Checklist
🧪 Testing Evidence
Please describe any additional testing aside from CI:
🔱 Fork Strategy
Links