Skip to content

feat: Data reliability removal#2134

Merged
nimrod-teich merged 42 commits into
mainfrom
data_reliability_removal
Jan 8, 2026
Merged

feat: Data reliability removal#2134
nimrod-teich merged 42 commits into
mainfrom
data_reliability_removal

Conversation

@AnnaR-prog

Copy link
Copy Markdown
Contributor

Description

Closes: #XXXX


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • read the contribution guide
  • included the correct type prefix in the PR title, you can find examples of the prefixes below:
  • confirmed ! in the type prefix if API or client breaking change
  • targeted the main branch
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

@github-actions

github-actions Bot commented Dec 7, 2025

Copy link
Copy Markdown

Test Results

    7 files  ± 0    128 suites   - 3   33m 31s ⏱️ -55s
3 185 tests  - 34  3 184 ✅  - 34  1 💤 ±0  0 ❌ ±0 
3 265 runs   - 34  3 264 ✅  - 34  1 💤 ±0  0 ❌ ±0 

Results for commit cc2ec0e. ± Comparison against base commit 087295c.

This pull request removes 80 and adds 46 tests. Note that renamed tests count towards both.
github.com/lavanet/lava/v5/protocol/integration ‑ TestSameProviderConflictBasicResponseCheck
github.com/lavanet/lava/v5/protocol/integration ‑ TestSameProviderConflictBasicResponseCheck/multiple_providers_-_everyone_is_returning_fake_block_hashes_-_return_conflict_error
github.com/lavanet/lava/v5/protocol/integration ‑ TestSameProviderConflictBasicResponseCheck/multiple_providers_-_only_one_with_fake_block_hashes_-_return_ok
github.com/lavanet/lava/v5/protocol/integration ‑ TestSameProviderConflictBasicResponseCheck/one_provider_-_fake_block_hashes_-_return_conflict_error
github.com/lavanet/lava/v5/protocol/integration ‑ TestSameProviderConflictReport
github.com/lavanet/lava/v5/protocol/integration ‑ TestSameProviderConflictReport/same_provider_conflict_report
github.com/lavanet/lava/v5/protocol/integration ‑ TestSameProviderConflictReport/two_providers_conflict_report
github.com/lavanet/lava/v5/protocol/lavaprotocol/finalizationconsensus ‑ TestConsensusHashesInsertion
github.com/lavanet/lava/v5/protocol/lavaprotocol/finalizationconsensus ‑ TestConsensusHashesInsertion/APT1:happy-flow
github.com/lavanet/lava/v5/protocol/lavaprotocol/finalizationconsensus ‑ TestConsensusHashesInsertion/APT1:happy-flow-with-gap
…
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_AllProvidersWithZeroOrNegativeBlocks
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_ClockSkew_FutureTime
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_ClockSkew_OldObservation
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_ConcurrentAccess
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_ExpectedBlockHeightNeverNegative
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_IgnoreInvalidInputs
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_InterpolateBlocks
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_InterpolateBlocks/fractional_block
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_InterpolateBlocks/future_time_(negative)
github.com/lavanet/lava/v5/protocol/relaycore ‑ TestLatestBlockEstimator_InterpolateBlocks/no_time_passed
…

♻️ This comment has been updated with latest results.

AnnaR-prog and others added 24 commits January 5, 2026 17:44
Disable Data Reliability feature without breaking changes:

This is Phase 1 (Conservative approach):
- All DR functions kept (not deleted, just not called)
- All packages kept (finalizationconsensus, reliabilitymanager, x/conflict)
- All proto fields kept (FinalizedBlocksHashes, SigBlocks)
- Fully reversible - can uncomment code blocks if needed
- No proto/gRPC breaking changes
Phase 2 DR removal - eliminate DataReliabilityParams() interface method.
Remove from ChainParser interface and all implementations.
Update SignRelayResponse() to remove dataReliabilityEnabled parameter.
Replace all calls with hardcoded behavior (always disabled).
Update tests and TypeScript SDK.
Proto fields remain for backward compatibility.
Phase 2 DR removal - delete finalizationconsensus, finalizationverification, reliabilitymanager packages.

Deleted directories:
- protocol/lavaprotocol/finalizationconsensus/
- protocol/lavaprotocol/finalizationverification/
- protocol/rpcprovider/reliabilitymanager/

Removed all references:
- Deleted vote_updater.go (only used for reliability manager)
- Removed VoteEvents(), getLatestVoteEvents() methods from updaters
- Removed SendVoteReveal(), SendVoteCommitment() from state trackers
- Removed RegisterReliabilityManagerForVoteUpdates() interface method
- Removed reliabilityManager parameter from ServeRPCRequests()
- Removed all imports and mock methods

x/conflict module remains intact (dormant) - these were client libraries.
Build status: ✅ Compiles successfully
Fixed nil pointer panic at rpcprovider_server.go:186 that occurred after
removing data reliability components (finalizationconsensus,
finalizationverification, reliabilitymanager).

Changes:
- Replaced all reliabilityManager references with chainTracker in provider code
- reliabilityManager was just a wrapper around chainTracker methods
  (GetLatestBlockData, GetLatestBlockNum)
- Updated RPCProviderServer to use chainTracker directly
- Fixed ServeRPCRequests to accept chainTracker parameter
- Updated all tests (unit and integration) to use MockChainTracker
- Removed reliabilitymanager imports

Part of Data Reliability removal - Phase 2
Completed Phase 2 cleanup by removing final unused DR functions and fields:

1. RelayProcessor (relaycore/relay_processor.go):
   - Removed skipDataReliability field
   - Removed SetSkipDataReliability() and GetSkipDataReliability() methods

2. Request Builder (lavaprotocol/request_builder.go):
   - Removed VerifyReliabilityResults() function
   - Removed compareRelaysFindConflict() function
   - Removed findFirstDifferentChar() helper function
   - Removed unused imports: bytes, common, conflicttypes, conflictconstruct

3. State Tracker (statetracker/tx_sender.go):
   - Removed TxSenderConflictDetection() function
   - Removed conflicttypes import

Part of Data Reliability removal - Phase 2
Removed all 'Data Reliability disabled - Phase 2' comments from the codebase.

Kept only one clean comment in RPCProviderServer struct to document that
chainTracker replaced reliabilityManager (which was just a wrapper).

Also removed:
- Last two SetSkipDataReliability() calls that were missed
- Cleaned up empty lines left by comment removal
All DR functionality has been completely removed from Phase 2.
- Remove all 'Data Reliability disabled - Phase 2' comments (29 occurrences)
- Fix mockReliabilityManager bug: rename to mockChainTracker
- Rename DataReliabilityTimeoutIncrease to CacheWriteTimeout
- Remove Data Reliability session infrastructure:
  - Delete DataReliabilitySessionId, DataReliabilityRelayNumber, DataReliabilityCuSum constants
  - Delete getDataReliabilitySingleSession() (dead code - never called)
  - Delete PrepareDataReliabilitySessionForUsage() and onDataReliabilitySessionFailure()
  - Remove isDataReliability field from ProviderSessionsWithConsumerProject
  - Remove Session ID 0 reservation and skip logic
  - Delete DataReliabilitySession struct
  - Remove 6 DR-related error types
- Fix MockChainTracker to properly implement IChainTracker interface
- Update TypeScript SDK to remove DR comments

Impact: Session ID 0 now available for use, simplified session management
…updating calls in consumer_ws_subscription_manager_test.go
Remove unused named return parameters (success, err) that were causing
ineffassign linter error. The function now uses unnamed return types.
1. Fix TestRelayInnerProviderUniqueIdFlow failure:
   - Always reset ProviderTrailer before each relay to ensure fresh trailer data
   - Prevents stale trailer values from persisting across multiple relay calls
   - This ensures provider unique ID verification works correctly

2. Remove obsolete TestSameProviderConflictBasicResponseCheck:
   - Test was checking for conflict detection which was removed in DR removal
   - Related test TestSameProviderConflictReport was already removed
   - This test no longer serves a purpose without Data Reliability feature
…dRelays

Same issue as in RPCConsumerServer - the sendCraftedRelays function had
unused named return parameters (success, err) causing ineffassign linter
error. Changed to unnamed return types and ignored the unused err from
craftRelay since it's already handled by the ok boolean.
…mer pattern

The test was failing in CI because the mock only checked for exactly 1 option,
but the actual gRPC call may pass multiple options. Updated to:
1. Loop through all opts to find TrailerCallOption (not just len(opts)==1)
2. Reset trailer to nil before each call in the test helper

This matches the pattern used in rpcconsumer test which passes successfully.
Temporarily disable the provider-side cache test as the caching logic is currently tied to finalized block data, which is not handled after the removal of Data Reliability. The test will be re-enabled once the caching mechanism is refactored to decouple it from Data Reliability.
Deleted the backup file chain_router.go.backup as it is no longer needed. This helps in maintaining a cleaner codebase.
- Added missing import for protocol/provideroptimizer package
- Required for CumulativeProbabilityFunctionForPoissonDist() used in line 1324
- Resolves compilation error after rebasing data_reliability_removal branch onto main
@AnnaR-prog AnnaR-prog force-pushed the data_reliability_removal branch from e38cb4a to ffa852c Compare January 6, 2026 10:47
AnnaR-prog and others added 17 commits January 6, 2026 14:36
- Created GetParametersForCache() function to calculate cache parameters without DR
- Extracts latestBlock from chainTracker.GetAtomicLatestBlockNum()
- Gets requestedBlockHash from chainTracker.GetLatestBlockData() for specific blocks
- Calculates finalized status using spectypes.IsFinalizedBlock()
- Cache now works independently of Data Reliability feature

Technical details:
- latestBlock is used as SeenBlock in cache for inter-consumer consistency
- ChainTracker polls node periodically (adaptive based on block time) to keep latestBlock fresh
- reply.LatestBlock is NOT populated by node (only reply.Data), so must calculate separately
- Supports archive functionality and provider seed block correctly

Fixes broken cache that was always using finalized=false and requestedBlockHash=nil
- Set reply.LatestBlock = latestBlock before signing response
- This replaces the behavior previously done by BuildRelayFinalizedBlockHashes()
- Consumers use reply.LatestBlock for consistency tracking and provider selection
- Without this, reply.LatestBlock would be 0 (node only populates reply.Data)

Critical for:
- Consumer consistency tracking (SetSeenBlock)
- Provider selection based on block height
- Archive extension decisions
- QoS metrics that depend on block information
…moval

When Data Reliability was removed, code was changed to always use
VerificationsOnlyChainFetcher which doesn't poll for new blocks.
This broke the lava_latest_block metric on provider side.

Restored full ChainFetcher (previously used when DR was enabled) which:
- Actively polls node for new blocks
- Triggers recordMetricsOnNewBlock callback
- Updates lava_latest_block metric properly

Also added comprehensive [METRIC-TRACE] logging to track:
- ChainTracker callback triggers
- Block detection and processing
- Metric updates

Fixes provider metrics that were not being populated.
- testutil/common/common.go: return acc explicitly instead of naked return
- testutil/keeper/keepers_init.go: return err explicitly instead of naked return
- protocol/chainlib/provider_node_subscription_manager.go: fix comment indentation
- protocol/integration/protocol_test.go: remove extra blank line
- Add --cache-latest-block flag (default: false) to control caching of LATEST_BLOCK requests
- Convert LATEST_BLOCK (-2) to actual block number before caching to comply with cache service validation
- Finalized blocks are always cached regardless of this flag
- Cache service rejects negative block numbers, so conversion is required for LATEST_BLOCK
- When disabled, only historical/finalized blocks are cached
- Changed log messages in rpcprovider_server.go and rpcprovider.go from LavaFormatInfo to LavaFormatDebug for cache and metric trace events.
- This change enhances log verbosity for debugging purposes without altering the functionality of the cache operations.
- Deleted the old provider_state_machine.go and rpcprovider_server.go.backup files as they are no longer needed.
- This cleanup helps streamline the codebase and reduces clutter from obsolete files.
…derSideCache

- Re-add missing cache hit assertions that were removed in cc3519d
- Test now properly verifies that the provider-side cache works
- Validates that 2 cache hits occur for repeated requests to block 1000
- Confirms finalization cache functionality after DR removal
… provider observations

- Add cleanup logic in Record() to remove observations older than 1 hour
- Prevents unbounded memory growth from inactive/offline providers
- Cleanup happens automatically during normal operation (no background goroutine)
- Map size is now bounded by number of active providers (those reporting within last hour)
- Addresses CR feedback about memory leak in provider observation tracking
- Updated `TestChainTrackerCallbacks` to track actual blocks processed and ensure total blocks processed matches total advancement.
- Introduced new tests for `LatestBlockEstimator` to cover various scenarios including handling of observations, clock skew, and invalid inputs.
- Added comprehensive tests for `GetParametersForCache` in `rpcprovider` to validate behavior for different block requests and error handling.
- Improved mock implementations to facilitate testing of chain tracker and RPC provider functionalities.
…provider_server_test.go

- Add AppendHeader implementation for mockChainMessageForProviderHeader
- Remove unused imports: context, pairingtypes, spectypes
- Fixes linter errors introduced in previous test additions
- Fix trailing whitespace in cache_parameters_test.go
- Fix indentation in rpcprovider_server.go
- Fix trailing newline in rpcprovider_server_test.go
- Addresses CI gofmt/gofumpt linter errors
TestGetParametersForCache_LatestBlock was incorrectly expecting LATEST_BLOCK
to be finalized. By definition, the latest/current block cannot be finalized
as it needs to age by blockDistanceToFinalization blocks first.

The finalization check: requestedBlock <= latestBlock - finalizationCriteria
For LATEST_BLOCK: 1000 <= 1000 - 15 → 1000 <= 985 → FALSE (correct)

Changed expectation from require.True to require.False with correct explanation.
@nimrod-teich nimrod-teich merged commit 18cb6d0 into main Jan 8, 2026
28 checks passed
@nimrod-teich nimrod-teich deleted the data_reliability_removal branch January 8, 2026 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants