fix: increment sequence when accepting new manifests#6059
Conversation
|
|
||
| // Something has changed. Keep track of it. | ||
| seq_++; | ||
|
|
There was a problem hiding this comment.
From: Xahau/xahaud#631
OverlayImpl::getManifestsMessage() caches manifest messages and only rebuilds
them when ManifestCache::sequence() changes. When accepting a new manifest,
the function returned early without incrementing seq_, causing the cache to
never invalidate. This meant peers were exchanging stale lists.
Now seq_++ is called for both new manifests and updates, ensuring the overlay
layer detects changes and sends complete validator lists to connecting peers.
I think this has been masked all these years by validator list (e.g. https://vl.ripple.com/) usage which bootstraps manifests
There was a problem hiding this comment.
The change looks correct to me.
One small request: can we extend Manifest_test to cover the regression where ManifestCache::sequence() wasn’t incremented when accepting a new manifest, and verify it stays unchanged on stale?
With that test in place, I’m happy to approve.
diff --git a/src/test/app/Manifest_test.cpp b/src/test/app/Manifest_test.cpp
index 09054658c9..5276f188c6 100644
--- a/src/test/app/Manifest_test.cpp
+++ b/src/test/app/Manifest_test.cpp
@@ -997,11 +997,16 @@ public:
// applyManifest should accept new manifests with
// higher sequence numbers
+ auto const seq0 = cache.sequence();
BEAST_EXPECT(
cache.applyManifest(clone(s_a0)) ==
ManifestDisposition::accepted);
+ BEAST_EXPECT(cache.sequence() > seq0);
+
+ auto const seq1 = cache.sequence();
BEAST_EXPECT(
cache.applyManifest(clone(s_a0)) == ManifestDisposition::stale);
+ BEAST_EXPECT(cache.sequence() == seq1);
|
Hey, was quite hectic last December, and I missed this Let me sort that out for you |
7280122 to
8eb76e0
Compare
The function was returning early without incrementing seq_++. OverlayImpl uses this sequence to identify/invalidate a cached TMManifests message which is exchanged with peers on connection. Depending on network size, startup sequencing, and topology, this could cause syncing issues.
8eb76e0 to
15d5c91
Compare
|
lgtm! |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #6059 +/- ##
=======================================
Coverage 79.9% 79.9%
=======================================
Files 840 840
Lines 65548 65549 +1
Branches 7270 7262 -8
=======================================
+ Hits 52344 52357 +13
+ Misses 13204 13192 -12
🚀 New features to boost your workflow:
|
…manifests-2025-11-20
* chore: Set ColumnLimit to 120 in clang-format (XRPLF#6288) This change updates the ColumnLimit from 80 to 120, and applies clang-format to reformat the code. * chore: Remove unnecessary `boost::system` requirement from conanfile (XRPLF#6290) * chore: Add cmake-format pre-commit hook (XRPLF#6279) This change adds `cmake-format` as. a pre-commit hook. The style file closely matches that in Clio, and they will be made to be equivalent over time. For now, some files have been excluded, as those need some manual adjustments, which will be done in future changes. * chore: Format all cmake files without comments (XRPLF#6294) * ci: Update hashes of XRPLF/actions (XRPLF#6316) This updates the hashes of all XRPLF/actions to their latest versions. * chore: Add upper-case match for ARM64 in CompilationEnv (XRPLF#6315) * fix: Restore config changes that broke standalone mode (XRPLF#6301) When support was added for `xrpld.cfg` in addition to `rippled.cfg` in XRPLF#6098, as part of an effort to rename occurrences of ripple(d) to xrpl(d), the clearing and creation of the data directory were modified for what, at the time, seemed to result in an equivalent code flow. This has turned out to not be true, which is why this change restores two modifications to `Config.cpp` that currently break running the binary in standalone mode. * docs: Update API changelog, add APIv2+APIv3 version documentation (XRPLF#6308) This change cleans up the `API-CHANGELOG.md` file. It moves the version-specific documentation to other files and fleshes out the changelog with all the API-related changes in each version. * chore: Add .zed editor config directory to .gitignore (XRPLF#6317) This change adds the project configuration directory to `.gitignore` for the `zed` editor. As per the [documentation](https://zed.dev/docs/remote-development?highlight=.zed#zed-settings), the project configuration files are stored in the `.zed` directory at the project root dir. * fix: Deletes expired NFToken offers from ledger (XRPLF#5707) This change introduces the `fixExpiredNFTokenOfferRemoval` amendment that allows expired offers to pass through `preclaim()` and be deleted in `doApply()`, following the same pattern used for expired credentials. * refactor: Add ServiceRegistry to help modularization (XRPLF#6222) Currently we're passing the `Application` object around, whereby the `Application` class acts more like a service registry that gives other classes access to other services. In order to allow modularization, we should replace `Application` with a service registry class so that modules depending on `Application` for other services can be moved easily. This change adds the `ServiceRegistry` class. * chore: Remove unity builds (XRPLF#6300) Unity builds were intended to speed up builds, by bundling multiple files into compilation units. However, now that ccache is available on all platforms, there is no need for unity builds anymore, as ccache stores compiled individual build objects for reuse. This change therefore removes the ability to make unity builds. * refactor: Replace include guards by '#pragma once' (XRPLF#6322) This change replaces all include guards in the `src/` and `include/` directories by `#pragma once`. * chore: Remove unnecessary script (XRPLF#6326) * chore: Update secp256k1 and openssl (XRPLF#6327) * fix typo in LendingHelpers unit-test (XRPLF#6215) * fix: Increment sequence when accepting new manifests (XRPLF#6059) The `ManifestCache::applyManifest` function was returning early without incrementing `seq_`. `OverlayImpl `uses this sequence to identify/invalidate a cached `TMManifests` message, which is exchanged with peers on connection. Depending on network size, startup sequencing, and topology, this can cause syncing issues. This change therefore increments `seq_` when a new manifest is accepted. * refactor: Update secp256k1 to 0.7.1 (XRPLF#6331) The latest secp256k1 release, 0.7.1, contains bug fixes that we may benefit from, see https://github.com/bitcoin-core/secp256k1/blob/master/CHANGELOG.md. * chore: Restore unity builds (XRPLF#6328) In certain cases, such as when modifying headers used by many compilation units, performing a unity build is slower than when performing a regular build with `ccache` enabled. There is also a benefit to a unity build in that it can detect things such as macro redefinitions within the group of files that are compiled together as a unit. This change therefore restores the ability to perform unity builds. However, instead of running every configuration with and without unity enabled, it is now only enabled for a single configuration to maintain lower computational use. As part of restoring the code, it became clear that currently two configurations have coverage enabled, since the check doesn't focus specifically on Debian Bookworm so it also applies to Debian Trixie. This has been fixed too in this change. * perf: Remove unnecessary caches (XRPLF#5439) This change removes the cache in `DatabaseNodeImp` and simplifies the caching logic in `SHAMapStoreImp`. As NuDB and RocksDB internally already use caches, additional caches in the code are not very valuable or may even be unnecessary, as also confirmed during preliminary performance analyses. * chore: Remove CODEOWNERS (XRPLF#6337) * test: Add file and line location to Env (XRPLF#6276) This change uses `std::source_location` to output the file and line location of the call that triggered a failed transaction. * refactor: Fix spelling issues in tests (XRPLF#6199) This change removes the `src/tests` exception from the `cspell` config and fixes all the issues that arise as a result. No functionality/test change. * refactor: Change main thread name to `xrpld-main` (XRPLF#6336) This change builds on the thread-renaming PR (XRPLF#6212), by renaming the main thread name to reduce ambiguity in performance monitoring tools. * fix: Update invariant checks for Permissioned Domains (XRPLF#6134) * refactor: Modularize WalletDB and Manifest (XRPLF#6223) This change modularizes the `WalletDB` and `Manifest`. Note that the wallet db has nothing to do with account wallets and it stores node configuration, which is why it depends on the manifest code. * refactor: Modularize RelationalDB (XRPLF#6224) The rdb module was not properly designed, which is fixed in this change. The module had three classes: 1) The abstract class `RelationalDB`. 2) The abstract class `SQLiteDatabase`, which inherited from `RelationalDB` and added some pure virtual methods. 3) The concrete class `SQLiteDatabaseImp`, which inherited from `SQLiteDatabase` and implemented all methods. The updated code simplifies this as follows: * The `SQLiteDatabaseImp` has become `SQLiteDatabase`, and * The former `SQLiteDatabase `has merged with `RelationalDatabase`. * chore: Fix `gcov` lib coverage build failure on macOS (XRPLF#6350) For coverage builds, we try to link against the `gcov` library (specific to the environment). But as macOS doesn't have this library and thus doesn't have the coverage tools to generate reports, the coverage builds on that platform were failing on linking. We actually don't need to explicitly force this linking, as the `CodeCoverage` file already has correct detection logic (currently on lines 177-193), which is invoked when the `--coverage` flag is provided: * AppleClang: Uses `xcrun -f llvm-cov` to set `GCOV_TOOL="llvm-cov gcov"`. * Clang: Finds `llvm-cov` to set `GCOV_TOOL="llvm-cov gcov"`. * GCC: Finds `gcov` to set `GCOV_TOOL="gcov"`. The `GCOV_TOOL` is then passed to `gcovr` on line 416, so the correct tool is used for processing coverage data. This change therefore removes the `gcov` suffix from lines 473 and 475 in the `CodeCoverage.cmake` file. * refactor: Modularize the NetworkOPs interface (XRPLF#6225) This change moves the NetworkOPs interface into `libxrpl` and it leaves its implementation in `xrpld`. * chore: Fix minor issues in comments (XRPLF#6346) * refactor: Modularize `HashRouter`, `Conditions`, and `OrderBookDB` (XRPLF#6226) This change modularizes additional components by moving code to `libxrpl`. * chore: Update clang-format to 21.1.8 (XRPLF#6352) * refactor: Decouple app/tx from `Application` and `Config` (XRPLF#6227) This change decouples app/tx from `Application` and `Config` to clear the way to moving transactors to `libxrpl`. * refactor: Modularize app/tx (XRPLF#6228) * ci: Add clang tidy workflow to ci (XRPLF#6369) * chore: Set cmake-format width to 100 (XRPLF#6386) * chore: Set clang-format width to 100 in config file (XRPLF#6387) * chore: Apply clang-format width 100 (XRPLF#6387) * Fix tautological assertion (XRPLF#6393) * ci: Add dependabot config (XRPLF#6379) * ci: Build docs in PRs and in private repos (XRPLF#6400) * ci: [DEPENDABOT] bump codecov/codecov-action from 5.4.3 to 5.5.2 (XRPLF#6398) Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 5.4.3 to 5.5.2. - [Release notes](https://github.com/codecov/codecov-action/releases) - [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md) - [Commits](codecov/codecov-action@18283e0...671740a) --- updated-dependencies: - dependency-name: codecov/codecov-action dependency-version: 5.5.2 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Fix merge conflicts, format, spelling, replace all include guards by #pragma once * Fix non-unity build --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Ayaz Salikhov <mathbunnyru@users.noreply.github.com> Co-authored-by: Ed Hennis <ed@ripple.com> Co-authored-by: Bart <bthomee@users.noreply.github.com> Co-authored-by: Mayukha Vadari <mvadari@ripple.com> Co-authored-by: Vito Tumas <5780819+Tapanito@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: Jingchen <a1q123456@users.noreply.github.com> Co-authored-by: Niq Dudfield <ndudfield@gmail.com> Co-authored-by: Valentin Balaschenko <13349202+vlntb@users.noreply.github.com> Co-authored-by: Olek <115580134+oleks-rip@users.noreply.github.com> Co-authored-by: Pratik Mankawde <3397372+pratikmankawde@users.noreply.github.com> Co-authored-by: nuxtreact <nuxtreact@outlook.com> Co-authored-by: Sergey Kuznetsov <skuznetsov@ripple.com> Co-authored-by: Ayaz Salikhov <asalikhov@ripple.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The function was returning early without incrementing seq_++. OverlayImpl uses this sequence to identify/invalidate a cached TMManifests message which is exchanged with peers on connection. Depending on network size, startup sequencing, and topology, this could cause syncing issues.