refactor: mempool: use CTxMemPool::Limits #26103

stickies-v · 2022-09-15T19:32:12Z

Mempool currently considers 4 limits regarding ancestor and descendant count and size, which get passed around between functions quite a bit. This PR uses CTxMemPool::Limits introduced in #25290 to simplify those signatures and callsites.

The purpose of this PR is to improve readability and maintenance, without behaviour change.

As noted in the first commit "refactor: mempool: change MemPoolLimits members to uint", we currently have an underflow issue where a user could pass a negative -limitancestorsize, which is eventually cast to an unsigned integer. This behaviour already exists. Because it's orthogonal and to minimize scope, I think this should be fixed in a separate PR.
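To illustrate the underflow described above, here is a minimal sketch of what happens when a negative signed value is converted to an unsigned type. `ToUnsignedLimit` is a hypothetical stand-in, not a function from the codebase; it only mirrors the kind of conversion the option value goes through.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the conversion a signed option value goes
// through when it ends up in an unsigned limit.
constexpr uint64_t ToUnsignedLimit(int64_t configured)
{
    // The conversion is well defined: the result is the value modulo 2^64.
    return static_cast<uint64_t>(configured);
}

// A sane value survives the conversion unchanged.
static_assert(ToUnsignedLimit(101000) == 101000u, "positive values are preserved");
// A negative value wraps around to a huge limit instead of being rejected.
static_assert(ToUnsignedLimit(-1) == std::numeric_limits<uint64_t>::max(), "-1 wraps to the maximum");
```

The wrapped value effectively disables the limit, which is why bounds checking at the point where the option is parsed would be the natural fix.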

glozow

Concept ACK to using the Limits struct instead of a list of ints. I think this is more readable, especially the static NoLimits when we just want ancestors calculated.
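As a rough sketch of the readability argument, assuming a simplified `Limits` struct: the default values and the `FitsAncestorLimits` helper are illustrative placeholders, not code from this PR.

```cpp
#include <cstdint>
#include <limits>

// Simplified stand-in for the limits struct. The defaults are placeholders.
struct Limits {
    int64_t ancestor_count{25};
    int64_t ancestor_size_vbytes{101000};
    int64_t descendant_count{25};
    int64_t descendant_size_vbytes{101000};

    // All limits set to the maximum, for callers that only want ancestors
    // calculated. constexpr here so the checks below run at compile time.
    static constexpr Limits NoLimits()
    {
        int64_t no_limit{std::numeric_limits<int64_t>::max()};
        return {no_limit, no_limit, no_limit, no_limit};
    }
};

// Hypothetical check that takes the struct instead of a list of ints, so the
// call site cannot swap a count with a size by accident.
constexpr bool FitsAncestorLimits(int64_t count, int64_t size_vbytes, const Limits& limits)
{
    return count <= limits.ancestor_count && size_vbytes <= limits.ancestor_size_vbytes;
}

static_assert(!FitsAncestorLimits(26, 1000, Limits{}), "26 ancestors exceed the placeholder default");
static_assert(FitsAncestorLimits(26, 1000, Limits::NoLimits()), "NoLimits() never restricts");
```

Compared with passing four positional integers, the named members document themselves at the call site, and `NoLimits()` states the intent directly.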

src/validation.cpp

src/test/mempool_tests.cpp

src/kernel/mempool_limits.h

glozow · 2022-09-16T10:26:12Z

src/kernel/mempool_limits.h

Can you explain why the change from signed to unsigned?

I've updated the commit message. Relevant part used to be:

These int64_t members later get cast into a size_t in CTxMemPool::CTxMemPool() anyway

Is now:

These limits represent counts and sizes that should never be negative. Currently, the int64_t members later on in the call stack get cast into a size_t (without bounds checking) in CTxMemPool::CTxMemPool() anyway, so this type change does not introduce any new underflow risks - it just makes them unsigned earlier on in the process to minimize back-and-forth type conversion.

Does that resolve it?

I'm new to the code base so my question may look silly. Why are the ancestor and descendant counts 64-bit types? Isn't 32 bits more than sufficient? I know that's not in the scope of this PR, though.

Currently, the int64_t members later on in the call stack get cast into a size_t

Yeah so we have virtual size implemented as size_t, unsigned ints, and signed ints in various places in the codebase, and it would be good to switch to 1 type consistently. I don't think there's a PR to do all of them, but #23962 is one.

My current thinking is to use signed ints instead of size_t because we're doing arithmetic with these limits (see CalculateAncestorsAndCheckLimits, #23962 (comment)). So I always use int64_t to hold virtual sizes. Not sure where we're going long term 🤷, but that's why I don't really agree with switching to unsigned.
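A small sketch of the arithmetic concern, using two hypothetical helpers that compute the room left under a limit. Neither function exists in the codebase; they only contrast unsigned and signed subtraction.

```cpp
#include <cstdint>
#include <limits>

// Remaining room under a limit, computed with unsigned operands.
constexpr uint64_t RemainingUnsigned(uint64_t limit, uint64_t used)
{
    return limit - used; // wraps modulo 2^64 when used > limit
}

// The same computation with signed operands keeps the sign.
constexpr int64_t RemainingSigned(int64_t limit, int64_t used)
{
    return limit - used;
}

// Going one over the limit is visible as a negative number when signed...
static_assert(RemainingSigned(25, 26) == -1, "signed result is negative");
// ...but looks like an enormous amount of remaining room when unsigned.
static_assert(RemainingUnsigned(25, 26) == std::numeric_limits<uint64_t>::max(), "unsigned result wraps");
```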

Why are the ancestor and descendant counts 64-bit types? Isn't 32 bits more than sufficient?

There's never a need for more than 32 bits, but switching partially would cause integer truncation somewhere.

I understand your reasoning and the one from the linked comment, but I'm not sure it makes sense here. A lot of the comparisons we do are with other size_t's, e.g. from std::set::size(). So, we'd either have to unsafely cast them into an int64_t (probably fine, but... not ideal?) or implement a function that does it with bounds checking etc.

I'm not sure that's preferable? See e.g. the below diff, with some unsafe static_cast as well as implicit conversion (e.g. CheckPackageLimits() passing package.size() to CalculateAncestorsAndCheckLimits())

git diff

diff --git a/src/kernel/mempool_limits.h b/src/kernel/mempool_limits.h
index fd0979c6c..584af5679 100644
--- a/src/kernel/mempool_limits.h
+++ b/src/kernel/mempool_limits.h
@@ -17,20 +17,20 @@ namespace kernel {
  */
 struct MemPoolLimits {
     //! The maximum allowed number of transactions in a package including the entry and its ancestors.
-    uint64_t ancestor_count{DEFAULT_ANCESTOR_LIMIT};
+    int64_t ancestor_count{DEFAULT_ANCESTOR_LIMIT};
     //! The maximum allowed size in virtual bytes of an entry and its ancestors within a package.
-    uint64_t ancestor_size_vbytes{DEFAULT_ANCESTOR_SIZE_LIMIT_KVB * 1'000};
+    int64_t ancestor_size_vbytes{DEFAULT_ANCESTOR_SIZE_LIMIT_KVB * 1'000};
     //! The maximum allowed number of transactions in a package including the entry and its descendants.
-    uint64_t descendant_count{DEFAULT_DESCENDANT_LIMIT};
+    int64_t descendant_count{DEFAULT_DESCENDANT_LIMIT};
     //! The maximum allowed size in virtual bytes of an entry and its descendants within a package.
-    uint64_t descendant_size_vbytes{DEFAULT_DESCENDANT_SIZE_LIMIT_KVB * 1'000};
+    int64_t descendant_size_vbytes{DEFAULT_DESCENDANT_SIZE_LIMIT_KVB * 1'000};

     /**
      * @return MemPoolLimits with all the limits set to the maximum
      */
     static MemPoolLimits NoLimits()
     {
-        uint64_t no_limit{std::numeric_limits<uint64_t>::max()};
+        int64_t no_limit{std::numeric_limits<int64_t>::max()};
         return {no_limit, no_limit, no_limit, no_limit};
     }
diff --git a/src/txmempool.cpp b/src/txmempool.cpp
index 3097473ac..ee5ffff7b 100644
--- a/src/txmempool.cpp
+++ b/src/txmempool.cpp
@@ -183,14 +183,14 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256>& vHashes
     }
 }

-bool CTxMemPool::CalculateAncestorsAndCheckLimits(size_t entry_size,
-                                                  size_t entry_count,
+bool CTxMemPool::CalculateAncestorsAndCheckLimits(int64_t entry_size,
+                                                  int64_t entry_count,
                                                   setEntries& setAncestors,
                                                   CTxMemPoolEntry::Parents& staged_ancestors,
                                                   const Limits& limits,
                                                   std::string &errString) const
 {
-    size_t totalSizeWithAncestors = entry_size;
+    int64_t totalSizeWithAncestors = entry_size;

     while (!staged_ancestors.empty()) {
         const CTxMemPoolEntry& stage = staged_ancestors.begin()->get();
@@ -241,7 +241,7 @@ bool CTxMemPool::CheckPackageLimits(const Package& package,
                 std::optional<txiter> piter = GetIter(input.prevout.hash);
                 if (piter) {
                     staged_ancestors.insert(**piter);
-                    if (staged_ancestors.size() + package.size() > limits.ancestor_count) {
+                    if (static_cast<int64_t>(staged_ancestors.size() + package.size()) > limits.ancestor_count) {
                         errString = strprintf("too many unconfirmed parents [limit: %u]", limits.ancestor_count);
                         return false;
                     }
@@ -277,7 +277,7 @@ bool CTxMemPool::CalculateMemPoolAncestors(const CTxMemPoolEntry &entry,
             std::optional<txiter> piter = GetIter(tx.vin[i].prevout.hash);
             if (piter) {
                 staged_ancestors.insert(**piter);
-                if (staged_ancestors.size() + 1 > limits.ancestor_count) {
+                if (static_cast<int64_t>(staged_ancestors.size() + 1) > limits.ancestor_count) {
                     errString = strprintf("too many unconfirmed parents [limit: %u]", limits.ancestor_count);
                     return false;
                 }
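For comparison with the unchecked static_cast in the diff, a bounds-checked conversion could look roughly like the sketch below. `ToSigned` and `ExceedsCountLimit` are hypothetical names introduced for illustration; nothing like them is proposed in this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper: convert an unsigned value (e.g. a container size) to
// int64_t, throwing instead of silently wrapping if it does not fit.
constexpr int64_t ToSigned(uint64_t value)
{
    return value > static_cast<uint64_t>(std::numeric_limits<int64_t>::max())
        ? throw std::out_of_range("value does not fit in int64_t")
        : static_cast<int64_t>(value);
}

// Signed comparison of a count against a signed limit.
constexpr bool ExceedsCountLimit(size_t count, int64_t limit)
{
    return ToSigned(count) > limit;
}

static_assert(ExceedsCountLimit(26, 25), "26 entries exceed a limit of 25");
static_assert(!ExceedsCountLimit(25, 25), "25 entries are within a limit of 25");
// With a signed comparison, a negative limit rejects everything rather than wrapping.
static_assert(ExceedsCountLimit(1, -1), "any count exceeds a negative limit");
```

The cost is an extra helper call at every comparison site, which is the trade-off being weighed against plain casts in this thread.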

@glozow:

So I always use int64_t to hold virtual sizes. Not sure where we're going long term 🤷, but that's why I don't really agree with switching to unsigned.

I went through all the places where any of the 4 members of MemPoolLimits are accessed. In all of them, except for one (see below), they are immediately cast to a uint. This happens explicitly e.g. here, or implicitly by passing it to a fn/ctor/variable that expects a uint or size_t (e.g. here or here or here).

The only place where we actually use the signedness of any of the members is here:

bitcoin/src/init.cpp

Line 1418 in 9fcdb9f

int64_t descendant_limit_bytes = mempool_opts.limits.descendant_size_vbytes * 40;

I don't think the implicit conversion introduced by this PR is problematic.

For that reason, I would argue that updating the MemPoolLimits members to be uint64_t instead of int64_t (which is necessary to avoid compiler warnings that we're comparing uint with int in all of the places outlined in the diff here) is - except for the one case - not a regression, and is the most straightforward implementation for the goal of this PR. I currently have no objection to what is being proposed in #23962, but I feel like changing these members and all subsequent call sites to signed integers is orthogonal and would unnecessarily introduce complexity and controversy for this PR. As such, my preference would be to not do it here.

What do you think? If you agree with making MemPoolLimits members uint64_t, I will do a force push that removes the now unnecessary uint casts such as here.

Ah I see, thank you very much for going through the usage! I agree it seems appropriate to defer to a later PR.

3611c2b, approach NACK.

I do not agree that switching to unsigned type is an improvement. Especially for variables involved in arithmetic and comparison operations.

For example, see https://www.aristeia.com/Papers/C++ReportColumns/sep95.pdf

Thanks for your feedback hebasto. The MemPoolLimits members are almost always cast to an unsigned int, so I think it'd be clearer to have everything in the type in which it's actually used. That said, I understand your concern: you don't want to regress the interface, so that future PRs don't end up building on an unsigned MemPoolLimits interface.

As such, I think the most straightforward way to move this PR forward and preserve as much review time as possible is to remove the commit that made the MemPoolLimits members unsigned and introduce static_cast<uint64_t> wherever necessary. This way the interface remains signed and behaviour remains unchanged (including arithmetic with unsigned integers) to how it was before this PR.
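A minimal sketch of that pattern, assuming a signed limit compared against an unsigned count. `ExceedsAncestorCount` is a hypothetical helper that mirrors the shape of the ancestor-count check, not the actual function.

```cpp
#include <cstddef>
#include <cstdint>

// The limit stays signed in the interface. The cast makes explicit the
// conversion that an implicit signed/unsigned comparison would perform anyway.
constexpr bool ExceedsAncestorCount(size_t staged_ancestors, int64_t ancestor_count_limit)
{
    return staged_ancestors + 1 > static_cast<uint64_t>(ancestor_count_limit);
}

static_assert(ExceedsAncestorCount(25, 25), "25 staged ancestors plus the entry exceed 25");
static_assert(!ExceedsAncestorCount(24, 25), "24 staged ancestors plus the entry fit in 25");
// Unchanged behaviour: a negative limit wraps to a huge value and never trips.
static_assert(!ExceedsAncestorCount(1000, -1), "a negative limit is effectively no limit");
```

The last assertion shows why this keeps behaviour identical to before: the wrap on a negative limit is still there, only now it is visible at the comparison.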

aureleoules

Concept ACK

src/txmempool.h

src/txmempool.cpp

fanquake · 2022-09-16T13:13:36Z

Concept ACK

stickies-v

Force pushed to address review feedback and remove a leftover whitespace. Thank you for the quick feedback everyone!

Main changes:

Introduce CPFPCarveOutLimits() helper function in mempool_limits.h. Currently not used by anyone else, but a bit tidier.
pass Limits by reference instead of value

src/test/mempool_tests.cpp

src/txmempool.cpp

src/validation.cpp

stickies-v · 2022-09-16T14:33:40Z

src/kernel/mempool_limits.h

I've updated the commit message. Relevant part used to be:

These int64_t members later get cast into a size_t in CTxMemPool::CTxMemPool() anyway

Is now:

These limits represent counts and sizes that should never be negative. Currently, the int64_t members later on in the call stack get cast into a size_t (without bounds checking) in CTxMemPool::CTxMemPool() anyway, so this type change does not introduce any new underflow risks - it just makes them unsigned earlier on in the process to minimize back-and-forth type conversion.

Does that resolve it?

aureleoules · 2022-09-16T15:36:38Z

You committed compile_commands.json

stickies-v · 2022-09-16T15:45:02Z

You committed compile_commands.json

whoops sorry, fixed and added that to global gitignore

DrahtBot · 2022-09-17T04:08:19Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#25038 (policy: nVersion=3 and Package RBF by glozow)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Riahiamirreza · 2022-09-17T17:17:20Z

As far as I could understand the PR, Concept ACK. I think this PR is relatively easy to understand, and putting the good-first-review label on it can make sense IMO.

Riahiamirreza · 2022-09-17T17:10:57Z

src/kernel/mempool_limits.h

I'm new to the code base so my question may look silly. Why are the ancestor and descendant counts 64-bit types? Isn't 32 bits more than sufficient? I know that's not in the scope of this PR, though.

stickies-v · 2022-09-20T12:35:48Z

Force pushed to fix the failing ./test/functional/mempool_package_onemore.py test. In the last refactoring (c5e5952 -> c17d468) I should have passed m_limits when calling CPFPCarveOutLimits() because it was modified earlier in PreChecks() already. Also removed the argument-less overload of CPFPCarveOutLimits() since it's now not used anywhere anymore.
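The bug can be sketched as follows. This assumes the carve-out relaxes the descendant limits by one transaction and a fixed size allowance; the `Limits` struct, `CarveOutLimits`, and the constant name and value are all illustrative stand-ins rather than the code from this PR.

```cpp
#include <cstdint>

// Simplified stand-in for the limits struct. The defaults are placeholders.
struct Limits {
    int64_t ancestor_count{25};
    int64_t ancestor_size_vbytes{101000};
    int64_t descendant_count{25};
    int64_t descendant_size_vbytes{101000};
};

// Illustrative allowance; the name and value are assumptions for this sketch.
constexpr int64_t EXTRA_DESCENDANT_ALLOWANCE_VBYTES{10000};

// Hypothetical carve-out helper: relax the descendant limits of `base`.
constexpr Limits CarveOutLimits(const Limits& base)
{
    Limits out = base;
    out.descendant_count += 1;
    out.descendant_size_vbytes += EXTRA_DESCENDANT_ALLOWANCE_VBYTES;
    return out;
}

// Limits that were already modified earlier in validation.
constexpr Limits modified{25, 101000, 10, 50000};

// Passing the current limits keeps the earlier modification...
static_assert(CarveOutLimits(modified).descendant_count == 11, "built on the modified limits");
// ...whereas starting from defaults silently discards it.
static_assert(CarveOutLimits(Limits{}).descendant_count == 26, "built on the defaults");
```

An argument-less overload behaves like the second case, which is why removing it once it was unused also removes the chance of repeating the mistake.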

glozow

code review ACK, just 1-2 nits

src/kernel/mempool_limits.h

glozow · 2022-09-28T14:45:22Z

src/kernel/mempool_limits.h

Ah I see, thank you very much for going through the usage! I agree it seems appropriate to defer to a later PR.

src/validation.cpp

stickies-v

Force pushed to incorporate glozow's feedback - thank you for the review!

using copy constructor instead of designated initializer
removed the carved-out CPFPCarveOutLimits() function (~revert to previous version)

src/kernel/mempool_limits.h

src/validation.cpp

glozow

ACK ae3a5f3

aureleoules

ACK ae3a5f3

hebasto · 2022-10-04T09:16:39Z

Concept ACK.

stickies-v · 2022-10-04T14:10:02Z

Force pushed to address hebasto's concerns regarding making the MemPoolLimits interface unsigned, which goes counter to the direction we want to take.

Note: this introduces potential for follow-up improvements in CTxMemPool::CalculateAncestorsAndCheckLimits() where some of the static_casts can be removed by updating the function signature to use int64_t instead of size_t when CTxMemPoolEntry::GetCountWithDescendants() and CTxMemPoolEntry::GetSizeWithDescendants() are updated to return an int64_t. I'd prefer to minimize the

stickies-v · 2022-10-04T15:26:19Z

Force pushed to rebase to fix failing Win64 native CI (unrelated to PR).

aureleoules

re-ACK d0bfe6869855dcf112a7da640e2e8ad648f82bbd.
Since my last review:

rolled back from uint64_t to int64_t for descendant_count, descendant_size_vbytes and no_limit.
added static casts to compare these values against uint64_t types.

I also verified the values being checked against are of type size_t or uint64_t.

hebasto · 2022-10-05T08:43:56Z

Approach ACK d0bfe6869855dcf112a7da640e2e8ad648f82bbd.

src/txmempool.h

src/kernel/mempool_limits.h

src/test/mempool_tests.cpp

stickies-v · 2022-10-05T12:11:33Z

Force pushed to address hebasto's review feedback - thank you!

Split out changes to test/mempool_tests.cpp into separate commit
Added commit with improvements to Doxygen comments
Made MemPoolLimits::NoLimits() constexpr

hebasto

ACK 33b12e5, I have reviewed the code and it looks OK, I agree it can be merged.

hebasto · 2022-10-05T16:55:24Z

src/txmempool.cpp

            return false;
-        } else if (totalSizeWithAncestors > limitAncestorSize) {
-            errString = strprintf("exceeds ancestor size limit [limit: %u]", limitAncestorSize);
+        } else if (totalSizeWithAncestors > static_cast<uint64_t>(limits.ancestor_size_vbytes)) {


nit:

Suggested change

-        } else if (totalSizeWithAncestors > static_cast<uint64_t>(limits.ancestor_size_vbytes)) {
+        } else if (totalSizeWithAncestors > static_cast<size_t>(limits.ancestor_size_vbytes)) {

I beg your pardon for being pedantic :) Feel free to ignore this nit.

Since this is a refactor PR, I'll leave this as is. limitAncestorSize was a uint64_t before, so better to change this in a future PR imo. I'll leave this comment open for visibility.

glozow

reACK 33b12e5

fanquake added Refactoring Mempool labels Sep 15, 2022

glozow reviewed Sep 16, 2022

aureleoules reviewed Sep 16, 2022

stickies-v force-pushed the mempool-simplify-fn-signatures branch from c5e5952 to 9bf8bce Compare September 16, 2022 15:33

stickies-v commented Sep 16, 2022

stickies-v force-pushed the mempool-simplify-fn-signatures branch from 9bf8bce to c17d468 Compare September 16, 2022 15:38

Riahiamirreza approved these changes Sep 17, 2022

stickies-v force-pushed the mempool-simplify-fn-signatures branch from c17d468 to 55b8e6f Compare September 20, 2022 12:35

DrahtBot mentioned this pull request Sep 22, 2022

policy: nVersion=3 and Package RBF #25038

Closed

glozow reviewed Sep 28, 2022

stickies-v force-pushed the mempool-simplify-fn-signatures branch from 55b8e6f to ae3a5f3 Compare September 28, 2022 15:39

stickies-v commented Sep 28, 2022

glozow reviewed Sep 29, 2022

glozow requested a review from aureleoules October 4, 2022 08:33

aureleoules approved these changes Oct 4, 2022

stickies-v force-pushed the mempool-simplify-fn-signatures branch from ae3a5f3 to 11b6df3 Compare October 4, 2022 13:57

stickies-v force-pushed the mempool-simplify-fn-signatures branch from 11b6df3 to d0bfe68 Compare October 4, 2022 15:25

aureleoules approved these changes Oct 5, 2022

hebasto reviewed Oct 5, 2022

stickies-v added 4 commits October 5, 2022 13:07

refactor: mempool: add MemPoolLimits::NoLimits()

b85af25

There are quite a few places in the codebase that require us to construct a CTxMemPool without limits on ancestors and descendants. This helper function allows us to get rid of all that duplication.

refactor: mempool: use CTxMempool::Limits

3a86f24

Simplifies function signatures by removing repetition of all the ancestor/descendant limits, and increases readability by being more verbose by naming the limits, while still reducing the LoC.

test: use NoLimits() in MempoolIndexingTest

6945853

The (100, 1000000, 1000, 1000000) limits are arbitrarily high and don't restrict anything, they are just meant to calculate ancestors properly. Using NoLimits() makes this intent more clear and simplifies the code.

docs: improve docs where MemPoolLimits is used

33b12e5

stickies-v force-pushed the mempool-simplify-fn-signatures branch from d0bfe68 to 33b12e5 Compare October 5, 2022 12:10

hebasto approved these changes Oct 5, 2022

glozow reviewed Oct 9, 2022

glozow merged commit d33c589 into bitcoin:master Oct 9, 2022

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Oct 9, 2022

Merge bitcoin#26103: refactor: mempool: use CTxMemPool::Limits

830fd53

fanquake mentioned this pull request Jan 17, 2023

[kernel 3a/n] Decouple CTxMemPool from ArgsManager #25290

Merged

2 tasks

bitcoin locked and limited conversation to collaborators Oct 9, 2023
