-
Notifications
You must be signed in to change notification settings - Fork 38.7k
refactor: mempool: use CTxMemPool::Limits #26103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: mempool: use CTxMemPool::Limits #26103
Conversation
glozow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK to using the Limits struct instead of a list of ints. I think this is more readable, especially the static NoLimits when we just want ancestors calculated.
src/kernel/mempool_limits.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why the change from signed to unsigned?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the commit message. Relevant part used to be:
These int64_t members later
get cast into a size_t in CTxMemPool::CTxMemPool() anyway
Is now:
These limits represent counts and sizes that should never be
negative. Currently, the int64_t members later on in the call stack
get cast into a size_t (without bounds checking) in
CTxMemPool::CTxMemPool() anyway, so this type change does not
introduce any new underflow risks - it just makes them
unsigned earlier on in the process to minimize back-and-forth
type conversion.
Does that resolve it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm new to the code base so my question may look silly. Why the type of ancestor and descendant count are 64 bit? Isn't 32 bit more than sufficient? Although, I know that's not in the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the int64_t members later on in the call stack
get cast into a size_t
Yeah so we have virtual size implemented as size_t, unsigned ints, and signed ints in various places in the codebase, and it would be good to switch to 1 type consistently. I don't think there's a PR to do all of them, but #23962 is one.
My current thinking is to use signed ints instead of size_t because we're doing arithmetic with these limits (see CalculateAncestorsAndCheckLimits, #23962 (comment)). So I always use int64_t to hold virtual sizes. Not sure where we're going long term 🤷, but that's why I don't really agree with switching to unsigned.
Why the type of ancestor and descendant count are 64 bit? Isn't 32 bit more than sufficient?
There's never a need for more than 32 bits, but switching partially would cause integer truncation somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your reasoning and the one from the linked comment, but I'm not sure it makes sense here. A lot of the comparisons we do are with other size_t's, e.g. from std::set::size(). So, we'd either have to unsafely cast them into an int64_t (probably fine, but... not ideal?) or implement a function that does it with bounds checking etc.
I'm not sure that's preferable? See e.g. the below diff, with some unsafe static_cast as well as implicit conversion (e.g. CheckPackageLimits() passing package.size() to CalculateAncestorsAndCheckLimits())
git diff
diff --git a/src/kernel/mempool_limits.h b/src/kernel/mempool_limits.h
index fd0979c6c..584af5679 100644
--- a/src/kernel/mempool_limits.h
+++ b/src/kernel/mempool_limits.h
@@ -17,20 +17,20 @@ namespace kernel {
*/
struct MemPoolLimits {
//! The maximum allowed number of transactions in a package including the entry and its ancestors.
- uint64_t ancestor_count{DEFAULT_ANCESTOR_LIMIT};
+ int64_t ancestor_count{DEFAULT_ANCESTOR_LIMIT};
//! The maximum allowed size in virtual bytes of an entry and its ancestors within a package.
- uint64_t ancestor_size_vbytes{DEFAULT_ANCESTOR_SIZE_LIMIT_KVB * 1'000};
+ int64_t ancestor_size_vbytes{DEFAULT_ANCESTOR_SIZE_LIMIT_KVB * 1'000};
//! The maximum allowed number of transactions in a package including the entry and its descendants.
- uint64_t descendant_count{DEFAULT_DESCENDANT_LIMIT};
+ int64_t descendant_count{DEFAULT_DESCENDANT_LIMIT};
//! The maximum allowed size in virtual bytes of an entry and its descendants within a package.
- uint64_t descendant_size_vbytes{DEFAULT_DESCENDANT_SIZE_LIMIT_KVB * 1'000};
+ int64_t descendant_size_vbytes{DEFAULT_DESCENDANT_SIZE_LIMIT_KVB * 1'000};
/**
* @return MemPoolLimits with all the limits set to the maximum
*/
static MemPoolLimits NoLimits()
{
- uint64_t no_limit{std::numeric_limits<uint64_t>::max()};
+ int64_t no_limit{std::numeric_limits<int64_t>::max()};
return {no_limit, no_limit, no_limit, no_limit};
}
diff --git a/src/txmempool.cpp b/src/txmempool.cpp
index 3097473ac..ee5ffff7b 100644
--- a/src/txmempool.cpp
+++ b/src/txmempool.cpp
@@ -183,14 +183,14 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256>& vHashes
}
}
-bool CTxMemPool::CalculateAncestorsAndCheckLimits(size_t entry_size,
- size_t entry_count,
+bool CTxMemPool::CalculateAncestorsAndCheckLimits(int64_t entry_size,
+ int64_t entry_count,
setEntries& setAncestors,
CTxMemPoolEntry::Parents& staged_ancestors,
const Limits& limits,
std::string &errString) const
{
- size_t totalSizeWithAncestors = entry_size;
+ int64_t totalSizeWithAncestors = entry_size;
while (!staged_ancestors.empty()) {
const CTxMemPoolEntry& stage = staged_ancestors.begin()->get();
@@ -241,7 +241,7 @@ bool CTxMemPool::CheckPackageLimits(const Package& package,
std::optional<txiter> piter = GetIter(input.prevout.hash);
if (piter) {
staged_ancestors.insert(**piter);
- if (staged_ancestors.size() + package.size() > limits.ancestor_count) {
+ if (static_cast<int64_t>(staged_ancestors.size() + package.size()) > limits.ancestor_count) {
errString = strprintf("too many unconfirmed parents [limit: %u]", limits.ancestor_count);
return false;
}
@@ -277,7 +277,7 @@ bool CTxMemPool::CalculateMemPoolAncestors(const CTxMemPoolEntry &entry,
std::optional<txiter> piter = GetIter(tx.vin[i].prevout.hash);
if (piter) {
staged_ancestors.insert(**piter);
- if (staged_ancestors.size() + 1 > limits.ancestor_count) {
+ if (static_cast<int64_t>(staged_ancestors.size() + 1) > limits.ancestor_count) {
errString = strprintf("too many unconfirmed parents [limit: %u]", limits.ancestor_count);
return false;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I always use int64_t to hold virtual sizes. Not sure where we're going long term 🤷, but that's why I don't really agree with switching to unsigned.
I went through all the places where any of the 4 members of MemPoolLimits are accessed. In all of them, except for one (see below), they are immediately cast to a uint. This happens explicitly e.g. here, or implicitly by passing it to a fn/ctor/variable that expects a uint or size_t (e.g. here or here or here).
The only place where we actually use the signedness of any of the members is here:
Line 1418 in 9fcdb9f
| int64_t descendant_limit_bytes = mempool_opts.limits.descendant_size_vbytes * 40; |
I don't think the implicit conversion introduced by this PR is problematic.
For that reason, I would argue that updating the MemPoolLimits members to be uint64_t instead of int64_t (which is necessary to avoid compiler warnings that we're comparing uint with int in all of the places outlined the diff here) is - except for the one case - not a regression, and is the most straightforward implementation for the goal of this PR. I currently have no objection to what is being proposed in #23962, but I feel like changing these members and all subsequent call sites to signed integers is orthogonal and would unnecessarily introduce complexity and controversy for this PR. As such, my preference would be to not do it here.
What do you think? If you agree with making MemPoolLimits members uint64_t, I will do a force push that removes the now unneccessary uint casts such as here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, thank you very much for going through the usage! I agree it seems appropriate to defer to a later PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
3611c2b, approach NACK.
I do not agree that switching to unsigned type is an improvement. Especially for variables involved in arithmetic and comparison operations.
For example, see https://www.aristeia.com/Papers/C++ReportColumns/sep95.pdf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your feedback hebasto. Even though the MemPoolLimits members are almost always cast to an unsigned int so I think it'd be clearer to have everything be in the type in which it's actually used, I understand your concern that you don't want to regress the interface to avoid future PRs building on an unsigned MemPoolLimits interface.
As such, I think the most straightforward way to move this PR forward and preserve as much review time as possible is to remove the commit that made the MemPoolLimits members unsigned and introduce static_cast<uint64_t> wherever necessary. This way the interface remains signed and behaviour remains unchanged (including arithmetic with unsigned integers) to how it was before this PR.
aureleoules
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
|
Concept ACK |
c5e5952 to
9bf8bce
Compare
stickies-v
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Force pushed to address review feedback and remove a leftover whitespace. Thank you for the quick feedback everyone!
Main changes:
- Introduce
CPFPCarveOutLimits()helper function inmempool_limits.h. Currently not used by anyone else, but a bit tidier. - pass
Limitsby reference instead of value
src/kernel/mempool_limits.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the commit message. Relevant part used to be:
These int64_t members later
get cast into a size_t in CTxMemPool::CTxMemPool() anyway
Is now:
These limits represent counts and sizes that should never be
negative. Currently, the int64_t members later on in the call stack
get cast into a size_t (without bounds checking) in
CTxMemPool::CTxMemPool() anyway, so this type change does not
introduce any new underflow risks - it just makes them
unsigned earlier on in the process to minimize back-and-forth
type conversion.
Does that resolve it?
|
you commited |
9bf8bce to
c17d468
Compare
whoops sorry, fixed and added that to global gitignore |
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
As far as I could understand the PR Concept ACK. I think this PR is relatively easy to understand and putting the |
src/kernel/mempool_limits.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm new to the code base so my question may look silly. Why the type of ancestor and descendant count are 64 bit? Isn't 32 bit more than sufficient? Although, I know that's not in the scope of this PR.
c17d468 to
55b8e6f
Compare
|
Force pushed to fix the failing |
glozow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code review ACK, just 1-2 nits
src/kernel/mempool_limits.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, thank you very much for going through the usage! I agree it seems appropriate to defer to a later PR.
55b8e6f to
ae3a5f3
Compare
stickies-v
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Force pushed to incorporate glozow's feedback - thank you for the review!
- using copy constructor instead of designated initializer
- removed the carved-out
CPFPCarveOutLimits()function (~revert to previous version)
glozow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK ae3a5f3
aureleoules
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK ae3a5f3
|
Concept ACK. |
ae3a5f3 to
11b6df3
Compare
|
Force pushed to address hebasto's concerns regarding making the Note: this introduces potential for follow-up improvements in |
11b6df3 to
d0bfe68
Compare
|
Force pushed to rebase to fix failing Win64 native CI (unrelated to PR). |
aureleoules
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK d0bfe6869855dcf112a7da640e2e8ad648f82bbd.
Since my last review:
- rolled back from
uint64_ttoint64_tfordescendant_count,descendant_size_vbytesandno_limit. - added static casts to compare these values against
uint64_ttypes.
I also verified the values being checked against are of type size_t or uint64_t.
|
Approach ACK d0bfe6869855dcf112a7da640e2e8ad648f82bbd. |
There are quite a few places in the codebase that require us to construct a CTxMemPool without limits on ancestors and descendants. This helper function allows us to get rid of all that duplication.
Simplifies function signatures by removing repetition of all the ancestor/descendant limits, and increases readability by being more verbose by naming the limits, while still reducing the LoC.
The (100, 1000000, 1000, 1000000) limits are arbitrarily high and don't restrict anything, they are just meant to calculate ancestors properly. Using NoLimits() makes this intent more clear and simplifies the code.
d0bfe68 to
33b12e5
Compare
|
Force pushed to address hebasto's review feedback - thank you!
|
hebasto
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 33b12e5, I have reviewed the code and it looks OK, I agree it can be merged.
| return false; | ||
| } else if (totalSizeWithAncestors > limitAncestorSize) { | ||
| errString = strprintf("exceeds ancestor size limit [limit: %u]", limitAncestorSize); | ||
| } else if (totalSizeWithAncestors > static_cast<uint64_t>(limits.ancestor_size_vbytes)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| } else if (totalSizeWithAncestors > static_cast<uint64_t>(limits.ancestor_size_vbytes)) { | |
| } else if (totalSizeWithAncestors > static_cast<size_t>(limits.ancestor_size_vbytes)) { |
I beg a pardon for being pedantic :) Feel free to ignore this nit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a refactor PR, I'll leave this as is. limitAncestorSize was a uint64_t before, so better to change this in a future PR imo. I'll leave this comment open for visibility.
glozow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reACK 33b12e5
Mempool currently considers 4 limits regarding ancestor and descendant count and size, which get passed around between functions quite a bit. This PR uses
CTxMemPool::Limitsintroduced in #25290 to simplify those signatures and callsites.The purpose of this PR is to improve readability and maintenance, without behaviour change.
As noted in the first commit "refactor: mempool: change MemPoolLimits members to uint", we currently have an underflow issue where a user could pass a negative
-limitancestorsize, which is eventually cast to an unsigned integer. This behaviour already exists. Because it's orthogonal and to minimize scope, I think this should be fixed in a separate PR.