Skip to content

refactor: use _MiB/_GiB consistently for byte conversions#34435

Open
l0rinc wants to merge 2 commits intobitcoin:masterfrom
l0rinc:util/byte-units-mib-gib
Open

refactor: use _MiB/_GiB consistently for byte conversions#34435
l0rinc wants to merge 2 commits intobitcoin:masterfrom
l0rinc:util/byte-units-mib-gib

Conversation

@l0rinc
Copy link
Contributor

@l0rinc l0rinc commented Jan 28, 2026

Problem

Byte-size conversions in the codebase currently show up in many equivalent formats (multiplication/division chains, shifts, hex/binary literals), which creates a maintenance burden and makes review error-prone - especially considering the architectural differences of size_t.
Inspired by #34305 (comment), it seemed appropriate to unify Mebibyte usage across the codebase and add Gibibyte support with 32/64 bit size_t validation.

Fix

This PR refactors those call sites to use ""_MiB (existing) and ""_GiB (new), and adds the encountered value/pattern replacements to unit tests to make review straightforward, and to ensure the conversions remain valid.
The literals are overflow-checked when converting to size_t, and unit tests cover the 32-bit boundary cases.

Concretely, it replaces patterns such as:

  • 1024*1024, 1<<20, 0x100000, 1048576, / 1024 / 1024, * (1.0 / 1024 / 1024)1_MiB or double(1_MiB)
  • 1024*1024*1024, 1<<30, 0x40000000, 1024_MiB, >> 301_GiB

Note

In the few places where arithmetic involves signed values, the result is identical to the previous code assuming those quantities never become negative.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 28, 2026

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK sedited
Stale ACK hodlinator

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #34641 (node: scale default -dbcache with system RAM by l0rinc)
  • #34639 (iwyu: Document or remove some pragma: export and other improvements by hebasto)
  • #34448 (ci, iwyu: Fix warnings in src/util and treat them as errors by hebasto)
  • #34132 (coins: drop error catcher, centralize fatal read handling by l0rinc)
  • #34038 (logging: replace -loglevel with -trace, various API improvements by ajtowns)
  • #33259 (rpc, logging: add backgroundvalidation to getblockchaininfo by polespinasa)
  • #30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)
  • #29678 (Bugfix: Correct first-run free space checks by luke-jr)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible places where named args for integral literals may be used (e.g. func(x, /*named_arg=*/0) in C++, and func(x, named_arg=0) in Python):

  • BlockFilterIndex(interfaces::MakeChain(m_node), BlockFilterType::BASIC, 1_MiB, true) in src/test/blockfilter_index_tests.cpp
  • InitBlockFilterIndex([&]{ return interfaces::MakeChain(m_node); }, BlockFilterType::BASIC, 1_MiB, true, false) in src/test/blockfilter_index_tests.cpp
  • InitBlockFilterIndex([&]{ return interfaces::MakeChain(m_node); }, BlockFilterType::BASIC, 1_MiB, true, false) in src/test/blockfilter_index_tests.cpp
  • InitBlockFilterIndex([&]{ return interfaces::MakeChain(m_node); }, BlockFilterType::BASIC, 1_MiB, true, false) in src/test/blockfilter_index_tests.cpp
  • CoinStatsIndex{interfaces::MakeChain(m_node), 1_MiB, true} in src/test/coinstatsindex_tests.cpp
  • CoinStatsIndex index{interfaces::MakeChain(m_node), 1_MiB} in src/test/coinstatsindex_tests.cpp
  • CoinStatsIndex index{interfaces::MakeChain(m_node), 1_MiB} in src/test/coinstatsindex_tests.cpp
  • TxIndex txindex(interfaces::MakeChain(m_node), 1_MiB, true) in src/test/txindex_tests.cpp

If you want, I can propose adding explanatory named comments (e.g. /cache_bytes=/) for these third-or-later literal arguments to improve readability.

2026-03-06 21:45:29

@l0rinc l0rinc force-pushed the util/byte-units-mib-gib branch 2 times, most recently from b616af9 to db3ada7 Compare January 28, 2026 21:33
@DrahtBot
Copy link
Contributor

🚧 At least one of the CI tasks failed.
Task ASan + LSan + UBSan + integer: https://github.com/bitcoin/bitcoin/actions/runs/21456037564/job/61797017546
LLM reason (✨ experimental): Compilation failed due to invalid user-defined literal operators in util/byte_units.h (invalid parameter type for operator""_MiB/_GiB leading to no matching literal operator).

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

  • Possibly due to a silent merge conflict (the changes in this pull request being
    incompatible with the current code in the target branch). If so, make sure to rebase on the latest
    commit of the target branch.

  • A sanitizer issue, which can only be found by compiling with the sanitizer and running the
    affected test.

  • An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

@l0rinc l0rinc closed this Jan 28, 2026
@l0rinc l0rinc force-pushed the util/byte-units-mib-gib branch from db3ada7 to 05d23f6 Compare January 28, 2026 21:51
@l0rinc l0rinc reopened this Jan 28, 2026
@l0rinc l0rinc force-pushed the util/byte-units-mib-gib branch from 05d23f6 to 0768e6b Compare January 28, 2026 22:17
Comment on lines -29 to +30
static inline int PruneMiBtoGB(int64_t mib) { return (mib * 1024 * 1024 + GB_BYTES - 1) / GB_BYTES; }
static inline int PruneMiBtoGB(int64_t mib) { return (mib * 1_MiB + GB_BYTES - 1) / GB_BYTES; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if multiplying with size_t instead of unspecified integer expression risks flipping negative function inputs into positive output values, but it's probably only used with positive inputs regardless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return prune_target_mib > 1 ? PruneMiBtoGB(prune_target_mib) : DEFAULT_PRUNE_TARGET_GB;
indicates this cannot be negative

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Experimented a bit and it seems behavior hasn't changed. Passing test:

constexpr uint64_t GB_BYTES{1000000000};
static inline int PruneMiBtoGB_old(int64_t mib) { return (mib * 1024 * 1024 + GB_BYTES - 1) / GB_BYTES; }
static inline int PruneMiBtoGB_new(int64_t mib) { return (mib * 1_MiB + GB_BYTES - 1) / GB_BYTES; }
static inline int64_t PruneGBtoMiB_old(int gb) { return gb * GB_BYTES / 1024 / 1024; }
static inline int64_t PruneGBtoMiB_new(int gb) { return gb * GB_BYTES / 1_MiB; }

BOOST_AUTO_TEST_CASE(prune_tests)
{
    BOOST_CHECK_EQUAL(PruneMiBtoGB_old(1500), 2);
    BOOST_CHECK_EQUAL(PruneMiBtoGB_new(1500), 2);
    BOOST_CHECK_EQUAL(PruneMiBtoGB_old(-1500), 1266874889);
    BOOST_CHECK_EQUAL(PruneMiBtoGB_new(-1500), 1266874889);

    BOOST_CHECK_EQUAL(PruneGBtoMiB_old(1), 953);
    BOOST_CHECK_EQUAL(PruneGBtoMiB_new(1), 953);
    BOOST_CHECK_EQUAL(PruneGBtoMiB_old(-1), 17592186043462);
    BOOST_CHECK_EQUAL(PruneGBtoMiB_new(-1), 17592186043462);
}

@l0rinc l0rinc force-pushed the util/byte-units-mib-gib branch 2 times, most recently from dfe9038 to e5b573f Compare January 29, 2026 19:51
Copy link
Contributor

@hodlinator hodlinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK e5b573f

Makes the code more readible.

Examples:

-constexpr unsigned int FLTR_FILE_CHUNK_SIZE = 0x100000; // 1 MiB
+constexpr unsigned int FLTR_FILE_CHUNK_SIZE = 1_MiB;

-if (m_total_allocated > 0x1000000) return;
+if (m_total_allocated > 16_MiB) return;

-cuckoo_cache.setup_bytes(megabytes << 20);
+cuckoo_cache.setup_bytes(megabytes * 1_MiB);

-LogDebug(BCLog::COINDB, "Writing final batch of %.2f MiB\n", batch.ApproximateSize() * (1.0 / 1048576.0));
+LogDebug(BCLog::COINDB, "Writing final batch of %.2f MiB\n", batch.ApproximateSize() / double(1_MiB));

-constexpr uint64_t min_disk_space = 52428800; // 50 MiB
+constexpr uint64_t min_disk_space = 50_MiB;

Comment on lines -29 to +30
static inline int PruneMiBtoGB(int64_t mib) { return (mib * 1024 * 1024 + GB_BYTES - 1) / GB_BYTES; }
static inline int PruneMiBtoGB(int64_t mib) { return (mib * 1_MiB + GB_BYTES - 1) / GB_BYTES; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Experimented a bit and it seems behavior hasn't changed. Passing test:

constexpr uint64_t GB_BYTES{1000000000};
static inline int PruneMiBtoGB_old(int64_t mib) { return (mib * 1024 * 1024 + GB_BYTES - 1) / GB_BYTES; }
static inline int PruneMiBtoGB_new(int64_t mib) { return (mib * 1_MiB + GB_BYTES - 1) / GB_BYTES; }
static inline int64_t PruneGBtoMiB_old(int gb) { return gb * GB_BYTES / 1024 / 1024; }
static inline int64_t PruneGBtoMiB_new(int gb) { return gb * GB_BYTES / 1_MiB; }

BOOST_AUTO_TEST_CASE(prune_tests)
{
    BOOST_CHECK_EQUAL(PruneMiBtoGB_old(1500), 2);
    BOOST_CHECK_EQUAL(PruneMiBtoGB_new(1500), 2);
    BOOST_CHECK_EQUAL(PruneMiBtoGB_old(-1500), 1266874889);
    BOOST_CHECK_EQUAL(PruneMiBtoGB_new(-1500), 1266874889);

    BOOST_CHECK_EQUAL(PruneGBtoMiB_old(1), 953);
    BOOST_CHECK_EQUAL(PruneGBtoMiB_new(1), 953);
    BOOST_CHECK_EQUAL(PruneGBtoMiB_old(-1), 17592186043462);
    BOOST_CHECK_EQUAL(PruneGBtoMiB_new(-1), 17592186043462);
}

Copy link
Contributor

@hodlinator hodlinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 792d679

Concpt. review: #34435 (review)

Checked trickier hex literals actually match _MiB/_GiB-versions.

Diffed against previously reviewed push and aside from commit re-ordering, only saw minimal deduplication in gib_string_literal_test.

@l0rinc l0rinc force-pushed the util/byte-units-mib-gib branch from 6091d91 to 2c35b07 Compare February 19, 2026 23:13
@l0rinc l0rinc force-pushed the util/byte-units-mib-gib branch 3 times, most recently from 6b5faad to 6069bc9 Compare February 20, 2026 10:23
Copy link
Contributor

@hodlinator hodlinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 6069bc9

Just changes to surrounding diff context and WARN_FLUSH_COINS_SIZE being replaced by constant no longer measured in bytes.

@l0rinc l0rinc closed this Feb 20, 2026
@l0rinc l0rinc reopened this Feb 20, 2026
@l0rinc l0rinc force-pushed the util/byte-units-mib-gib branch 2 times, most recently from 328ddc4 to c8e955c Compare February 21, 2026 08:26
Copy link
Contributor

@hodlinator hodlinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK c8e955c

(Was just rebased to kick CI).

constexpr auto max_gib{std::numeric_limits<size_t>::max() >> 30};
if constexpr (SIZE_MAX == UINT32_MAX) {
BOOST_CHECK_EQUAL(max_gib, 3U);
BOOST_CHECK_EXCEPTION(4_GiB, std::overflow_error, HasReason("GiB value too large for size_t byte conversion"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to add:

        BOOST_CHECK_EQUAL(4*uint64_t{1_GiB}, uint64_t{4} << 30);

An alternative would be to just make GiB return u64, and possibly MiB as well.

There shouldn't be any call-site that really needs size_t. size_t seems like a mental overhead, providing bugs such as #34109 and #33724

Copy link
Contributor Author

@l0rinc l0rinc Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, 32-bit architectures should also support this widening-first approach, so I added it to the test suite.

I dislike size_t too (for the exact reasons you mentioned), and I’d gladly review a PR that addresses that, but it seems more involved than this one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About size_t being annoying, see also #34692 (comment) 😅

l0rinc added 2 commits March 6, 2026 21:42
Introduce `operator""_GiB`, sharing the overflow-checked conversion logic with the existing `operator""_MiB`.

Use `1_GiB` in a few existing places where it is a drop-in replacement (e.g. `1024_MiB`, `1<<30`) and extend unit tests to cover boundary behavior.
Replace hard-coded MiB byte conversions (e.g. `1024*1024`, `1<<20`, `1048576`) with the existing `_MiB` literal to improve readability and avoid repeating constants.

In the few spots where arithmetic involves signed values, the result is identical to the previous code assuming those quantities never turn negative.

Extend unit tests to cover the 32-bit `size_t` overflow boundary and to assert equivalence for integer and floating-point conversions.
@l0rinc l0rinc force-pushed the util/byte-units-mib-gib branch from 02184de to 85b9d0b Compare March 6, 2026 21:44
Copy link
Contributor

@sedited sedited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

I'm not sure about the first commit, seems a bit noisy to do a conversion from 1024 MiB to 1 GiB for a developer who is probably aware of that fact since shortly after using a computer for the first time. For me at least it is also a good reminder that the unit of -dbcache is MiBs.

The second commit seems like a decent readability improvement though.

BOOST_CHECK_EQUAL(50_MiB / 1_MiB, 50U);
BOOST_CHECK_EQUAL(50_MiB, 52428800U);
BOOST_CHECK_EQUAL(128_MiB, 0x8000000U);
BOOST_CHECK_EQUAL(550_MiB, 550ULL * 1024 * 1024);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these really useful? Feels a bit like testing BOOST_CHECK_EQUAL(1+1,2).

Copy link
Contributor Author

@l0rinc l0rinc Mar 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reviewers can decide. I covered every change category that I modified in this commit as proof that it doesn't change behavior. I often err on the side of over-testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants