osd: fix for "data digests are inconsistent" #65623

Merged
SrinivasaBharath merged 2 commits into ceph:main from JonBailey1993:data_digests_are_inconsistent_fix
Feb 5, 2026
@JonBailey1993
Contributor

Fix for "data digests are inconsistent"

It was possible to see "data digests are inconsistent" written to the logs at incorrect times because of multiple bugs. This change reorganises some of the deep scrubbing code and fixes them. The root causes being fixed here are:

  • We were comparing crc buffers beyond the end of the crcs
  • There was a double call to logical_to_ondisk_size when creating the crcs for zero buffers, causing them to be mis-sized
  • The code was not padding smaller shards, although shards must be the same size when used for parity comparison.

All the above are fixed in this commit.
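
The padding and bounds points above can be illustrated with a minimal, self-contained sketch. It does not use Ceph's bufferlist or the ScrubBackend code: `Shard`, `pad_to`, `xor_parity` and `same_bytes` are hypothetical names, and `std::vector` stands in for the real buffers. It only shows why a shorter shard would need zero-padding before shards are XORed together, and how a comparison can avoid reading past the end of either buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Shard = std::vector<uint8_t>;  // hypothetical stand-in for a shard buffer

// Return a copy of `s` zero-padded to `size` bytes (unchanged if already that long).
Shard pad_to(Shard s, std::size_t size) {
  if (s.size() < size) s.resize(size, 0);
  return s;
}

// XOR all shards together after padding each one to the longest shard's size.
Shard xor_parity(const std::vector<Shard>& shards) {
  std::size_t longest = 0;
  for (const auto& s : shards) longest = std::max(longest, s.size());
  Shard parity(longest, 0);
  for (const auto& s : shards) {
    const Shard padded = pad_to(s, longest);
    for (std::size_t i = 0; i < longest; ++i) parity[i] ^= padded[i];
  }
  return parity;
}

// Bounded comparison: the four-iterator std::equal never reads past either end
// and reports buffers of different length as unequal.
bool same_bytes(const Shard& a, const Shard& b) {
  return std::equal(a.begin(), a.end(), b.begin(), b.end());
}
```

Without the padding step, the XOR loop would either index past the end of the shorter shard or silently skip its missing tail, and either way the result could not be trusted for a parity comparison.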

Tracker ticket: https://tracker.ceph.com/issues/72945


@JonBailey1993 JonBailey1993 requested a review from a team as a code owner September 22, 2025 13:01
@github-actions github-actions bot added the core label Sep 22, 2025
@JonBailey1993 JonBailey1993 force-pushed the data_digests_are_inconsistent_fix branch from 4c0c4f0 to e2b3571 Compare September 22, 2025 15:29
@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@JonBailey1993 JonBailey1993 marked this pull request as draft September 22, 2025 15:41
@JonBailey1993 JonBailey1993 force-pushed the data_digests_are_inconsistent_fix branch from e2b3571 to 8b5e9e6 Compare September 22, 2025 17:05
@JonBailey1993 JonBailey1993 marked this pull request as ready for review September 22, 2025 17:06
@JonBailey1993 JonBailey1993 force-pushed the data_digests_are_inconsistent_fix branch from 8b5e9e6 to c143b5f Compare September 22, 2025 17:15
@JonBailey1993
Contributor Author

Make checks timed out. Will retrigger

@JonBailey1993
Contributor Author

jenkins test make check

@JonBailey1993
Contributor Author

jenkins test make check arm64

@ronen-fr
Contributor

(I hope to complete the review on my next work day - Sunday)

const hobject_t& ho) {
this_chunk->m_ec_digest_map.clear();

if (auth_selection.auth_oi.version != eversion_t() && !m_is_replicated &&
Contributor

Do we really need to repeat all the checks for m_is_replicated in a function named setup_ec_digest_map? Perhaps asserting once would be enough?

Contributor

My suggested changes in #65497 introduced 'm_ec_digest_map_size', which saved us from having to repeat the check.
It might be usable here.

Contributor Author

I can assert that m_is_replicated is false at the start instead and move the check outside the function; that would be more logical. The m_ec_digest_map_size does look like it will improve things here in the future too.
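
A minimal sketch of the refactor discussed in this thread, under stated assumptions: `Backend`, `compare_object` and `setup_calls` are hypothetical and heavily simplified, and plain `assert` stands in for `ceph_assert`. The idea is that the EC-only helper states its precondition once, while the caller decides whether the EC path applies at all.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, heavily simplified stand-in for the scrub backend.
struct Backend {
  bool is_replicated = false;
  std::map<int, std::string> ec_digest_map;
  int setup_calls = 0;  // only here so the sketch can be observed

  // EC-only helper: the precondition is asserted once at the top
  // instead of being re-checked in every branch.
  void setup_ec_digest_map() {
    assert(!is_replicated);  // the real code would use ceph_assert
    ec_digest_map.clear();
    ++setup_calls;
  }

  // The caller owns the replicated-vs-EC decision.
  void compare_object() {
    if (!is_replicated) {
      setup_ec_digest_map();
    }
  }
};
```

With this shape, calling the helper on a replicated pool is a programming error that fails loudly, rather than a case each condition inside the helper has to tolerate.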

Contributor

@ronen-fr ronen-fr left a comment

LGTM, assuming comments (esp. Radek's) are addressed.

@JonBailey1993 JonBailey1993 force-pushed the data_digests_are_inconsistent_fix branch from c143b5f to cbfe3da Compare October 6, 2025 15:27
@JonBailey1993
Contributor Author

All comments should now be addressed/responded to

@JonBailey1993
Contributor Author

jenkins test make check arm64

Contributor

@rzarzynski rzarzynski left a comment

LGTM!

std::string extract_crcs_from_map(const shard_id_map<bufferlist>& map);
std::string extract_crc_from_bufferlist(const bufferlist& crc_buffer);
char retrieve_byte(uint32_t value, uint32_t index) {
return value >> (8 * index) & 0xFF;
Contributor

Hmm, it's surprising we don't have such a primitive in include/intarith.h.

clog.error() << candidates_errors.str();
}

if (!m_is_replicated) {
Contributor

ACK.


  for (uint32_t i = 0; i < sizeof(zero_data_crc); i++) {
-   bl.c_str()[i] = bl[i] ^ ((zero_data_crc >> (8 * i)) & 0xff);
+   bl.c_str()[i] ^= retrieve_byte(zero_data_crc, i);
Contributor

Way more readable.
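
For readers following the diff, here is a small standalone sketch showing that the two loop forms do the same thing. `retrieve_byte` mirrors the helper quoted earlier in the review (with explicit casts added); `Buf`, `xor_inline` and `xor_with_helper` are hypothetical names, and a `std::array` stands in for the 4-byte CRC bufferlist.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors the helper from the diff: byte `index` of `value`,
// counting from the least significant byte.
char retrieve_byte(uint32_t value, uint32_t index) {
  return static_cast<char>(value >> (8 * index) & 0xFF);
}

using Buf = std::array<char, 4>;  // hypothetical stand-in for the CRC buffer

// Old form: shift and mask written inline.
Buf xor_inline(Buf bl, uint32_t crc) {
  for (uint32_t i = 0; i < sizeof(crc); i++) {
    bl[i] = static_cast<char>(bl[i] ^ ((crc >> (8 * i)) & 0xff));
  }
  return bl;
}

// New form: the helper names the operation.
Buf xor_with_helper(Buf bl, uint32_t crc) {
  for (uint32_t i = 0; i < sizeof(crc); i++) {
    bl[i] ^= retrieve_byte(crc, i);
  }
  return bl;
}
```

Both functions XOR the CRC into the buffer one byte at a time, least significant byte first; the second only moves the shift-and-mask into a named function.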

shard_id_set available_shards;

for (const auto& [srd, smap] : this_chunk->received_maps) {
if (!m_is_replicated && m_pg.get_ec_supports_crc_encode_decode() &&
Contributor

Tiny nit: there is no (shouldn't be) way m_is_replicated can change here.


void ScrubBackend::setup_ec_digest_map(auth_selection_t& auth_selection,
const hobject_t& ho) {
ceph_assert(!m_is_replicated);
Contributor

ACK.

Member

@ljflores ljflores left a comment

Hey @JonBailey1993, the teuthology tests still turned up some instances of this error. I ensured the commit was present in these tests. Can you take a look?

/a/skanta-2025-10-09_23:38:36-rados-wip-bharath7-testing-2025-10-09-2128-distro-default-smithi/8543999
/a/skanta-2025-10-09_23:38:36-rados-wip-bharath7-testing-2025-10-09-2128-distro-default-smithi/8543947
/a/skanta-2025-10-09_23:38:36-rados-wip-bharath7-testing-2025-10-09-2128-distro-default-smithi/8543988
/a/skanta-2025-10-09_23:38:36-rados-wip-bharath7-testing-2025-10-09-2128-distro-default-smithi/8544022
/a/skanta-2025-10-09_23:38:36-rados-wip-bharath7-testing-2025-10-09-2128-distro-default-smithi/8544087
/a/skanta-2025-10-09_23:38:36-rados-wip-bharath7-testing-2025-10-09-2128-distro-default-smithi/8543867
/a/skanta-2025-10-09_23:38:36-rados-wip-bharath7-testing-2025-10-09-2128-distro-default-smithi/8543939
/a/skanta-2025-10-09_23:38:36-rados-wip-bharath7-testing-2025-10-09-2128-distro-default-smithi/8543942
/a/skanta-2025-10-09_23:38:36-rados-wip-bharath7-testing-2025-10-09-2128-distro-default-smithi/8544099

Testing ref: https://tracker.ceph.com/issues/73477

@JonBailey1993
Contributor Author

Hey Laura, I'll take a look at this and see what has gone wrong

@github-actions

github-actions bot commented Nov 4, 2025

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@rzarzynski
Contributor

@JonBailey1993: it also needs a rebase.

@JonBailey1993
Contributor Author

I will do the rebase with my fixes, thank you @rzarzynski

It was possible to see "data digests are inconsistent" written to the logs at incorrect times because of multiple bugs. This change reorganises some of the deep scrubbing code and fixes them. The root causes being fixed here are:
* We were comparing crc buffers beyond the end of the crcs
* There was a double call to logical_to_ondisk_size when creating the crcs for zero buffers, causing them to be mis-sized
* The code was not padding smaller shards, although shards must be the same size when used for parity comparison.

All the above are fixed in this commit

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
…sistency of erasure coded pools

Signed-off-by: Jon Bailey <jonathan.bailey1@ibm.com>
@JonBailey1993 JonBailey1993 force-pushed the data_digests_are_inconsistent_fix branch from cbfe3da to 20623d6 Compare November 14, 2025 11:37
@JonBailey1993
Contributor Author

The second issue appears to have been that the changes were not guarded against running during a shallow scrub. I've run some tests on teuthology and no longer see this error occurring erroneously as before. I've also rebased on the latest main.

@JonBailey1993
Contributor Author

jenkins test make check

@ljflores
Member

ljflores commented Dec 2, 2025

Apologies for the delay on approval; I found an issue with a separate PR in the batch. We're rerunning tests and should have fresh results soon.

@github-actions

github-actions bot commented Feb 1, 2026

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Feb 1, 2026
@JonBailey1993
Contributor Author

Not stale. Currently in QA.

@ronen-fr ronen-fr removed the stale label Feb 2, 2026
@SrinivasaBharath SrinivasaBharath merged commit dfa42f1 into ceph:main Feb 5, 2026
13 checks passed
@github-actions

github-actions bot commented Feb 5, 2026

This is an automated message by src/script/redmine-upkeep.py.

I have resolved the following tracker ticket due to the merge of this PR:

No backports are pending for the ticket. If this is incorrect, please update the tracker
ticket and reset to Pending Backport state.

Update Log: https://github.com/ceph/ceph/actions/runs/21731423168

6 participants