Skip to content

os/bluestore: write_v2 resharding misplaced#61990

Merged
aclamk merged 5 commits intoceph:mainfrom
aclamk:wip-aclamk-write-v2-dirty-reshard
Mar 17, 2025
Merged

os/bluestore: write_v2 resharding misplaced#61990
aclamk merged 5 commits intoceph:mainfrom
aclamk:wip-aclamk-write-v2-dirty-reshard

Conversation

@aclamk
Copy link
Contributor

@aclamk aclamk commented Feb 25, 2025

We should mark dirty and signal for reshard region before we trim the data by _try_reuse_allocated.
Punching hole and writing to already existing blobs is still modification.

The problem was detected using:
./bin/ceph_test_objectstore --log-to-stderr=false --bluestore_write_v2=true --plugin_dir=./lib --gtest_filter=\*SyntheticMatrixShardingLimited\*

Problem was introduced in #61762

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@aclamk aclamk requested a review from a team as a code owner February 25, 2025 10:06
@aclamk aclamk requested review from liu-chunmei and removed request for a team February 25, 2025 10:06
@aclamk
Copy link
Contributor Author

aclamk commented Feb 25, 2025

jenkins test make check

@liu-chunmei
Copy link
Contributor

liu-chunmei commented Feb 25, 2025

@aclamk the functions calling sequence is not changed, before it calls wr.do_write, then dirty_range, then maybe_reshard, 61762 just move dirty_range and maybe_reshard at the end of wr.do_write in order to use aligned offset, the only difference is using aligned offset and length. so, address alignment cause something wrong?

I have verified, same place (in do_write_v2 call dirty_range and maybe_reshard) but use original offset and length, then it works. and only dirty_range need original offset and length, maybe_reshard can use aligned location and data_end. I don't know why dirty_range cause this test case crush by aligned address.

got it, location and data_end adjsuted are smaller than original offset~length, should add try_reuse_allocated part.

@github-actions
Copy link

github-actions bot commented Mar 4, 2025

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

@aclamk aclamk force-pushed the wip-aclamk-write-v2-dirty-reshard branch from 4d6ddd8 to 04d4baa Compare March 5, 2025 10:08
@aclamk aclamk added the DNM label Mar 5, 2025
@aclamk aclamk force-pushed the wip-aclamk-write-v2-dirty-reshard branch from 04d4baa to 0a49a7c Compare March 5, 2025 12:14
@aclamk aclamk removed the DNM label Mar 5, 2025
@aclamk aclamk force-pushed the wip-aclamk-write-v2-dirty-reshard branch from 0a49a7c to 69e7c15 Compare March 5, 2025 16:23
o->extent_map.dirty_range(offset, length);
o->extent_map.maybe_reshard(offset, offset + length);
o->extent_map.dirty_range(wr.left_shard_bound, wr.right_shard_bound - wr.left_shard_bound);
o->extent_map.maybe_reshard(wr.left_shard_bound, wr.right_shard_bound);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this fix to second commit.

@liu-chunmei
Copy link
Contributor

can we rearrange the commits into 3? the first one: fault_range_ex include determine the left/right bound and dirty_range reshard input parameters modification, also the place moved from Writer.cc to Bluestore.cc. The second one : remove extentmap compress, include the extent merge calculation, the third one: calculate the left/right affect bound and modify dirty_range and reshard input parameters.

@aclamk aclamk force-pushed the wip-aclamk-write-v2-dirty-reshard branch from 69e7c15 to 966e700 Compare March 6, 2025 20:58
emap.insert(*le);
}
bstore->logger->inc(l_bluestore_write_small);
bstore->logger->inc(l_bluestore_write_small_bytes, ref_end - logical_offset);
Copy link
Contributor

@liu-chunmei liu-chunmei Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when we check left, we also need check right, for example, there is an extent [4K, 60K], when we write[9K,2K], punch_hole will make extent looks like [4K, 4K] [12K, 52K], we will inset [8K, 4K], we only combined with left make it [4K, 8K], but right one is [12K,52K], we don't combine with right extent. I think we can reuse extent_map.compress_extent_map by left_affect_range and right_affect_range that logic is completely and make code easily readable.

@aclamk aclamk requested a review from liu-chunmei March 11, 2025 13:48
@aclamk aclamk added the aclamk-testing-phoebe bluestore testing label Mar 11, 2025
Add fault_range_ex, a sibling to fault_range.
The new function does exactly the same as the original.
The difference is the returned range <start, end> that reflects
range encompassed by shards affected.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
@aclamk aclamk force-pushed the wip-aclamk-write-v2-dirty-reshard branch from 4305224 to 34574a4 Compare March 11, 2025 18:56
@liu-chunmei
Copy link
Contributor

liu-chunmei commented Mar 11, 2025

analysis for performance:

image

image


aclamk added 4 commits March 13, 2025 06:49
Remove need to call compress_extents.
The only place that could produce compressable extents are
_try_reuse_allocated_[l/r] functions.

Create _place_extent_in_blob() function that adds mapping to blob.
If possible, it expands the extent; if not, creates the new one.
The function is used whenever we need new mapping to an existing Blob.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
This reduces size for dirty_range and need_reshard range.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
There was possible case that affected rightmost changed extent.
It could be that 2 mergeable extents were next to each other.
This change melds them together.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
It could happen that target blob already had empty mapping at the end.
In such case, do not add more empty mappings, just expand exising one.

Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
@aclamk aclamk force-pushed the wip-aclamk-write-v2-dirty-reshard branch from 34574a4 to 71e094f Compare March 13, 2025 11:55
@aclamk
Copy link
Contributor Author

aclamk commented Mar 17, 2025

Passed QA at https://tracker.ceph.com/issues/70440 .

@aclamk aclamk merged commit 8cacc72 into ceph:main Mar 17, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants