Skip to content

cephfs: add support for cephfs file block diff api#1191

Merged
mergify[bot] merged 1 commit into
ceph:masterfrom
iPraveenParihar:cephfs-blockdiff
Jan 16, 2026
Merged

cephfs: add support for cephfs file block diff api#1191
mergify[bot] merged 1 commit into
ceph:masterfrom
iPraveenParihar:cephfs-blockdiff

Conversation

@iPraveenParihar

Copy link
Copy Markdown
Contributor

cephfs: add support for cephfs file block diff api

This commit adds support for cephfs file block diff
APIs which gives a list of changed blocks with in a
file between two snapshots.

Checklist

  • Added tests for features and functional changes
  • Public functions and types are documented
  • Standard formatting is applied to Go code
  • Is this a new API? Added a new file that begins with //go:build ceph_preview
  • Ran make api-update to record new APIs

New or infrequent contributors may want to review the go-ceph Developer's Guide including the section on how we track API Status and the API Stability Plan.

The go-ceph project uses mergify. View the mergify command guide for information on how to interact with mergify. Add a comment with @Mergifyio rebase to rebase your PR when github indicates that the PR is out of date with the base branch.

@mergify

mergify Bot commented Oct 27, 2025

Copy link
Copy Markdown
Contributor

This pull request now has conflicts with the target branch.
Could you please resolve conflicts and force push the corrected
changes? 🙏

@iPraveenParihar iPraveenParihar marked this pull request as ready for review October 27, 2025 14:01
@iPraveenParihar

Copy link
Copy Markdown
Contributor Author

@anoopcs9 PTAL

Comment thread cephfs/block_diff.go Outdated
// struct ceph_file_blockdiff_changedblocks* blocks);
//
// void ceph_free_file_blockdiff_buffer(struct ceph_file_blockdiff_changedblocks* blocks);
func (info *FileBlockDiffInfo) ReadFileBlockDiff() (bool, *FileBlockDiffChangedBlocks, error) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any case where the bool is true and the pointer is nil? If not the bool seems a bit redundant. Is there something I'm overlooking?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phlogistonjohn, no we don't have such a case.
bool will be true if there are more changed blocks entries and is such a case the pointer will never be nil.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, it seems like we can then simply the return types by dropping the boolean and having the callers test for nil/non-nil in all the cases that were previously using the boolean.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pointer points to the current set of changed blocks (nil on error), while the boolean clearly indicates if more entries remain. Without the boolean, callers would need to infer that from the pointer and error, which makes the flow less clear.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, it seems like we can then simply the return types by dropping the boolean and having the callers test for nil/non-nil in all the cases that were previously using the boolean.

The boolean is used by cephfs function to indicate if the user needs to call ReadFileBlockDiff again for more entries.
We should preserve the behavior intended by cephfs team here.

@vshankar had pointed to this test case in ceph repo so that I could understand the intended usage.
Hope it helps clear the confusion.

https://github.com/ceph/ceph/blob/060b69c6605d88ad06bd1a3b81697abff2bd4ca5/src/test/libcephfs/snapdiff.cc#L410-L430

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, I'm even more confused. There's no boolean in the api we're wrapping it returns ret (an int) and info (input) and blocks (output) arguments.
No bool there at all.

Without the boolean, callers would need to infer that from the pointer and error, which makes the flow less clear

This doesn't seem right to me. Error indicates error, period. At that point the loop should stop and handle the error. If error is not nil the other return value(s) are irrelevant. If there is no error then the client can check the nil-ity of the blocks value or not to determine if the loop needs to continue or not. The extra bool value is redundant.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ret(an int) indicated by r is used in the following manner:

  • r > 0 : there will be more changed blocks returned in the next call
  • r == 0: no more changed blocks to be returned
  • r < 0: there's an errorr

It is confusing but that's how the return value of the function ceph_file_blockdiff is meant to be used.

The following is testcase for this function from https://github.com/ceph/ceph/blob/060b69c6605d88ad06bd1a3b81697abff2bd4ca5/src/test/libcephfs/snapdiff.cc#L410-L430

    r = 1;
    while (r > 0) {
      ceph_file_blockdiff_changedblocks blocks;
      r = ceph_file_blockdiff(&info, &blocks);
      if (r < 0) {
	std::cerr << " Failed to get next changed block, ret:" << r << std::endl;
	return r;
      }

      int nr_blocks = blocks.num_blocks;
      struct cblock *b = blocks.b;
      while (nr_blocks > 0) {
	std::cout << " == [" << b->offset << "~" << b->len << "] == " << std::endl;
	if (expected) {
	  expected->erase(b->offset, b->len);
	}
	++b;
	--nr_blocks;
      }

      ceph_free_file_blockdiff_buffer(&blocks);
    }

In go-ceph implementation to make it simpler,
a boolean is used to indicate if the next function call needs to be made or not.

	if ret < 0 {
		return false, nil, getError(ret)
	}
	// if ret == 0 indicates there is no more entries after this call.
	noMoreEntries = (ret == 0)

Again,

We should preserve the behavior intended by cephfs team here.

Making any change to the behavior at this go-ceph binding layer may lead to unexpected behavior later on.

@anoopcs9 anoopcs9 Nov 13, 2025

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if we go for an extra round I hope we’ll discover that num_blocks is 0 and/or b is NULL which I guess is what John was hinting at.

just another thought: Can we think of a More() method to basically return whether more processing is required or not? Let's say we have a boolean noMoreEntries inside FileBlockDiffInfo which is set to true during init and subsequently updated during read based on the return code for ceph_file_blockdiff() inside ReadFileBlockDiff().

Comment thread cephfs/block_diff_test.go Outdated
Comment thread cephfs/block_diff_test.go Outdated
Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go Outdated
// struct ceph_file_blockdiff_changedblocks* blocks);
//
// void ceph_free_file_blockdiff_buffer(struct ceph_file_blockdiff_changedblocks* blocks);
func (info *FileBlockDiffInfo) ReadFileBlockDiff() (bool, *FileBlockDiffChangedBlocks, error) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, I'm even more confused. There's no boolean in the api we're wrapping it returns ret (an int) and info (input) and blocks (output) arguments.
No bool there at all.

Without the boolean, callers would need to infer that from the pointer and error, which makes the flow less clear

This doesn't seem right to me. Error indicates error, period. At that point the loop should stop and handle the error. If error is not nil the other return value(s) are irrelevant. If there is no error then the client can check the nil-ity of the blocks value or not to determine if the loop needs to continue or not. The extra bool value is redundant.

Comment thread cephfs/block_diff.go Outdated
@iPraveenParihar iPraveenParihar force-pushed the cephfs-blockdiff branch 3 times, most recently from 47aa6ad to 8120dc3 Compare November 11, 2025 06:12
Comment thread cephfs/block_diff.go
Comment thread cephfs/block_diff.go
Comment thread cephfs/block_diff.go
Comment thread cephfs/block_diff.go Outdated
@iPraveenParihar

Copy link
Copy Markdown
Contributor Author

@nixpanic @anoopcs9 PTAL.
Thanks! 😄

@iPraveenParihar iPraveenParihar force-pushed the cephfs-blockdiff branch 2 times, most recently from 1fe9669 to 9a3b95f Compare November 13, 2025 05:42
@iPraveenParihar

Copy link
Copy Markdown
Contributor Author

rebased the PR now.
Could I please have a review again?

Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go
Comment thread cephfs/block_diff.go Outdated
// struct ceph_file_blockdiff_changedblocks* blocks);
//
// void ceph_free_file_blockdiff_buffer(struct ceph_file_blockdiff_changedblocks* blocks);
func (info *FileBlockDiffInfo) ReadFileBlockDiff() (bool, *FileBlockDiffChangedBlocks, error) {

@anoopcs9 anoopcs9 Nov 13, 2025

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if we go for an extra round I hope we’ll discover that num_blocks is 0 and/or b is NULL which I guess is what John was hinting at.

just another thought: Can we think of a More() method to basically return whether more processing is required or not? Let's say we have a boolean noMoreEntries inside FileBlockDiffInfo which is set to true during init and subsequently updated during read based on the return code for ceph_file_blockdiff() inside ReadFileBlockDiff().

@iPraveenParihar iPraveenParihar force-pushed the cephfs-blockdiff branch 2 times, most recently from 966d203 to 6d97db5 Compare November 14, 2025 08:17
@iPraveenParihar

Copy link
Copy Markdown
Contributor Author

@anoopcs9, addressed your reviews.
PTAL. Thanks!

Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go
Comment thread cephfs/block_diff_test.go Outdated
Comment thread cephfs/block_diff_test.go Outdated
Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go Outdated
@iPraveenParihar iPraveenParihar force-pushed the cephfs-blockdiff branch 2 times, most recently from 242586b to 7585456 Compare December 15, 2025 10:34
@anoopcs9

anoopcs9 commented Jan 6, 2026

Copy link
Copy Markdown
Collaborator

@Mergifyio rebase

@mergify

mergify Bot commented Jan 6, 2026

Copy link
Copy Markdown
Contributor

rebase

✅ Branch has been successfully rebased

Comment thread cephfs/block_diff.go Outdated
Comment thread cephfs/block_diff.go
Comment thread cephfs/block_diff.go
@mergify

mergify Bot commented Jan 9, 2026

Copy link
Copy Markdown
Contributor

This pull request now has conflicts with the target branch.
Could you please resolve conflicts and force push the corrected
changes? 🙏

@anoopcs9 anoopcs9 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, your latest update seems to have overwritten the space/tab alignment fixes I had pushed earlier. Please make sure that you fetch the latest version from PR before making any changes.

Comment thread cephfs/block_diff.go
@iPraveenParihar iPraveenParihar force-pushed the cephfs-blockdiff branch 2 times, most recently from 7402a21 to faaf936 Compare January 13, 2026 10:14
@anoopcs9 anoopcs9 force-pushed the cephfs-blockdiff branch 2 times, most recently from 0c6bd7f to 289b473 Compare January 13, 2026 10:54

@anoopcs9 anoopcs9 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve added a comment about the extra care needed for struct ceph_file_blockdiff_result.

Thanks for your patience over the past months, it’s really appreciated.

@anoopcs9 anoopcs9 added the API This PR includes a change to the public API of a go-ceph package label Jan 13, 2026
@phlogistonjohn

Copy link
Copy Markdown
Collaborator

@Mergifyio rebase

@mergify

mergify Bot commented Jan 13, 2026

Copy link
Copy Markdown
Contributor

rebase

✅ Branch has been successfully rebased

@anoopcs9

Copy link
Copy Markdown
Collaborator

@Mergifyio rebase

This commit adds support for cephfs file block diff APIs which gives a
list of changed blocks with in a file between two snapshots.

Co-authored-by: Praveen M <m.praveen@ibm.com>
Signed-off-by: Rakshith R <rar@redhat.com>
@mergify

mergify Bot commented Jan 14, 2026

Copy link
Copy Markdown
Contributor

rebase

✅ Branch has been successfully rebased

@phlogistonjohn phlogistonjohn left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks

@mergify mergify Bot added the queued label Jan 16, 2026
@mergify mergify Bot merged commit 9343cc2 into ceph:master Jan 16, 2026
17 checks passed
@mergify

mergify Bot commented Jan 16, 2026

Copy link
Copy Markdown
Contributor

Merge Queue Status

✅ The pull request has been merged at 21934d9

This pull request spent 6 seconds in the queue, with no time running CI.
The checks were run in-place.

Required conditions to merge
  • #approved-reviews-by >= 1 [🛡 GitHub branch protection]
  • #changes-requested-reviews-by = 0 [🛡 GitHub branch protection]
  • any of [🛡 GitHub branch protection]:
    • check-success = check
    • check-neutral = check
    • check-skipped = check
  • any of [🛡 GitHub branch protection]:
    • check-success = test-suite (pacific)
    • check-neutral = test-suite (pacific)
    • check-skipped = test-suite (pacific)
  • any of [🛡 GitHub branch protection]:
    • check-success = test-suite (quincy)
    • check-neutral = test-suite (quincy)
    • check-skipped = test-suite (quincy)
  • any of [🛡 GitHub branch protection]:
    • check-success = test-suite (reef)
    • check-neutral = test-suite (reef)
    • check-skipped = test-suite (reef)
  • any of [🛡 GitHub branch protection]:
    • check-success = test-suite (squid)
    • check-neutral = test-suite (squid)
    • check-skipped = test-suite (squid)

@mergify mergify Bot removed the queued label Jan 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

API This PR includes a change to the public API of a go-ceph package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants