Skip to content

rbd: fix DiffIterateByID Length calculation and add callback tests#6200

Merged
mergify[bot] merged 1 commit into
ceph:develfrom
kaovilai:test-snap-diff-callback-coverage
Apr 1, 2026
Merged

rbd: fix DiffIterateByID Length calculation and add callback tests#6200
mergify[bot] merged 1 commit into
ceph:develfrom
kaovilai:test-snap-diff-callback-coverage

Conversation

@kaovilai

Copy link
Copy Markdown
Contributor

Summary

  • Fix: DiffIterateByIDConfig.Length was set to VolSize (total volume size), but rbd_diff_iterate3 treats len as bytes to scan from ofs. When startingOffset > 0, this scanned [startingOffset, startingOffset + VolSize) instead of [startingOffset, VolSize). Fixed by using VolSize - startingOffset.
  • Tests: Add 7 unit tests for createDiffIterateByIDCB covering block collection, LUKS padding adjustment, batching, error propagation, context cancellation, and edge cases.

Note

This project requires CGo (Ceph C libraries librbd.h, librados.h) to compile. The tests cannot be validated on a standard dev machine without these libraries — relying on CI to verify.

Test plan

  • CI passes go test ./internal/rbd/ -run Test_createDiffIterateByIDCB
  • CI passes go test ./internal/rbd/ -run Test_handleDiffIterateError
  • Existing tests unaffected

Ref: kaovilai/cephcsi-cbt-e2e#2

🤖 Generated with Claude Code
via Happy

@kaovilai kaovilai force-pushed the test-snap-diff-callback-coverage branch from a491036 to 41d995b Compare March 24, 2026 22:44
@mergify mergify Bot added component/rbd Issues related to RBD bug Something isn't working labels Mar 24, 2026
@kaovilai

Copy link
Copy Markdown
Contributor Author

Note

Responses generated with Claude

Local Test Results

All tests pass when run in the project's test container:

podman run --rm -v "$(pwd):/go/src/github.com/ceph/ceph-csi" quay.io/cephcsi/cephcsi:test \
  go test -tags='ceph_preview' -mod=vendor -v \
  -run 'Test_createDiffIterateByIDCB|Test_handleDiffIterateError' ./internal/rbd/
--- PASS: Test_createDiffIterateByIDCB (0.00s)
    --- PASS: Test_createDiffIterateByIDCB/collects_single_block_without_LUKS_padding (0.00s)
    --- PASS: Test_createDiffIterateByIDCB/returns_Unknown_code_on_sendResponse_error (0.00s)
    --- PASS: Test_createDiffIterateByIDCB/batches_at_maxResults_and_resets_slice (0.00s)
    --- PASS: Test_createDiffIterateByIDCB/returns_Canceled_on_context_cancellation (0.00s)
    --- PASS: Test_createDiffIterateByIDCB/adjusts_offset_for_LUKS_header_padding (0.00s)
    --- PASS: Test_createDiffIterateByIDCB/zero_blocks_produces_no_batches (0.00s)
    --- PASS: Test_createDiffIterateByIDCB/multiple_batches_with_remainder (0.00s)
--- PASS: Test_handleDiffIterateError (0.00s)
    --- PASS: Test_handleDiffIterateError/nil_error (0.00s)
    --- PASS: Test_handleDiffIterateError/response_send_failure (0.00s)
    --- PASS: Test_handleDiffIterateError/unrecognized_error_code (0.00s)
    --- PASS: Test_handleDiffIterateError/ok_error_code (0.00s)
    --- PASS: Test_handleDiffIterateError/non-ErrorCode_error (0.00s)
    --- PASS: Test_handleDiffIterateError/stream_closed (0.00s)
PASS
ok  	github.com/ceph/ceph-csi/internal/rbd	0.054s

Note: Building the test container requires GOARCH=arm64 on Apple Silicon Macs — the Makefile defaults to GOARCH from go env which may not be set if Go isn't installed locally.

kaovilai added a commit to kaovilai/go-ceph that referenced this pull request Mar 24, 2026
…fset tests

The Offset and Length fields in DiffIterateConfig and DiffIterateByIDConfig
lacked documentation explaining that Length is the number of bytes to scan
from Offset (range [Offset, Offset+Length)), not an absolute end position.
This ambiguity contributed to a bug in ceph-csi (PR #6200) where Length was
set to the full volume size instead of (volumeSize - offset), causing zero
blocks to be returned when using a non-zero offset (e.g. LUKS header skip).

Add doc comments clarifying the semantics and add tests exercising non-zero
Offset values for both DiffIterate and DiffIterateByID, verifying that only
extents within the [Offset, Offset+Length) range are reported.

See: ceph/ceph-csi#6200
See: kaovilai/cephcsi-cbt-e2e#2

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
@kaovilai kaovilai marked this pull request as ready for review March 24, 2026 22:59
Copilot AI review requested due to automatic review settings March 24, 2026 22:59

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes snapshot diff iteration to scan the correct byte range when a non-zero starting offset is used, and adds unit tests around the diff-iterate callback behavior used for metadata streaming.

Changes:

  • Fix DiffIterateByIDConfig.Length to be VolSize - startingOffset (bytes remaining from the starting offset), avoiding scanning beyond the volume end.
  • Add unit tests for createDiffIterateByIDCB covering batching, LUKS padding adjustment, cancellation, and error propagation.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
internal/rbd/snap_diff.go Corrects diff-iterate scan length to reflect remaining bytes from the requested offset.
internal/rbd/snap_diff_test.go Adds callback-focused unit tests for batching/offset adjustment/cancellation/error paths.

Comment thread internal/rbd/snap_diff.go
Comment on lines 99 to 102
diffIterateByIDConfig := librbd.DiffIterateByIDConfig{
Offset: uint64(startingOffset),
Length: uint64(rbdSnap.VolSize),
Length: uint64(rbdSnap.VolSize) - uint64(startingOffset),
Callback: cb,

Copilot AI Mar 24, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new Length calculation fixes the scan range, but the behavior isn’t covered by tests: Test_createDiffIterateByIDCB only validates the callback, and Test_handleDiffIterateError only validates error mapping. Consider adding a small unit test that asserts the computed (Offset, Length) range for a non-zero startingOffset (and with/without LUKS padding). If mocking image.DiffIterateByID is hard, extracting the range computation into a helper would make this directly testable.

Copilot uses AI. Check for mistakes.
@nixpanic

Copy link
Copy Markdown
Member

Looks reasonable to me. Please correct the commit message formatting a little, just use Assisted-by with the name+email of the AI agent. References to issues should be to issues in a project repository, not issues in a personal fork of the repository.

@nixpanic

Copy link
Copy Markdown
Member

golangci-lint fails on internal/rbd/snap_diff_test.go, it would be good to get that corrected. Thanks!

@kaovilai

Copy link
Copy Markdown
Contributor Author

noted

@kaovilai kaovilai force-pushed the test-snap-diff-callback-coverage branch 2 times, most recently from f307ff5 to 6bb479a Compare March 27, 2026 12:28
@kaovilai

Copy link
Copy Markdown
Contributor Author

lint fixed

@kaovilai

Copy link
Copy Markdown
Contributor Author

Created #6208 to assist

@kaovilai kaovilai force-pushed the test-snap-diff-callback-coverage branch 2 times, most recently from 4b8a3a7 to 5f86f08 Compare March 27, 2026 13:15
kaovilai added a commit to kaovilai/go-ceph that referenced this pull request Mar 28, 2026
…fset tests

The Offset and Length fields in DiffIterateConfig and DiffIterateByIDConfig
lacked documentation explaining that Length is the number of bytes to scan
from Offset (range [Offset, Offset+Length)), not an absolute end position.
This ambiguity contributed to a bug in ceph-csi (PR #6200) where Length was
set to the full volume size instead of (volumeSize - offset), causing zero
blocks to be returned when using a non-zero offset (e.g. LUKS header skip).

Add doc comments clarifying the semantics and add tests exercising non-zero
Offset values for both DiffIterate and DiffIterateByID, verifying that only
extents within the [Offset, Offset+Length) range are reported.

See: ceph/ceph-csi#6200
See: kaovilai/cephcsi-cbt-e2e#2

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
Signed-off-by: Tiger Kaovilai <passawit.kaovilai@gmail.com>

@nixpanic nixpanic left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rakshith-R , backport this to the current release?

@Rakshith-R

Copy link
Copy Markdown
Contributor

@Rakshith-R , backport this to the current release?

The feature is still in alpha in current release, I think we skip back-porting this fix.

@kaovilai Can you please check why your commit shows as un-verified on github ?
The changes look good to me otherwise.

@kaovilai

kaovilai commented Mar 30, 2026

Copy link
Copy Markdown
Contributor Author

Because I didn't use the same commit email as the one I have gpg sign setup probably

@kaovilai

Copy link
Copy Markdown
Contributor Author

I can make it verified by fixing committee email to match gpg key. Never found any repos requiring gpg signing before but I guess unverified badge can be spooky. It only happens if it does gpg signing but not using matching committer email.

@nixpanic

nixpanic commented Apr 1, 2026

Copy link
Copy Markdown
Member

I can make it verified by fixing committee email to match gpg key. Never found any repos requiring gpg signing before but I guess unverified badge can be spooky. It only happens if it does gpg signing but not using matching committer email.

I do not think it is required to address for this PR. It probably is something you want to fix for future contributions.

@Madhu-1 Madhu-1 added the ci/skip/e2e skip running e2e CI jobs label Apr 1, 2026
@mergify

mergify Bot commented Apr 1, 2026

Copy link
Copy Markdown
Contributor

Merge Queue Status

  • Entered queue2026-04-01 09:32 UTC · Rule: default
  • Checks passed · in-place
  • Merged2026-04-01 10:09 UTC · at 6717dbdf70bc28ed76904688222a88d39cfbed09

This pull request spent 37 minutes 15 seconds in the queue, including 36 minutes 54 seconds running CI.

Required conditions to merge

Fix Length calculation in DiffIterateByID to exclude startingOffset,
and add unit tests for createDiffIterateByIDCB and handleDiffIterateError.

Assisted-by: Claude (Anthropic) <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <passawit.kaovilai@gmail.com>
@ceph-csi-bot ceph-csi-bot force-pushed the test-snap-diff-callback-coverage branch from 5f86f08 to 6717dbd Compare April 1, 2026 09:32
@Madhu-1

Madhu-1 commented Apr 1, 2026

Copy link
Copy Markdown
Collaborator

@Mergifyio rebase

@mergify mergify Bot added the queued label Apr 1, 2026
@mergify

mergify Bot commented Apr 1, 2026

Copy link
Copy Markdown
Contributor

rebase

☑️ Nothing to do, the required conditions are not met

Details
  • queue-position = -1 [📌 rebase requirement]
  • any of:
    • #commits-behind > 0 [📌 rebase requirement]
    • -linear-history [📌 rebase requirement]
  • -closed [📌 rebase requirement]
  • -conflict [📌 rebase requirement]

@mergify mergify Bot merged commit 2b92dc1 into ceph:devel Apr 1, 2026
19 checks passed
@mergify mergify Bot removed the queued label Apr 1, 2026
@kaovilai

kaovilai commented Apr 1, 2026

Copy link
Copy Markdown
Contributor Author

@nixpanic fixed now. it's honestly just a server side UI thing.

@kaovilai

kaovilai commented Apr 1, 2026

Copy link
Copy Markdown
Contributor Author

See that prior committed commits now show verified.

kaovilai added a commit to kaovilai/go-ceph that referenced this pull request Apr 16, 2026
…fset tests

The Offset and Length fields in DiffIterateConfig and DiffIterateByIDConfig
lacked documentation explaining that Length is the number of bytes to scan
from Offset (range [Offset, Offset+Length)), not an absolute end position.
This ambiguity contributed to a bug in ceph-csi (PR #6200) where Length was
set to the full volume size instead of (volumeSize - offset), causing zero
blocks to be returned when using a non-zero offset (e.g. LUKS header skip).

Add doc comments clarifying the semantics and add tests exercising non-zero
Offset values for both DiffIterate and DiffIterateByID, verifying that only
extents within the [Offset, Offset+Length) range are reported.

See: ceph/ceph-csi#6200
See: kaovilai/cephcsi-cbt-e2e#2

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Signed-off-by: Tiger Kaovilai <passawit.kaovilai@gmail.com>
kaovilai added a commit to kaovilai/go-ceph that referenced this pull request Apr 16, 2026
…fset tests

The Offset and Length fields in DiffIterateConfig and DiffIterateByIDConfig
lacked documentation explaining that Length is the number of bytes to scan
from Offset (range [Offset, Offset+Length)), not an absolute end position.
This ambiguity contributed to a bug in ceph-csi (PR #6200) where Length was
set to the full volume size instead of (volumeSize - offset), causing zero
blocks to be returned when using a non-zero offset (e.g. LUKS header skip).

Add doc comments clarifying the semantics and add tests exercising non-zero
Offset values for both DiffIterate and DiffIterateByID, verifying that only
extents within the [Offset, Offset+Length) range are reported.

See: ceph/ceph-csi#6200
See: kaovilai/cephcsi-cbt-e2e#2

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Signed-off-by: Tiger Kaovilai <passawit.kaovilai@gmail.com>
kaovilai added a commit to kaovilai/go-ceph that referenced this pull request Apr 16, 2026
…fset tests

The Offset and Length fields in DiffIterateConfig and DiffIterateByIDConfig
lacked documentation explaining that Length is the number of bytes to scan
from Offset (range [Offset, Offset+Length)), not an absolute end position.
This ambiguity contributed to a bug in ceph-csi (PR #6200) where Length was
set to the full volume size instead of (volumeSize - offset), causing zero
blocks to be returned when using a non-zero offset (e.g. LUKS header skip).

Add doc comments clarifying the semantics and add tests exercising non-zero
Offset values for both DiffIterate and DiffIterateByID, verifying that only
extents within the [Offset, Offset+Length) range are reported.

See: ceph/ceph-csi#6200
See: kaovilai/cephcsi-cbt-e2e#2

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Signed-off-by: Tiger Kaovilai <passawit.kaovilai@gmail.com>
mergify Bot pushed a commit to ceph/go-ceph that referenced this pull request Apr 17, 2026
…fset tests

The Offset and Length fields in DiffIterateConfig and DiffIterateByIDConfig
lacked documentation explaining that Length is the number of bytes to scan
from Offset (range [Offset, Offset+Length)), not an absolute end position.
This ambiguity contributed to a bug in ceph-csi (PR #6200) where Length was
set to the full volume size instead of (volumeSize - offset), causing zero
blocks to be returned when using a non-zero offset (e.g. LUKS header skip).

Add doc comments clarifying the semantics and add tests exercising non-zero
Offset values for both DiffIterate and DiffIterateByID, verifying that only
extents within the [Offset, Offset+Length) range are reported.

See: ceph/ceph-csi#6200
See: kaovilai/cephcsi-cbt-e2e#2

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Signed-off-by: Tiger Kaovilai <passawit.kaovilai@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working ci/skip/e2e skip running e2e CI jobs component/rbd Issues related to RBD

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants