
rgw/bucket-logging: support EC pools #65900

Merged
yuvalif merged 2 commits into ceph:main from nbalacha:wip-nbalacha-71365 on Dec 2, 2025

nbalacha (Contributor) commented Oct 12, 2025

Log buckets can now be created within erasure-coded (EC) pools.
To support append operations, a temporary log record object is initially
created in the replicated default.rgw.log pool. This object is then copied
to the EC pool upon log record commitment.
All implicit log commit operations will execute asynchronously. A new
BucketLoggingManager class is responsible for processing these pending
commits at set intervals. Explicit commit operations, however, will
continue to be performed synchronously.
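
The flow described above can be sketched as a small in-memory model. This is an illustration only, not the RGW implementation: `Pool`, `LoggingModel`, and all of their methods are hypothetical names, and the rule that the toy EC pool rejects appends is a simplifying assumption used to motivate the staging step.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Pool:
    """Toy stand-in for a RADOS pool (hypothetical, not a Ceph API)."""
    name: str
    supports_append: bool
    objects: dict[str, bytes] = field(default_factory=dict)

    def append(self, key: str, data: bytes) -> None:
        # Assumption of this model: only the replicated pool accepts appends.
        if not self.supports_append:
            raise IOError(f"pool {self.name} does not support append")
        self.objects[key] = self.objects.get(key, b"") + data

    def write_full(self, key: str, data: bytes) -> None:
        self.objects[key] = data


class LoggingModel:
    """Stage records in a replicated pool, copy them to the EC pool on commit."""

    def __init__(self, staging: Pool, target: Pool) -> None:
        self.staging = staging
        self.target = target
        self.pending: list[str] = []  # implicit commits waiting for the manager

    def log_record(self, obj: str, record: bytes) -> None:
        # Appends always go to the temporary object in the staging pool.
        self.staging.append(obj, record)

    def commit(self, obj: str) -> None:
        # Copy the whole temporary object to the target pool, then drop it.
        data = self.staging.objects.pop(obj)
        self.target.write_full(obj, data)

    def rollover(self, obj: str, explicit: bool) -> None:
        if explicit:
            self.commit(obj)          # explicit commits stay synchronous
        else:
            self.pending.append(obj)  # implicit commits are deferred

    def process_pending(self) -> int:
        """One manager tick: commit everything queued so far."""
        done = 0
        while self.pending:
            self.commit(self.pending.pop(0))
            done += 1
        return done
```

In this model, an implicit rollover leaves the object in the staging pool until `process_pending()` runs, whereas an explicit rollover makes it visible in the target pool before `rollover()` returns.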

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)

Comment thread src/rgw/driver/rados/rgw_bucketlogging.cc Outdated
nbalacha requested review from a team and yuvalif on October 13, 2025 03:12
@nbalacha (Contributor, Author)

@yuvalif, please take a look at the approach and let me know what you think.
This is not ready for review.

Pending tasks:

  1. Check whether the log bucket is on an EC pool, and use that to determine where to create the log record object.
  2. Clean up the bucket logging list objects when logging is disabled for a bucket.
  3. Remove the debug "NITHYA" log messages.

@nbalacha (Contributor, Author)

Pending tasks:

  1. Code refactoring and cleanup (file names, logs, namespaces)
  2. EC pool tests
  3. Additional error handling in the BucketLoggingManager

nbalacha force-pushed the wip-nbalacha-71365 branch 2 times, most recently from 9192f62 to 44c07a6, on October 31, 2025 15:34
@anthonyeleven (Contributor)

Please select a line under the Tracker section of the checklist.
If you rebase, the RTD failure should resolve.

nbalacha force-pushed the wip-nbalacha-71365 branch 3 times, most recently from 01997e0 to defee46, on November 4, 2025 08:55
Comment thread src/rgw/rgw_bucket_logging.cc
Comment thread src/rgw/rgw_bucket_logging.cc Outdated
Comment thread src/rgw/rgw_sal.h
Comment thread src/rgw/rgw_sal.h Outdated
Comment thread src/rgw/rgw_zone.cc Outdated
Comment thread src/rgw/driver/rados/rgw_bucketlogging.h Outdated
Comment thread src/rgw/driver/rados/rgw_bl_rados.cc
nbalacha force-pushed the wip-nbalacha-71365 branch 11 times, most recently from a6c7445 to 490c4fe, on November 27, 2025 12:09
Comment thread src/rgw/driver/rados/rgw_sal_rados.cc Outdated
Comment thread src/rgw/driver/rados/rgw_sal_rados.cc
Log buckets can now be created within erasure-coded (EC) pools.
To support append operations, a temporary log record object is initially
created in the replicated default.rgw.log pool. This object is then copied
to the EC pool upon log record commitment.
All implicit log commit operations will execute asynchronously. A new
BucketLoggingManager class is responsible for processing these pending
commits at set intervals. Explicit commit operations, however, will
continue to be performed synchronously.

Fixes: https://tracker.ceph.com/issues/71365

Signed-off-by: Nithya Balachandran <nithya.balachandran@ibm.com>
Run the rgw bucket logging teuthology tests on an erasure coded pool.

Signed-off-by: Nithya Balachandran <nithya.balachandran@ibm.com>
yuvalif self-requested a review on December 1, 2025 13:37
yuvalif (Contributor) commented Dec 1, 2025

jenkins test submodules

yuvalif (Contributor) commented Dec 1, 2025

followup work (to be done in other PRs):

  • remove the must_commit parameter
    • in case of -EFBIG we should not return an error. this is done post rollover, and new records will be written to the new object (so going over the size limit should not be an issue)
    • when doing explicit rollovers (conf change or flush) we should perform the commit synchronously, since we are expected to reply with the "last committed" name. however, these operations are not idempotent, and client retries will not fix failed commits, so, in case of a failed commit, we should not reply with an error
    • optionally, we may perform an async commit if the sync commit failed
  • add a bucket logging debug subsystem for all related code (e.g. rgw_bucket_logging.cc)
  • call read_global_logging_list() in chunks, instead of reading the entire list
  • use regex in parse_list_names()
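
As a rough illustration of the "read in chunks" item above, the sketch below pages through a listing with a marker instead of loading it all at once. Everything here is hypothetical: `list_page`, `iter_in_chunks`, and `make_fake_pager` are made-up names, and the `(marker, max_entries) -> (entries, next_marker, truncated)` shape is an assumed paging contract, not the signature of `read_global_logging_list()`.

```python
import bisect
from typing import Callable, Iterator, List, Tuple

# Assumed paging contract: (marker, max_entries) -> (entries, next_marker, truncated)
Pager = Callable[[str, int], Tuple[List[str], str, bool]]


def iter_in_chunks(list_page: Pager, chunk_size: int = 1000) -> Iterator[List[str]]:
    """Yield the listing one chunk at a time until the pager reports no more data."""
    marker = ""
    while True:
        entries, marker, truncated = list_page(marker, chunk_size)
        if entries:
            yield entries
        if not truncated:
            return


def make_fake_pager(names: List[str]) -> Pager:
    """Build an in-memory pager over sorted names, for demonstration only."""
    ordered = sorted(names)

    def list_page(marker: str, max_entries: int) -> Tuple[List[str], str, bool]:
        # Resume strictly after the last entry returned by the previous call.
        start = bisect.bisect_right(ordered, marker) if marker else 0
        page = ordered[start:start + max_entries]
        truncated = start + len(page) < len(ordered)
        next_marker = page[-1] if page else marker
        return page, next_marker, truncated

    return list_page
```

A caller would process each chunk as it arrives, for example `for chunk in iter_in_chunks(make_fake_pager(names), chunk_size=2): ...`, so memory use is bounded by the chunk size rather than by the full list.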

yuvalif (Contributor) commented Dec 2, 2025

bucket logging tests are passing: https://pulpito.ceph.com/nithyab-2025-12-01_15:29:39-rgw:bucket-logging-wip-nbalacha-71365-distro-default-smithi/
rgw regression has 19 failures: https://pulpito.ceph.com/nithyab-2025-12-01_16:01:59-rgw-wip-nbalacha-71365-distro-default-smithi/

  • d4n - known
  • multisite - known
  • valgrind error: SyscallParam sendmsg AsyncConnection::_try_send(bool) - known
  • 'sudo yum -y install ceph-osd-classic'/Unable to locate package ceph-osd-classic - on centos/ubuntu; looks like a test infra issue
  • File "/home/teuthworker/src/git.ceph.com_ceph-c_05918a1c67d8532b8d46e9abfd56ca9bbffe75f6/qa/tasks/radosgw_admin.py", line 993, in task assert(len(r) == 0) - known issue

yuvalif merged commit 822f175 into ceph:main on Dec 2, 2025
19 checks passed
github-actions (Bot) commented Dec 2, 2025

This is an automated message by src/script/redmine-upkeep.py.

I found one or more Fixes: tags in the commit messages in

git log 822f175f606c694e72519d9b6182e01be9b1d238^..822f175f606c694e72519d9b6182e01be9b1d238

The referenced tickets are:

Those tickets do not reference this merged Pull Request. If this Pull Request merge resolves any of those tickets, please update the "Pull Request ID" field on each ticket. A future run of this script will appropriately update them.

Update Log: https://github.com/ceph/ceph/actions/runs/19858758035


4 participants