crimson/osd: execute `submit_error_log` as an ExclusivePhase by Matan-B · Pull Request #54287 · ceph/ceph

Matan-B · 2023-11-01T10:31:19Z

~~Blocked by: #54040~~

Introduced: #39772

Fixes: https://tracker.ceph.com/issues/61651


Matan-B · 2023-11-01T10:44:54Z

https://shaman.ceph.com/builds/ceph/wip-matanb-crimson-do_osd_ops_execute-v3/


Matan-B · 2023-11-07T11:21:02Z

@athanatos, the call to error_log_fut was moved to do_osd_ops_execute and the original split between do_osd_ops_execute and submitted_fut is reverted. The diff is much cleaner now. ~~However, IIUC, it still doesn't solve the issue described in b02676b.~~

Matan-B · 2023-11-07T16:49:36Z

src/crimson/osd/pg.cc

+      return (*error_func_ptr)(e, rep_tid).then(
+      [failure_func_ptr, e, rep_tid] {
+        return PG::do_osd_ops_iertr::make_ready_future<pg_rep_op_fut_t<Ret>>(
+          std::move(seastar::now()),
+          std::move((*failure_func_ptr)(e, rep_tid))


1/2: This part lets submit_error_log run from within do_osd_ops_execute, which resolves the original issue.

This one is chained after the future returned from submit_transaction, which would otherwise resolve into the submitted and completed futures. As you observe, I think this is the one that resolves the bug.

It seems to me that error_func_ptr could return an eversion_t, pass it into failure_func_ptr, and thereby avoid the log_entry_version map.

If we retain the log_entry_version map, it probably needs to be cleared during interval change.

> It seems to me that error_func_ptr could return an eversion_t, pass it into failure_func_ptr, and thereby avoid the log_entry_version map.

After the last commit, ~~it is no longer possible.~~ (It's still possible in a way, but much less elegant than before)

> If we retain the log_entry_version map, it probably needs to be cleared during interval change.

Fixed.
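The two ways of handing the log entry version from the error-log step to the failure step can be compared in a minimal sketch. This is not the Crimson code: `Version` is a simplified stand-in for `eversion_t`, `MapHandoff` and `chain_handoff` are hypothetical names, and futures are replaced by plain function calls. Only `log_entry_version`, `submit_error_log` and `clear_log_entry_maps` are names taken from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-ins; the real types live in the Ceph tree.
using tid_t = std::uint64_t;
struct Version {  // simplified stand-in for eversion_t
  std::uint64_t epoch = 0;
  std::uint64_t v = 0;
  bool operator==(const Version& o) const {
    return epoch == o.epoch && v == o.v;
  }
};

// Variant 1: a side table keyed by rep_tid (the approach this PR keeps).
class MapHandoff {
  std::map<tid_t, Version> log_entry_version;
public:
  // The error-log step records the version it produced.
  void submit_error_log(tid_t rep_tid, Version ver) {
    log_entry_version[rep_tid] = ver;
  }
  // The failure step looks the version up and erases the entry.
  Version complete(tid_t rep_tid) {
    auto it = log_entry_version.find(rep_tid);
    assert(it != log_entry_version.end());
    Version ver = it->second;
    log_entry_version.erase(it);
    return ver;
  }
  // Needed on interval change so that stale entries do not accumulate.
  void clear_log_entry_maps() { log_entry_version.clear(); }
  std::size_t pending() const { return log_entry_version.size(); }
};

// Variant 2: the error step returns the version and the failure step
// receives it as an argument, so no shared map is needed.
template <class ErrorFn, class FailureFn>
auto chain_handoff(ErrorFn&& error_fn, FailureFn&& failure_fn, tid_t rep_tid) {
  Version ver = error_fn(rep_tid);
  return failure_fn(rep_tid, ver);
}
```

The map variant needs explicit cleanup (erase on completion, clear on interval change), while the chained variant carries the value in the continuation itself; which one is cleaner in the real code depends on how the futures are split, as discussed above.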

Matan-B · 2023-11-07T16:52:25Z

src/crimson/osd/pg.cc

+          [this, e, failure_func_ptr, error_func_ptr] {
+            auto rep_tid = shard_services.get_tid();
+            return (*error_func_ptr)(e, rep_tid)
+            .then([failure_func_ptr, e, rep_tid] {
+              return (*failure_func_ptr)(e , rep_tid);
+            });


2/2: The second submit_error_log call is also run from within do_osd_ops_execute, so it should be safe as well. Even though it's packed inside all_completed_fut below (in case of an error), it will be executed before returning pg_rep_op_fut_t to the caller.

Hmm, this handler is actually chained after _all_complete_fut and could therefore happen after we leave the exclusive phase. On the other hand, I think this case is actually impossible -- any error here would have to come from the _all_completed_fut returned from submit_transaction, which would have to be an error with the actual transaction submitted to the backing store or somehow with the messages received from peers. I think any error here could be treated as fatal to the OSD.

> Hmm, this handler is actually chained after _all_complete_fut and could therefore happen after we leave the exclusive phase. On the other hand, I think this case is actually impossible -- any error here would have to come from the _all_completed_fut returned from submit_transaction, which would have to be an error with the actual transaction submitted to the backing store or somehow with the messages received from peers. I think any error here could be treated as fatal to the OSD.

This case was possible since there were non-fatal errors that could have been returned as well. I added an assert to verify that we handle only those.
Moreover, I moved the call to error_func_ptr earlier, before actually returning the non-fatal errors. That way, it will be executed during the exclusive phase.
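Why the placement of the call matters can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Crimson code: `Op`, `log_in_exclusive_phase` and `log_after_completion` are hypothetical names, and the concurrency of completions is reduced to a fixed `completion_rank`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model: each op has an id (its arrival order) and a rank saying when
// its replicated write finishes relative to the other ops.
struct Op {
  int id;
  int completion_rank;
};

// Error log submitted while the op still holds the exclusive phase.
// Ops pass through that phase one at a time, so log order == arrival order.
std::vector<int> log_in_exclusive_phase(const std::vector<Op>& ops) {
  std::vector<int> pg_log;
  for (const Op& op : ops) {
    pg_log.push_back(op.id);  // stands in for submit_error_log()
  }
  return pg_log;
}

// Error log submitted from a continuation chained on completion.
// Completions are concurrent, so log order == completion order.
std::vector<int> log_after_completion(std::vector<Op> ops) {
  std::stable_sort(ops.begin(), ops.end(),
                   [](const Op& a, const Op& b) {
                     return a.completion_rank < b.completion_rank;
                   });
  std::vector<int> pg_log;
  for (const Op& op : ops) {
    pg_log.push_back(op.id);
  }
  return pg_log;
}
```

With two ops where the second one completes first, the first function yields the ids in arrival order and the second yields them reversed, which is the kind of reordering the exclusive phase is meant to rule out.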

Matan-B · 2023-11-08T16:33:58Z

https://pulpito.ceph.com/matan-2023-11-08_11:46:17-crimson-rados-wip-matanb-crimson-do_osd_ops_execute-v3-distro-crimson-smithi/

30/34 passed.
Results look good (!)
The issue was previously seen consistently in the suite and seems to be resolved.

Failures look unrelated and are new since they were previously hidden by the issue resolved in this PR:

7451909 - osd.3 crash SnapTrimObjSubEvent lifetime (https://tracker.ceph.com/issues/63299)
7451911 - osd.1 and osd.2 - BlueStore::omap_get_values (Addressed: 7b8795a)
7451915 - osd.3 crash PG lifetime (Addressed: crimson/osd/pg: extend pg lifetime on snap_trimq iteration #54416)
7451923 - [ FAILED ] TestClsRbd.directory_methods


* '!log_entries.empty()' assert instead of if-case: the log_entries entry is inserted right before.
* 'version != eversion_t()' assert instead of if-case: since op_info.may_write() is true, we should have a non-empty version.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

`submit_error_log()` was returning `version` to be used later in the `failure_func` call to `complete_write()`. Maintain the version returned from `submit_error_log()` in a dedicated map to avoid handling the lifetime of 'version'.

Note: This change is crucial to the following change that will return 'error_fut' separately.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

This change is crucial for the next commits: submit_error_log and failure_func should share the same rep_tid, which will later be shared with the error_log call.

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Matan-B · 2023-11-16T10:04:06Z

Test results look good:
https://pulpito.ceph.com/matan-2023-11-16_08:02:56-crimson-rados-wip-matanb-crimson-testing-11-15-distro-crimson-smithi/

Unrelated:
7460428 - https://tracker.ceph.com/issues/63556
7460429 - https://tracker.ceph.com/issues/62162
7460443 - https://tracker.ceph.com/issues/62740
7460420 - Addressed: #54513

src/crimson/osd/pg.cc

Use chained futurized `send_to_osd()` instead of voided `send_cluster_message()`. Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

```
submit_error_log records the result of an IO into the pg log so that we can return the same error code if the client resends the request. This should only be relevant for logical errors resulting from the target object state -- for example, EEXIST returned on an exclusive create -- because there is application logic built to rely on them.

In classic, the only such site is if the return value from do_osd_ops is negative (or the transaction is empty) -- see PrimaryLogPG::prepare_transaction, specifically where we set update_log_only to true.

We do not want to record space usage errors or errors specific to conditions on the primary OSD such as IO errors -- submit_error_log isn't a catch-all error path.
```

Signed-off-by: Matan Breizman <mbreizma@redhat.com>
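The distinction drawn in this commit message can be sketched as a small predicate. `should_record_error` is a hypothetical helper, not a function in the Ceph tree. Only three placements follow the text above: `EEXIST` is recorded, while `ENOSPC` (space usage) and `EIO` (IO error on the primary) are not. Treating every other negative code as a logical error is an assumption made for illustration, and the empty-transaction case mentioned for classic is not modeled.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical helper: decide whether an op result should be recorded in
// the pg log so that a resent request sees the same error code.
bool should_record_error(int r) {
  if (r >= 0) {
    return false;  // not an error; handled by the normal write path
  }
  switch (-r) {
  case ENOSPC:     // space usage error: not tied to the object's state
    return false;
  case EIO:        // IO error specific to conditions on the primary OSD
    return false;
  default:         // assumption: remaining codes are logical errors,
    return true;   // e.g. EEXIST on an exclusive create
  }
}
```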

Previously, submit_error_log was chained to the future returned by failure_func. Now submit_error_log is called from within do_osd_ops_execute.

Fixes: https://tracker.ceph.com/issues/61651

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Matan-B · 2023-11-20T11:06:24Z

https://pulpito.ceph.com/matan-2023-11-19_17:04:13-crimson-rados-wip-matanb-crimson-testing-11-19-distro-crimson-smithi/

athanatos

There's more simplification to be done, but I'd be ok with doing it in a subsequent PR if this one is passing tests. Up to you.

src/crimson/osd/pg.cc

athanatos · 2023-11-28T23:39:13Z

src/crimson/osd/pg.cc

+    return maybe_rollback_fut.then_interruptible(
+    [error_func_ptr, e, rep_tid, failure_func_ptr] {
+      // record error log
+      return (*error_func_ptr)(e, rep_tid).then(


This appears to be the only actual user of error_func_ptr -- why not simply move the declaration of maybe_submit_error_log here and avoid needing error_func_ptr entirely?

I preferred capturing all of the dependencies here instead of moving them later on.

athanatos · 2023-11-28T23:43:41Z

src/crimson/osd/pg.cc

+        peering_state.complete_write(log_entry_version[rep_tid], last_complete);
+        log_entry_version.erase(rep_tid);
+        logger().debug("do_osd_ops_execute::failure_func write complete,"
+                        " erasing rep_tid {}", rep_tid);


This else branch seems like it should have been moved into do_osd_ops_execute as well. I'd probably factor it into a helper (complete_error_log) and chain it after the error_func_ptr invocation.

Matan-B · 2023-11-29T08:15:11Z

> There's more simplification to be done, but I'd be ok with doing it in a subsequent PR if this one is passing tests. Up to you.

I prefer addressing the cleanups in a subsequent PR so that the main branch no longer has this issue. Thanks!

cyx1231st · 2023-12-06T03:55:48Z

src/crimson/osd/pg.cc

+    return maybe_rollback_fut.then_interruptible(
+    [error_func_ptr, e, rep_tid, failure_func_ptr] {
+      // record error log
+      return (*error_func_ptr)(e, rep_tid).then(


Ack, submit_error_log() is moved from the concurrent wait_repop phase to the exclusive process phase, which should address the ordering issue.

Not encountered since merge yet :)

Matan-B mentioned this pull request Nov 1, 2023

[WIP] crimson/osd: execute submit_error_log as an ExclusivePhase #54248

Closed

14 tasks

github-actions bot added the crimson label Nov 1, 2023

Matan-B force-pushed the wip-matanb-crimson-do_osd_ops_execute-v3 branch from 3d7efa6 to 1aad6da Compare November 1, 2023 10:43

Matan-B force-pushed the wip-matanb-crimson-do_osd_ops_execute-v3 branch 11 times, most recently from 9a7618a to cc0b061 Compare November 6, 2023 14:19

Matan-B requested a review from xxhdx1985126 November 6, 2023 14:25

cyx1231st self-requested a review November 7, 2023 01:14

athanatos requested changes Nov 7, 2023


Matan-B force-pushed the wip-matanb-crimson-do_osd_ops_execute-v3 branch from cc0b061 to b02676b Compare November 7, 2023 11:12

Matan-B requested a review from athanatos November 7, 2023 11:16

Matan-B force-pushed the wip-matanb-crimson-do_osd_ops_execute-v3 branch from b02676b to 73016ca Compare November 7, 2023 16:48

Matan-B commented Nov 7, 2023


Matan-B marked this pull request as ready for review November 7, 2023 16:52

Matan-B requested a review from a team as a code owner November 7, 2023 16:52

Matan-B force-pushed the wip-matanb-crimson-do_osd_ops_execute-v3 branch 2 times, most recently from a1bf4b8 to 4692a09 Compare November 8, 2023 16:21

Matan-B added the wip-matan-testing label Nov 14, 2023

Matan-B added 8 commits November 15, 2023 16:12

crimson/osd: remove do_osd_ops_success_func_t and do_osd_ops_failure_func_t

e5aeade

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

crimson/osd/shard_services: add comment to next_tid initialization

7cd0aa0

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

crimson/osd/osd.cc: handle_update_log_missing* don't decode payload.

bf3845c

Payload is already decoded in IOHandler::read_message (decode_message). Signed-off-by: Matan Breizman <mbreizma@redhat.com>

crimson/osd/pg: add logs around submit_error_log()

e3de7c0

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

crimson/osd/pg: add logs and assert around log_entry_update_waiting_on

049e071

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

crimson/osd/pg: rep_tid as a faillure_func param

54fd676

This change is crucial for the next commits: submit_error_log and failure_func should share the same rep_tid, which will later be shared with the error_log call. Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Matan-B force-pushed the wip-matanb-crimson-do_osd_ops_execute-v3 branch from 740ce28 to 38a6b7c Compare November 16, 2023 10:06

Matan-B added the TESTED label Nov 16, 2023

athanatos reviewed Nov 17, 2023


athanatos reviewed Nov 17, 2023


Matan-B added 3 commits November 19, 2023 09:46

crimson/osd/pg: submit_error_log send messages to osd by order

4bd7b94

Use chained futurized `send_to_osd()` instead of voided `send_cluster_message()`. Signed-off-by: Matan Breizman <mbreizma@redhat.com>

crimson/osd/pg: do_osd_ops_execute assert error type handling

74965cb

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Matan-B removed the TESTED label Nov 19, 2023

Matan-B force-pushed the wip-matanb-crimson-do_osd_ops_execute-v3 branch from 38a6b7c to 2c5fbee Compare November 19, 2023 12:27

crimson/osd/pg: introduce clear_log_entry_maps()

4531290

Signed-off-by: Matan Breizman <mbreizma@redhat.com>

Matan-B force-pushed the wip-matanb-crimson-do_osd_ops_execute-v3 branch from 2c5fbee to 4531290 Compare November 19, 2023 12:31

Matan-B requested a review from athanatos November 19, 2023 12:32

Matan-B added the TESTED label Nov 20, 2023

Matan-B mentioned this pull request Nov 22, 2023

crimson/osd/osd_operations/client_request: recover the head and other necessary objects before proceeding #53306

Merged

14 tasks

athanatos approved these changes Nov 28, 2023


Matan-B merged commit 3d760f1 into ceph:main Nov 29, 2023

Matan-B mentioned this pull request Dec 4, 2023

crimson/osd: submit_error_log cleanup #54765

Merged

14 tasks

cyx1231st reviewed Dec 6, 2023

