crimson/osd: execute submit_error_log as an ExclusivePhase#54287
crimson/osd: execute submit_error_log as an ExclusivePhase#54287
submit_error_log as an ExclusivePhase#54287Conversation
3d7efa6 to
1aad6da
Compare
9a7618a to
cc0b061
Compare
cc0b061 to
b02676b
Compare
|
@athanatos, the call to error_log_fut was moved to do_osd_ops_execute and the original split between do_osd_ops_execute and submitted_fut is reverted. The diff is much cleaner now. |
b02676b to
73016ca
Compare
src/crimson/osd/pg.cc
Outdated
| return (*error_func_ptr)(e, rep_tid).then( | ||
| [failure_func_ptr, e, rep_tid] { | ||
| return PG::do_osd_ops_iertr::make_ready_future<pg_rep_op_fut_t<Ret>>( | ||
| std::move(seastar::now()), | ||
| std::move((*failure_func_ptr)(e, rep_tid)) |
There was a problem hiding this comment.
1/2: This part lets submit_error_log run from within do_osd_ops_execute which resolves the original issue.
There was a problem hiding this comment.
This one is chained after the future returned from submit_transaction, which would otherwise resolve into the submitted and completed futures. This one, I think, as you observe is the one that resolves the bug.
There was a problem hiding this comment.
It seems to me that error_func_ptr could return an eversion_t, pass it into failure_func_ptr, and thereby avoid the log_entry_version map.
If we retain the log_entry_version map, it probably needs to be cleared during interval change.
There was a problem hiding this comment.
It seems to me that error_func_ptr could return an eversion_t, pass it into failure_func_ptr, and thereby avoid the log_entry_version map.
After the last commit, it is no longer possible. (It's still possible in a way, but much less elegant than before)
If we retain the log_entry_version map, it probably needs to be cleared during interval change.
Fixed.
src/crimson/osd/pg.cc
Outdated
| [this, e, failure_func_ptr, error_func_ptr] { | ||
| auto rep_tid = shard_services.get_tid(); | ||
| return (*error_func_ptr)(e, rep_tid) | ||
| .then([failure_func_ptr, e, rep_tid] { | ||
| return (*failure_func_ptr)(e , rep_tid); | ||
| }); |
There was a problem hiding this comment.
2/2: The second submit_error_log call, since it's being run from within do_osd_ops_execute it should also be safe . Even though it's packed inside all_completed_fut below (in case of an error), it will be executed before returning pg_rep_op_fut_t to the caller.
There was a problem hiding this comment.
Hmm, this handler is actually chained after _all_complete_fut and could therefore happen after we leave the exclusive phase. On the other hand, I think this case is actually impossible -- any error here would have to come from the _all_completed_fut returned from submit_transaction, which would have to be an error with the actual transaction submitted to the backing store or somehow with the messages received from peers. I think any error here could be treated as fatal to the OSD.
There was a problem hiding this comment.
Hmm, this handler is actually chained after _all_complete_fut and could therefore happen after we leave the exclusive phase. On the other hand, I think this case is actually impossible -- any error here would have to come from the _all_completed_fut returned from submit_transaction, which would have to be an error with the actual transaction submitted to the backing store or somehow with the messages received from peers. I think any error here could be treated as fatal to the OSD.
This case was possible since there were non-fatal errors that could have been returned as well. I added an assert to verify that we handle only those.
Moreover, I moved the call to error_func_ptr earlier - before actually returning the non-fatal errors. That way, it will be executed during the exclusive phase.
a1bf4b8 to
4692a09
Compare
|
30/34 passed. Failures look unrelated and are new since they were previously hidden by the issue resolved in this PR:
|
…func_t Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Payload is already decoded in IOHandler::read_message (decode_message). Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
* '!log_entries.empty()' assert instead of if-case. log_entries entry is inserted right before. * 'version != eversion_t()' assert instead of if-case. since op_info.may_write() is true, we should have a non-empty version. Signed-off-by: Matan Breizman <mbreizma@redhat.com>
`submit_error_log()` was returning `version` to be used later in
`failure_func` call to `complete_write()`.
Maintain the version returned from `submit_error_log()` in a dedicated map
to avoid handling the lifetime of 'version'.
Note: This change is crucial to the following change that will
return 'error_fut' separately.
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
This change is crucial for the next commits, submit_error_log and failure_func should share the same rep_tid. to be shared later with error_log call Signed-off-by: Matan Breizman <mbreizma@redhat.com>
|
Test results look good: Unrelated: |
740ce28 to
38a6b7c
Compare
Use chained futurized `send_to_osd()` instead of voided `send_cluster_message()`. Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
``` submit_error_log records the result of an IO into the pg log so that we can return the same error code if the client resends the request. This should only be relevant for logical errors resulting from the target object state -- for example, EEXIST returned on an exclusive create -- because there is application logic built to rely on them. In classic, the only such site is if the return value from do_osd_ops is negative (or the transaction is empty) -- see PrimaryLogPG::prepare_transaction, specifically where we set update_log_only to true. We do not want to record space usage errors or errors specific to conditions on the primary OSD such as IO errors -- submit_error_log isn't a catch-all error path. ``` Signed-off-by: Matan Breizman <mbreizma@redhat.com>
Previously, submit_error_log was chained to failure_func returned future. Now submit_error_log is called from within do_osd_ops_execute Fixes: https://tracker.ceph.com/issues/61651 Signed-off-by: Matan Breizman <mbreizma@redhat.com>
38a6b7c to
2c5fbee
Compare
Signed-off-by: Matan Breizman <mbreizma@redhat.com>
2c5fbee to
4531290
Compare
athanatos
left a comment
There was a problem hiding this comment.
There's more simplification to be done, but I'd be ok with doing it in a subsequent PR if this one is passing tests. Up to you.
| return maybe_rollback_fut.then_interruptible( | ||
| [error_func_ptr, e, rep_tid, failure_func_ptr] { | ||
| // record error log | ||
| return (*error_func_ptr)(e, rep_tid).then( |
There was a problem hiding this comment.
This appears to be the only actual user of error_func_ptr -- why not simply move the declaration of maybe_submit_error_log here and avoid needing error_func_ptr entirely?
There was a problem hiding this comment.
I preferred capturing all of the dependencies here instead of moving them later on
| peering_state.complete_write(log_entry_version[rep_tid], last_complete); | ||
| log_entry_version.erase(rep_tid); | ||
| logger().debug("do_osd_ops_execute::failure_func write complete," | ||
| " erasing rep_tid {}", rep_tid); |
There was a problem hiding this comment.
This else branch seems like it should have been moved into do_osd_ops_execute as well. I'd probably factor it into a helper (complete_error_log) and chain it after the error_func_ptr invocation.
I prefer addressing the cleanups in a subsequent PR to let main branch not to have this issue anymore. Thanks! |
| return maybe_rollback_fut.then_interruptible( | ||
| [error_func_ptr, e, rep_tid, failure_func_ptr] { | ||
| // record error log | ||
| return (*error_func_ptr)(e, rep_tid).then( |
There was a problem hiding this comment.
Ack, submit_error_log() is moved from the concurrent wait_repop phase to the exclusive process phase, which should address the ordering issue.
There was a problem hiding this comment.
Not encountered since merge yet :)
Blocked by: #54040Introduced: #39772
Fixes: https://tracker.ceph.com/issues/61651
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this pleasejenkins test classic perfjenkins test crimson perfjenkins test signedjenkins test make checkjenkins test make check arm64jenkins test submodulesjenkins test dashboardjenkins test dashboard cephadmjenkins test apijenkins test docsjenkins render docsjenkins test ceph-volume alljenkins test ceph-volume toxjenkins test windows