crimson/osd/pg_backend: needn't check if os.exist #63144
Conversation
Matan-B
left a comment
There was a problem hiding this comment.
ceph_test_cls_cmpomap is being run from suites/rgw/verify/tasks/cls.yaml and therefore was never tested with Crimson.
Can we add ceph_test_cls_cmpomap either into makecheck unit tests or alternatively into our suite so we would avoid regressions?
| object_stat_sum_t& delta_stats) const | ||
| { | ||
| if (!os.exists || os.oi.is_whiteout()) { | ||
| logger().debug("{}: object does not exist: {}", __func__, os.oi.soid); |
There was a problem hiding this comment.
Can you please explain why is it expected to operate on non-exsiting objects in the commit messsage?
There was a problem hiding this comment.
since the test case expect false value return, not error code.
What do you mean by "false value"? Error codes are used internally - the raw error code value (int) is returned as an answer to the client eventually.
Looking at the cmp_set_vals_noexist_str test:
EXPECT_EQ(do_cmp_set_vals(oid, Mode::String, Op::EQ, {{"eq", value}}), 0);
0 is expected while Crimson returns -2.
I'm not sure I understand why Classic is returning 0 here since the object store actually returns -2 as well in omap_get_values:
if (!o || !o->exists) {
r = -ENOENT;
goto out;
}
I think that either:
- Classic should also return -2 and the test itself should be fixed.
CEPH_OSD_OP_FLAG_FAILOKis set, which hides the error and returns 0 instead - (why?)
Going back to my previous comment:
Can you please explain why is it expected to operate on non-existing objects in the commit message?
We should not change the behavior because the tests expects so as the test itself might possibly be wrong.
Lastly, I think that my comment regarding adding ceph_test_cls_cmpomap into makecheck or the crimson suite was missed. What do you think?
There was a problem hiding this comment.
0 is returned (when it shouldn't) for non-existing objects for classic:
if (oi.is_omap()) {
osd->store->omap_get_values(ch, ghobject_t(soid), keys_to_get, &out);
} // else return empty omap entries
...
The non-existing object is not an omap so no error is returned.
For Crimson:
if (oi.is_omap()) {
return store->omap_get_values(coll, ghobject_t{oi.soid}, keys_to_get);
} else {
return crimson::ct_error::enodata::make();
}
and then:
}).handle_error_interruptible(
crimson::ct_error::enodata::handle([&osd_op] {
..
osd_op.rval = 0;
return ll_read_errorator::now();
}),
We actually return enodata since this object is not an omap - but then we change the error code to zero (probably to align with classic).
I actually think we should return ENOENT here in both cases and change the test itself.
What do you think?
There was a problem hiding this comment.
@Matan-B Can't change the behavior without auditing users first. OMAPGETVALS, OMAPGETHEADER, and OMAPGETKEYS all seem to behave the same way -- empty, but successful return.
Short term, I'd suggest matching classic's behavior since this is a public interface other components already rely on.
Longer term, I do agree that the behavior is inconsistent with most of the other read ops, but changing the behavior would mean auditing rgw and rbd to make sure it won't break anything.
There was a problem hiding this comment.
Ah, so until the longer term solution is there we could do something like this:
if (!os.exists || os.oi.is_whiteout()) {
logger().debug("{}: object does not exist: {} returning empty result", __func__, os.oi.soid);
// Although an error should be expected here since the object doesn't exist,
// we want to match classic's behavior as other components already rely on it.
// Return an empty, but successful return instead.
//return crimson::ct_error::enoent::make();
return ll_read_errorator::now();
}
Instead of removing the check completely, we could keep it and explain what's going on to avoid future confusion here. Since the error "removal" is not trivial.
@athanatos, @liu-chunmei - What do you think?
As suggested earlier, let's also add this to our regular testing.
Matan-B
left a comment
There was a problem hiding this comment.
I think that Crimson is actually might be correct here.
| object_stat_sum_t& delta_stats) const | ||
| { | ||
| if (!os.exists || os.oi.is_whiteout()) { | ||
| logger().debug("{}: object does not exist: {}", __func__, os.oi.soid); |
There was a problem hiding this comment.
0 is returned (when it shouldn't) for non-existing objects for classic:
if (oi.is_omap()) {
osd->store->omap_get_values(ch, ghobject_t(soid), keys_to_get, &out);
} // else return empty omap entries
...
The non-existing object is not an omap so no error is returned.
For Crimson:
if (oi.is_omap()) {
return store->omap_get_values(coll, ghobject_t{oi.soid}, keys_to_get);
} else {
return crimson::ct_error::enodata::make();
}
and then:
}).handle_error_interruptible(
crimson::ct_error::enodata::handle([&osd_op] {
..
osd_op.rval = 0;
return ll_read_errorator::now();
}),
We actually return enodata since this object is not an omap - but then we change the error code to zero (probably to align with classic).
I actually think we should return ENOENT here in both cases and change the test itself.
What do you think?
|
jenkins test make check |
|
jenkins test make check arm64 |
|
jenkins retest this please |
|
@Matan-B and @athanatos remove that 2 line check and the remaining logic just like matan said as follows is reasonable for me. and then: |
080de9a to
88c874f
Compare
|
@Matan-B I remember we return osd_op.rval = 0; just to meet the test requirements to keep the same as classic return. |
Matan-B
left a comment
There was a problem hiding this comment.
LGTM, two comments:
The test will pass now due to:
}).handle_error_interruptible(
crimson::ct_error::enodata::handle([&osd_op] {
..
osd_op.rval = 0;
return ll_read_errorator::now();
}),
Can you please add a comment above:
// Although an error should be expected here since the object doesn't exist,
// we want to match classic's behavior as clients possibly rely on it.
// Return an empty, but successful return instead.
osd_op.rval = 0;
Lastly, can we please add that to the crimson-rados suite?
Thanks!
omap_get_vals_by_keys fix ./bin/ceph_test_cls_cmpomap error for noexist tests since the test case expect false value return, not error code. Signed-off-by: Chunmei Liu <chunmei.liu@ibm.com>
Signed-off-by: Chunmei Liu <chunmei.liu@ibm.com>
88c874f to
4461f23
Compare
done, add test_cls_cmpomap.sh to basic/tasks/cls.yaml |
|
@Matan-B the teuthology test result is at : https://pulpito.ceph.com/liucm-2025-07-02_21:32:59-crimson-rados-wip-liucm-omap-noexist-crimson-only-distro-crimson-debug-smithi/ |
|
for omap_get_vals_by_keys
fix ./bin/ceph_test_cls_cmpomap error for noexist tests
https://tracker.ceph.com/issues/71225
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
xbetween the brackets:[x]. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins test classic perfJenkins Job | Jenkins Job Definitionjenkins test crimson perfJenkins Job | Jenkins Job Definitionjenkins test signedJenkins Job | Jenkins Job Definitionjenkins test make checkJenkins Job | Jenkins Job Definitionjenkins test make check arm64Jenkins Job | Jenkins Job Definitionjenkins test submodulesJenkins Job | Jenkins Job Definitionjenkins test dashboardJenkins Job | Jenkins Job Definitionjenkins test dashboard cephadmJenkins Job | Jenkins Job Definitionjenkins test apiJenkins Job | Jenkins Job Definitionjenkins test docsReadTheDocs | Github Workflow Definitionjenkins test ceph-volume allJenkins Jobs | Jenkins Jobs Definitionjenkins test windowsJenkins Job | Jenkins Job Definitionjenkins test rook e2eJenkins Job | Jenkins Job Definition