Skip to content

crimson/osd/osd_operations/client_requests: we don't support rados locator keys#56025

Merged
Matan-B merged 4 commits intoceph:mainfrom
xxhdx1985126:wip-seastore-onode-loc-key
Apr 21, 2024
Merged

crimson/osd/osd_operations/client_requests: we don't support rados locator keys#56025
Matan-B merged 4 commits intoceph:mainfrom
xxhdx1985126:wip-seastore-onode-loc-key

Conversation

@xxhdx1985126
Copy link
Copy Markdown
Member

@xxhdx1985126 xxhdx1985126 commented Mar 7, 2024

Fixes: https://tracker.ceph.com/issues/64782
Signed-off-by: Xuehan Xu xuxuehan@qianxin.com

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@xxhdx1985126 xxhdx1985126 requested a review from cyx1231st March 7, 2024 08:27
@xxhdx1985126 xxhdx1985126 requested a review from a team as a code owner March 7, 2024 08:27
@xxhdx1985126 xxhdx1985126 force-pushed the wip-seastore-onode-loc-key branch from 5b5f330 to c35b95b Compare March 7, 2024 08:42
@cyx1231st
Copy link
Copy Markdown
Member

I'm not quite following the purpose of the object-locator-key, and not sure why it was initially missing during the design of onode key for Crimson (is it intentional or not).

From https://tracker.ceph.com/projects/ceph/wiki/Rados_namespaces, is it an optional abstraction layer between namespace and oid? Should Crimson support it as well?

@cyx1231st cyx1231st requested a review from athanatos March 8, 2024 09:09
@xxhdx1985126
Copy link
Copy Markdown
Member Author

I'm not quite following the purpose of the object-locator-key, and not sure why it was initially missing during the design of onode key for Crimson (is it intentional or not).

From https://tracker.ceph.com/projects/ceph/wiki/Rados_namespaces, is it an optional abstraction layer between namespace and oid? Should Crimson support it as well?

RADOS maps objects to pgs by the objects' locator keys, by default, the key is empty and the object name is treated as the key.

@cyx1231st
Copy link
Copy Markdown
Member

RADOS maps objects to pgs by the objects' locator keys, by default, the key is empty and the object name is treated as the key.

It seems to me that we can replace/reuse the oid field by the locator key if the locator is valid, and store the oid in the onode value in that case. So there is no need to add another string field in the onode key?

@xxhdx1985126
Copy link
Copy Markdown
Member Author

RADOS maps objects to pgs by the objects' locator keys, by default, the key is empty and the object name is treated as the key.

It seems to me that we can replace/reuse the oid field by the locator key if the locator is valid, and store the oid in the onode value in that case. So there is no need to add another string field in the onode key?

Yeah, will change it.

@xxhdx1985126 xxhdx1985126 force-pushed the wip-seastore-onode-loc-key branch from c35b95b to 6eb0778 Compare March 14, 2024 06:24
@xxhdx1985126
Copy link
Copy Markdown
Member Author

@cyx1231st Please take a look again, thanks:-)

Copy link
Copy Markdown
Member

@cyx1231st cyx1231st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial comments.

}
string_view_masked_t key_masked() const {
// TODO(cross-node string dedup)
return oid_masked();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem that we need both key() and oid() here, and the same to some other places around the onode index changes.

IIUC with this PR, the oid field of the onode tree index is replaced by the concept of effective_key of hobject_t, and there is no need to distingush key and oid from onode indexing perspective.

And if that can be aligned, how about splitting out the renaming efforts (oid -> effective_key) as a separate commit?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in the onode deltas, I think we need to remenber both oid and key, which is why I kept the oid() method

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in the onode deltas, I think we need to remenber both oid and key

Can you elaborate more about why need to remember both?

gen_t gen;
ceph::decode(gen, delta);
return key_hobj_t(ghobject_t(
shard_id_t(shard), pool, crush, nspace, oid, snap, gen));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be fine to adjust the constructor of ghobject_t directly. IIUC it is the only user of the specific constructor.

onode->create_default_layout(trans, hoid.hobj.oid.name);
} else {
onode->create_default_layout(trans, key);
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it correct that the onode value needs to store the oid only if the locator is valid (!key.empty()), otherwise, there is no need to store or even reserve the value space?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it correct that the onode value needs to store the oid only if the locator is valid (!key.empty()), otherwise, there is no need to store or even reserve the value space?

Yes, but I think the size of onode_layout_t has to be a constant value, is this right?

Copy link
Copy Markdown
Member

@cyx1231st cyx1231st Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the reason why we were not aware of the issue is because that it's rare key.empty() is false, so we mostly don't need the space to store the oid in the value.

Yes, but I think the size of onode_layout_t has to be a constant value, is this right?

I think we might be able to store it in xattr omap with a special internal-only label. So oid becomes really optional. What do you think?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might be able to store it in xattr omap with a special internal-only label. So oid becomes really optional. What do you think?

I think we need a way to determine whether the "oid" field in the key is actually the oid or locator key, at least when listing objects, so that we can reconstruct the ghobject_t correctly. If we put oid in xattr, we'll need to search the omap tree for each of the objects that are listed. This seems to be a potential performance issue.

@cyx1231st
Copy link
Copy Markdown
Member

From @athanatos's feedback, there is no user of the locator field, probably we should:

  1. return -ENOTSUPP for any incoming IO with a locator set in crimson-osd
  2. skip this test for crimson

Quoted:

Unfortunately, supporting locator correctly would require effective-key in addition to oid -> (shard, pool, reversed-hash, ns, effective-key, oid, snap, gen). However, there are no users of locator that I'm aware of at all. Once upon a time, it was used to ensure that multipart-upload parts would end up in the same PG so that they could be clone_range'd into the same object via librados, but that's not how multipart upload works anymore and we no longer support clone_range via librados.

We probably ought to remove support for it from librados entirely.

@xxhdx1985126
Copy link
Copy Markdown
Member Author

From @athanatos's feedback, there is no user of the locator field, probably we should:

  1. return -ENOTSUPP for any incoming IO with a locator set in crimson-osd
  2. skip this test for crimson

Quoted:

Unfortunately, supporting locator correctly would require effective-key in addition to oid -> (shard, pool, reversed-hash, ns, effective-key, oid, snap, gen). However, there are no users of locator that I'm aware of at all. Once upon a time, it was used to ensure that multipart-upload parts would end up in the same PG so that they could be clone_range'd into the same object via librados, but that's not how multipart upload works anymore and we no longer support clone_range via librados.
We probably ought to remove support for it from librados entirely.

Yeah, will do.

@xxhdx1985126 xxhdx1985126 force-pushed the wip-seastore-onode-loc-key branch from 6eb0778 to 0f71bf1 Compare March 18, 2024 06:45
@xxhdx1985126 xxhdx1985126 changed the title crimson/os/seastore/onode_manager: store objects' location key in the onode tree crimson/osd/osd_operations/client_requests: we don't support rados locator keys Mar 18, 2024
@xxhdx1985126
Copy link
Copy Markdown
Member Author

@cyx1231st Done, please take a look again, thanks.

// TODO: gate the crosscore sending
return get_foreign_connection().send_with_throttling(std::move(reply));
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a meanless change?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, reply_op_error() returns deduced type, so it has to be defined before used. So I moved its definition before with_pg_int()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think that returning seastar::future<> is more readable and avoids the issue you encountered.

@xxhdx1985126 xxhdx1985126 force-pushed the wip-seastore-onode-loc-key branch from 0f71bf1 to 6be84df Compare March 18, 2024 09:12
@xxhdx1985126
Copy link
Copy Markdown
Member Author

jenkins test make check

@Matan-B
Copy link
Copy Markdown
Contributor

Matan-B commented Apr 2, 2024

@xxhdx1985126 xxhdx1985126 force-pushed the wip-seastore-onode-loc-key branch from 2c1975e to 776c2b3 Compare April 7, 2024 06:29
@xxhdx1985126
Copy link
Copy Markdown
Member Author

jenkins test make check

@xxhdx1985126
Copy link
Copy Markdown
Member Author

The following tests FAILED:
22 - run-tox-cephadm (Failed)

jenkins test make check

@xxhdx1985126
Copy link
Copy Markdown
Member Author

jenkins test make check

@xxhdx1985126
Copy link
Copy Markdown
Member Author

@Matan-B I think the issue should have been fixed, please test it again, thanks:-)

@Matan-B
Copy link
Copy Markdown
Contributor

Matan-B commented Apr 10, 2024

@Matan-B I think the issue should have been fixed, please test it again, thanks:-)

https://shaman.ceph.com/builds/ceph/wip-matanb-crimson-testing-april-17

Copy link
Copy Markdown
Contributor

@Matan-B Matan-B left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://pulpito.ceph.com/matan-2024-04-21_07:41:30-crimson-rados-wip-matanb-crimson-only-testing-april-17-distro-crimson-smithi/7666494

2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:=================================== FAILURES ===================================
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:____________________________ TestIoctx.test_locator ____________________________

@xxhdx1985126
Copy link
Copy Markdown
Member Author

2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:=================================== FAILURES ===================================
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:____________________________ TestIoctx.test_locator ____________________________
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:self = <test_rados.TestIoctx object at 0x7f170bb6b8b0>
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: def test_locator(self):
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: self.ioctx.set_locator_key("bar")
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:> self.ioctx.write('foo', b'contents1')
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:../../../clone.client.0/src/test/pybind/test_rados.py:719:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:

Seems that the test branch lacks this PR's commit b3754ac

@xxhdx1985126
Copy link
Copy Markdown
Member Author

jenkins test make check

@Matan-B
Copy link
Copy Markdown
Contributor

Matan-B commented Apr 21, 2024

2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:=================================== FAILURES ===================================
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:____________________________ TestIoctx.test_locator ____________________________
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:self = <test_rados.TestIoctx object at 0x7f170bb6b8b0>
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: def test_locator(self):
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: self.ioctx.set_locator_key("bar")
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:> self.ioctx.write('foo', b'contents1')
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:../../../clone.client.0/src/test/pybind/test_rados.py:719:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:

Seems that the test branch lacks this PR's commit b3754ac

I think that it's there, please see:
https://github.com/ceph/ceph-ci/commits/wip-matanb-crimson-only-testing-april-17/

@xxhdx1985126
Copy link
Copy Markdown
Member Author

2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:=================================== FAILURES ===================================
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:____________________________ TestIoctx.test_locator ____________________________
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:self = <test_rados.TestIoctx object at 0x7f170bb6b8b0>
2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: def test_locator(self):
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: self.ioctx.set_locator_key("bar")
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:> self.ioctx.write('foo', b'contents1')
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:../../../clone.client.0/src/test/pybind/test_rados.py:719:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:

Seems that the test branch lacks this PR's commit b3754ac

I think that it's there, please see: https://github.com/ceph/ceph-ci/commits/wip-matanb-crimson-only-testing-april-17/

I don't quite understand, but, looking at the error log above, the code doesn't have the change of commit b3754ac

@Matan-B
Copy link
Copy Markdown
Contributor

Matan-B commented Apr 21, 2024

Seems that the test branch lacks this PR's commit b3754ac

I think that it's there, please see: https://github.com/ceph/ceph-ci/commits/wip-matanb-crimson-only-testing-april-17/

I don't quite understand, but, looking at the error log above, the code doesn't have the change of commit b3754ac

https://qa-proxy.ceph.com/teuthology/matan-2024-04-21_07:41:30-crimson-rados-wip-matanb-crimson-only-testing-april-17-distro-crimson-smithi/7666494/teuthology.log

From the log:

2024-04-21T07:53:48.724 INFO:teuthology.packaging:sha1: b52414e0249b7db43b05804d90a0d3f7a8047037

The sha1 is represents this branch: https://github.com/ceph/ceph-ci/commits/wip-matanb-crimson-only-testing-april-17/
which includes ceph/ceph-ci@1e8f16a (cherry-pick of b3754ac).

@xxhdx1985126
Copy link
Copy Markdown
Member Author

2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: def test_locator(self):
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: self.ioctx.set_locator_key("bar")
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:> self.ioctx.write('foo', b'contents1')

I know, but the code observed (shown below) in the log is wrong, maybe later commits overwrote this part of the code:

2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: def test_locator(self):
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: self.ioctx.set_locator_key("bar")
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:> self.ioctx.write('foo', b'contents1')

@Matan-B
Copy link
Copy Markdown
Contributor

Matan-B commented Apr 21, 2024

2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: def test_locator(self):
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: self.ioctx.set_locator_key("bar")
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:> self.ioctx.write('foo', b'contents1')

I know, but the code observed (shown below) in the log is wrong, maybe later commits overwrote this part of the code:

2024-04-21T08:04:59.490 INFO:tasks.workunit.client.0.smithi186.stdout:
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: def test_locator(self):
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout: self.ioctx.set_locator_key("bar")
2024-04-21T08:04:59.491 INFO:tasks.workunit.client.0.smithi186.stdout:> self.ioctx.write('foo', b'contents1')
2024-04-21T08:00:18.769 INFO:tasks.workunit.client.0.smithi186.stderr:+ python3 -m pytest -v /home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/../../../src/test/pybind/test_rados.py -m 'not (wait or tier or ec or bench or stats)'

Found the issue, I was testing against main and I should have tested against the PR as a suite. Retesting.

https://pulpito.ceph.com/matan-2024-04-21_12:49:32-crimson-rados-wip-matanb-crimson-only-testing-april-17-distro-crimson-smithi/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants