
cephadm: Add chown to unit.run for adopted simple OSDs #34703

Merged
sebastian-philipp merged 2 commits into ceph:master from SUSE:wip-fix-45129 on May 5, 2020

Conversation

@tserong (Member) commented Apr 23, 2020

We're not using `ceph-volume simple activate` anymore, so we need to add `chown ceph:ceph` for each of block, block.db and block.wal to ensure old-style simple OSDs are able to start.

This PR also addresses a couple of the comments in #34565, but I didn't change it to loop over all devices when looking at `ceph.type` in `check_offline_lvm_osd()` - based on the `ceph-volume lvm prepare` code, there may be several different types, but finding one that's 'block' indicates bluestore, and finding one that's 'data' indicates filestore.
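The ceph.type reasoning above can be sketched roughly as follows. This is a hypothetical helper, not cephadm's actual `check_offline_lvm_osd()` code; the `lv_tags_list` input shape is illustrative only:

```python
# Sketch of the objectstore inference described above: an OSD's logical
# volumes may carry several ceph.type values (block, db, wal, data, ...),
# but seeing 'block' is enough to conclude bluestore, and seeing 'data'
# is enough to conclude filestore, without looping over every device.
def infer_objectstore(lv_tags_list):
    """lv_tags_list: one tag dict per LV, e.g. [{'ceph.type': 'block'}]."""
    types = {tags.get('ceph.type') for tags in lv_tags_list}
    if 'block' in types:
        return 'bluestore'
    if 'data' in types:
        return 'filestore'
    return None
```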

Fixes: https://tracker.ceph.com/issues/45129
Signed-off-by: Tim Serong <tserong@suse.com>

@tserong tserong requested a review from a team as a code owner April 23, 2020 07:25
@tserong (Member, Author) commented Apr 23, 2020

Just to make the output of this change clear: it injects, for example, the following three lines at the start of unit.run:

[ -L /var/lib/ceph/f6cf67cc-dcfa-3de8-a33f-cab60875412f/osd.10/block ] && chown 167:167 /var/lib/ceph/f6cf67cc-dcfa-3de8-a33f-cab60875412f/osd.10/block
[ -L /var/lib/ceph/f6cf67cc-dcfa-3de8-a33f-cab60875412f/osd.10/block.db ] && chown 167:167 /var/lib/ceph/f6cf67cc-dcfa-3de8-a33f-cab60875412f/osd.10/block.db
[ -L /var/lib/ceph/f6cf67cc-dcfa-3de8-a33f-cab60875412f/osd.10/block.wal ] && chown 167:167 /var/lib/ceph/f6cf67cc-dcfa-3de8-a33f-cab60875412f/osd.10/block.wal

These lines are only added for simple OSDs, not for LVM OSDs.
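A minimal sketch of how such guarded lines could be generated (an illustrative helper, not cephadm's actual code; 167:167 is the ceph uid/gid on the host in the example above):

```python
def simple_osd_chown_lines(data_dir, uid=167, gid=167):
    """Build the guarded chown lines injected at the start of unit.run.
    Each line chowns a device symlink only if that symlink exists."""
    return [
        f'[ -L {data_dir}/{dev} ] && chown {uid}:{gid} {data_dir}/{dev}'
        for dev in ('block', 'block.db', 'block.wal')
    ]
```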

@mgfritch (Contributor) left a comment

If possible, I think it would be nice to avoid adding the osd_simple bool to deploy_daemon_units ...

otherwise, minus the mypy error, lgtm!

@tserong (Member, Author) commented Apr 24, 2020

If possible, I think it would be nice to avoid adding the osd_simple bool to deploy_daemon_units ...

Now that you mention it, it's probably safe to always have the chowns in there, as they happen before the lvm activate step. If that's true, we can ditch the osd_simple bool. I'll do a little further testing.

@tserong (Member, Author) commented Apr 24, 2020

Confirmed: chowning these things doesn't have any adverse effect on LVM OSDs, so we could get rid of the osd_simple flag. OTOH, that would mean all OSDs would always have these chown lines, even newly created ones (i.e. not only adopted ones), which, while harmless, is arguably slightly messy.

@sebastian-philipp, what do you think?

@sebastian-philipp (Contributor) commented:

Right. Not having to adjust the logic here makes things simpler.

@jan--f do you have anything against unconditionally doing the chown?

@jan--f (Contributor) left a comment

👍

tserong added 2 commits April 28, 2020 13:36
We're not using `ceph-volume simple activate` anymore, so we need to
add `chown ceph:ceph` for each of block, block.db and block.wal
to ensure old-style simple OSDs are able to start.

Fixes: https://tracker.ceph.com/issues/45129
Signed-off-by: Tim Serong <tserong@suse.com>
This addresses a couple of the comments in
ceph#34565

Signed-off-by: Tim Serong <tserong@suse.com>
@tserong (Member, Author) commented Apr 28, 2020

OK, I've removed the osd_simple flag, tweaked the comments slightly, and flipped the `[ ! -L ] || chown` logic.
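The flip matters for the script's exit status: `[ -L path ] && chown ...` exits non-zero when the symlink is absent (the failed test short-circuits the line), whereas `[ ! -L path ] || chown ...` exits zero whether the link exists or not. A sketch of the flipped form (hypothetical helper, not the actual patch):

```python
def guarded_chown_line(path, uid=167, gid=167):
    # '[ ! -L p ] || cmd' succeeds both when the symlink is missing
    # (the negated test passes) and when it exists (cmd runs), so a
    # missing block.wal can't fail the unit script.
    return f'[ ! -L {path} ] || chown {uid}:{gid} {path}'
```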

@mgfritch (Contributor) left a comment

lgtm!

@sebastian-philipp (Contributor) commented May 4, 2020

[image: screenshot of a test run]

@mgfritch any update on this run?

@mgfritch (Contributor) commented May 5, 2020

@sebastian-philipp sebastian-philipp merged commit 3924fd8 into ceph:master May 5, 2020
@tserong tserong deleted the wip-fix-45129 branch May 5, 2020 10:30
tserong added a commit to SUSE/ceph that referenced this pull request Aug 5, 2020
This changes the logic in deploy_daemon_units() to add either `chown` calls
for simple (ceph-disk style) OSDs, or to add `ceph-volume lvm activate` calls
for LVM OSDs, rather than always adding both.  When I was working on
ceph#34703, I'd originally added an "osd_simple"
flag to figure out what type of OSD was being adopted/deployed, but passing
that around was kinda ugly, so was removed from that PR.  This time around
I'm checking if /etc/ceph/osd/$OSD_ID-$OSD_FSID.json.adopted-by-cephadm
exists, which seems pretty safe IMO.  My only concern with this method is:
what happens if someone adopts a simple OSD, then later wants to migrate it
to LVM.  Presumably that's a destroy and recreate, keeping the same OSD ID?
If that's true, then the JSON file probably still exists, so the subsequent
create will do the wrong thing, i.e. will add `chown` calls, not `ceph-volume
lvm activate` calls.  Any/all feedback appreciated...

Fixes: https://tracker.ceph.com/issues/46833
Signed-off-by: Tim Serong <tserong@suse.com>
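The marker check described in that commit message could look roughly like this. A hypothetical sketch: the `base` parameter is added here for testability and is not part of the actual code:

```python
import os

def is_adopted_simple_osd(osd_id, osd_fsid, base='/etc/ceph/osd'):
    """True if the JSON marker left behind when cephadm adopted a
    ceph-disk style (simple) OSD still exists for this id/fsid pair."""
    marker = os.path.join(base, f'{osd_id}-{osd_fsid}.json.adopted-by-cephadm')
    return os.path.exists(marker)
```

Note the concern raised above: because the marker outlives the OSD's on-disk layout, a simple OSD later recreated as LVM under the same ID could still be misclassified by this check.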
TySheehan pushed a commit to TySheehan/ceph that referenced this pull request Aug 5, 2020 (same commit message as above)
AmnonHanuhov pushed a commit to AmnonHanuhov/ceph that referenced this pull request Aug 10, 2020 (same commit message as above)
sebastian-philipp pushed a commit to sebastian-philipp/ceph that referenced this pull request Aug 21, 2020 (same commit message as above; cherry picked from commit 8112949)
ideepika pushed a commit to ceph/ceph-ci that referenced this pull request Sep 3, 2020 (same commit message as above)

5 participants