debian: recursively adjust permissions of /var/lib/ceph/crash#55917
debian: recursively adjust permissions of /var/lib/ceph/crash#55917
Conversation
Ceph has a postinst hook that sets the ownership of '/var/lib/ceph/*' to ceph:ceph (in our case), but misses out on the contents of '/var/lib/ceph/crash'. This patch therefore also recursively updates the permissions of '/var/lib/ceph/crash'. The change was also proposed upstream [0]. [0]: ceph/ceph#55917 Signed-off-by: Max Carrara <m.carrara@proxmox.com> Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Ceph has a postinst hook that sets the ownership of '/var/lib/ceph/*' to ceph:ceph (in our case), but misses out on the contents of '/var/lib/ceph/crash'. This patch therefore also recursively updates the permissions of '/var/lib/ceph/crash'. The change was also proposed upstream [0]. [0]: ceph/ceph#55917 Signed-off-by: Max Carrara <m.carrara@proxmox.com>
Ceph has a postinst hook that sets the ownership of '/var/lib/ceph/*' to ceph:ceph (in our case), but misses out on the contents of '/var/lib/ceph/crash'. This patch therefore also recursively updates the permissions of '/var/lib/ceph/crash'. The change was also proposed upstream [0]. [0]: ceph/ceph#55917 Signed-off-by: Max Carrara <m.carrara@proxmox.com> (cherry picked from commit d52158f) Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
|
@tchaikov could this please be backported to at least Reef to be fixed with the next point release? |
I can also provide backports for Reef and Quincy if you don't mind - either once this gets merged or whenever you wish. |
@frittentheke hi Christian, i think so. since https://tracker.ceph.com/issues/64548 includes Reef in its "Backport" field. and sorry for the latency, gentlemen. i will try to squeeze more time for a proper review asap.
better off waiting until this PR gets merged, so that we can reference it with the correct sha1 in the backport PR. |
Thanks @tchaikov for the quick update! While we certainly adjusted this on our clusters already, I am just really afraid this causes less crash reports to come in from other installs out there. Less crash reports, means potentially more undiscovered bugs. Selfish me 🙈 |
| [ -x /sbin/start ] && start ceph-all || : | ||
|
|
||
| # adjust file and directory permissions | ||
| for DIR in /var/lib/ceph/* ; do |
There was a problem hiding this comment.
Why not just add the missing /var/lib/ceph/crash folder here? The new code treats this folder individually.
There was a problem hiding this comment.
IIUC, what we need to do is to fix the permissions of the directories listed by debian/ceph-base.dirs, right? if that's the case, i concur with @frittentheke , that probably we can just add var/lib/ceph/crash here, like
for DIR in /var/lib/ceph/* /var/lib/ceph/crash/posted; do
if ! dpkg-statoverride --list $DIR >/dev/null
then
chown $SERVER_USER:$SERVER_GROUP $DIR
fi
doneplease note, i am listing /var/lib/ceph/crash/posted here, as /var/lib/ceph/crash has been covered by /var/lib/ceph/*.
There was a problem hiding this comment.
Why not just add the missing
/var/lib/ceph/crashfolder here? The new code treats this folder individually.
That's actually a very fair point. In some iteration of this patch I ended up using the find+xargs approach, as that seemed just more logical at the time. (IIRC it was a globbing thing first that then became redundant anyway; the find+xargs combo remained anyway.)
for DIR in /var/lib/ceph/* /var/lib/ceph/crash/posted; do if ! dpkg-statoverride --list $DIR >/dev/null then chown $SERVER_USER:$SERVER_GROUP $DIR fi doneplease note, i am listing
/var/lib/ceph/crash/postedhere, as/var/lib/ceph/crashhas been covered by/var/lib/ceph/*.
While this does obviously chown the crash dir, it doesn't update the already existing entries in crash/posted that might've been created as root prior to #48713. That's why I chose to treat /var/lib/ceph/crash/ separately.
debian/ceph-base.postinst
Outdated
|
|
||
| # adjust file and directory permissions | ||
| find /var/lib/ceph -mindepth 1 -maxdepth 1 -print0 \ | ||
| | xargs -0 -I '{}' sh -c "${PERM_COMMAND}" |
There was a problem hiding this comment.
Why restrict find to maxdepth 1 here and not let it recurse to cover all the files?
Or maybe do a chown -r to ensure all dirs and files below are owned by cheph:ceph?
This avoids the need to deal with /var/lib/ceph/crash in yet another find + xargs.
There was a problem hiding this comment.
I wasn't 100% sure what's appropriate here - I simply wanted to preserve the original behavior, as I wasn't certain whether something deep inside /var/lib/ceph would perhaps require different ownership (at some point).
|
@tchaikov sorry to nag about this again, but I just noticed 18.2.3 QA phase already started (https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/KRC33HNRBCZC5Z2GUDN5DQO43I2RJBJJ/). This PR is "just" about fixing some file permissions and we are likely discussion implementations styles. |
| [ -x /sbin/start ] && start ceph-all || : | ||
|
|
||
| # adjust file and directory permissions | ||
| for DIR in /var/lib/ceph/* ; do |
There was a problem hiding this comment.
IIUC, what we need to do is to fix the permissions of the directories listed by debian/ceph-base.dirs, right? if that's the case, i concur with @frittentheke , that probably we can just add var/lib/ceph/crash here, like
for DIR in /var/lib/ceph/* /var/lib/ceph/crash/posted; do
if ! dpkg-statoverride --list $DIR >/dev/null
then
chown $SERVER_USER:$SERVER_GROUP $DIR
fi
doneplease note, i am listing /var/lib/ceph/crash/posted here, as /var/lib/ceph/crash has been covered by /var/lib/ceph/*.
@frittentheke my apologies. indeed. |
No need @tchaikov ... I am still just a selfish dude looking for other installations' crash reports to make my Ceph installations more stable ;-) Speaking of Reef ... any remote chance this change can still make it into 18.2.3 ? |
|
I should actually add the milestone to the Reef backport when it's ready. Instead, I've added the "needs-reef-backport" label. |
|
@frittentheke @tchaikov Thanks for the reviews, much appreciated! I left some more comments inline. What's now obvious to me is that the original loop can just stay - the However, I'd personally suggest to do something like I'll update the PR accordingly, if you don't object. |
FWIW, I like the idea to "ensure" (and potentially fix) ownership also for existing files. |
|
@Aequitosh @tchaikov could I nag you once more to finalize the discussion on how to implement the |
@Aequitosh hi Max, please go ahead. i didn't take the "existing files" use case into consideration. but if that case exists, then, yeah, i concur with you. |
A rather recent PR made ceph-crash run as "ceph" user instead of root [0]. However, because /var/lib/ceph/crash/posted belongs to root, ceph-crash cannot actually post any crash logs now. This commit fixes this by recursively updating the permissions of '/var/lib/ceph/crash', which ensures that all files and directories used by 'ceph-crash.service' are actually owned by the user configured for Ceph. This also accounts for existing installations. Additionally, quote interpolated variables and use curly braces [1]. [0]: ceph#48713 [1]: https://www.shellcheck.net/wiki/SC2086 Fixes: https://tracker.ceph.com/issues/64548 Signed-off-by: Max Carrara <m.carrara@proxmox.com>
76df6f5 to
97a1ec4
Compare
|
Updated the PR. See the commit message for details. Basically did as I described, but also quoted variables where they're interpolated. Also added curly braces there. I highly doubt that globbing & word splitting are ever going to be issues, but still wanted to include it for good measure. (Though, should that not be desired, I can quickly update the PR again.) |
|
Is this fix on track to be merged? |
|
@frittentheke I'm not sure what happened with the testing. It will be included in the next main batch. |
debian: recursively adjust permissions of /var/lib/ceph/crash Reviewed-by: Pere Diaz Bou <pdiazbou@redhat.com> Reviewed-by: Kefu Chai <tchaikov@gmail.com>
A rather recent PR made
ceph-crashrun ascephuser instead of root1. However, because/var/lib/ceph/crash/postedbelongs to root,ceph-crashcannot actually post any crash logs now.This commit fixes this by recursively updating the permissions of
/var/lib/ceph/crash, which ensures that all files and directories used byceph-crash.serviceare actually owned by the user configured for Ceph.The previously existing loop has also been replaced by an invocation of
find | xargs.Fixes: https://tracker.ceph.com/issues/64548
Checklist
Further Notes
I'm not 100% sure if this is the best way to fix this issue, so please let me know if you prefer a different solution. This patch is originally from our mailing list2 - we've decided that it's what works best for us, due to
find | xargsbeing a bit more fine-grained than, let's say, something likefor FILE in /var/lib/ceph/**/*; do ...; done.Also, should this get merged (in whatever form), I'll also gladly provide backports for Reef and Quincy.
Footnotes
https://github.com/ceph/ceph/pull/48713 ↩
https://lists.proxmox.com/pipermail/pve-devel/2024-February/061803.html ↩