ceph-volume add device_id to inventory listing#25201
ceph-volume add device_id to inventory listing#25201alfredodeza merged 3 commits intoceph:masterfrom
Conversation
This is hopefully more stable then using ID_SERIAL. While often ID_SERIAL contains the model name and serial, in some cases it only contains something close to the actual serial number. Signed-off-by: Jan Fajerski <jfajerski@suse.com>
d46eec5 to
56b3186
Compare
| output['lvs'] = [lv.report() for lv in self.lvs] | ||
| return output | ||
|
|
||
| def _get_device_id(self): |
There was a problem hiding this comment.
I would suggest adding a comment explaining the fact we mimic the blkdev implementation so IDs collected in the inventory can be used in inside ceph.
There was a problem hiding this comment.
Right...lets see where this is going in regards to duplicate implementations.
| Optionally pass a list of properties to return. A requested property might | ||
| not be returned if not present. | ||
| """ | ||
| cmd = ['udevadm', 'info', '--query=property', device] |
There was a problem hiding this comment.
Something good to keep in mind for containers, this will only work if the container is privileged (--privileged).
There was a problem hiding this comment.
Actually not really, my container setup (highly privileged) returns only a few fields and not the one you're looking for:
[root@60a8b848d66e /]# udevadm info --query=property /dev/sdb
DEVNAME=/dev/sdb
DEVPATH=/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host0/target0:1:0/0:1:0:1/block/sdb
DEVTYPE=disk
MAJOR=8
MINOR=16
SUBSYSTEM=block
There was a problem hiding this comment.
Ok answering my own question :)
We must bindmount /var/run/udev/ to get what we want, see -v /var/run/udev/:/var/run/udev/
There was a problem hiding this comment.
Right. I kept the implementation close to what c++ land was doing. I think something based on /sys/block/$dev/ would also work.
Also afaiu its yet unclear if and when we can run Ceph in unpreviliged containers.
There was a problem hiding this comment.
I don't think we can achieve non-privileged containers yet, but if we can retry the number of bindmount to apply by doing a "walkpath" in /sys approach that'd be easier for everyone using containers I believe :).
There was a problem hiding this comment.
ceph-volume has to require super user permissions in a ton of calls (look for @needs_root decorator in class methods). Not only because LVM requires it, but also because there are a lot of utilities that need it. I doubt this can be simplified for only a subset of paths
There was a problem hiding this comment.
IMO we should rely on udev here. Getting the same info ourselves out of /sys is complex and error-prone--that's why we are using udev in the first place.
There was a problem hiding this comment.
I have put in a fair bit of thought to running Ceph without privileged containers, and my thoughts have been that in the short-term, privileged containers are the best option for getting main functionality in place initially. Once things are working well, paring down privileges in containers are still extremely valuable optimizations to make.
Thoughts I have around OSDs in Rook are that the relatively short-lived rook-ceph-osd-prepare-... pods in rook are at less risk for security concerns due to privilege than are the constantly-running rook-ceph-osd-<id> pods. If ceph-volume can be limited to the prepare pod (which I believe is the intent), focusing effort on reducing permissions in the osd pods themselves is the more important place to focus on privilege needs. OSD daemons also likely represent the largest percentage of running Ceph daemons by far, so to my reasoning, security enhancements to the OSD daemons have the biggest return on investment for developer time spent.
A related writeup of my now probably outdated investigation as pertains to Rook is here: rook/rook#1944
| out, _err, _rc = process.call(cmd) | ||
| ret = {} | ||
| for line in out: | ||
| p, v = line.split('=', 1) |
There was a problem hiding this comment.
this should catch a TypeError as that can be the case when the split doesn't have the two items being assigned here.
I would recommend splitting the system call with a parser, similar to what we've done with blkid and lsblk because it is easier to test the parsing vs. mixing it with the system call (this PR needs a few of those to ensure that this function is working as expected).
Curious that you have implemented this, I'm working on removing partitions as part of lvm zap which would need udevadm, here is where I left that implementation at:
def udevadm_info(device):
"""
Capture device information with udevadm, useful in situations where blkid and lsblk aren't
producing enough useful output.
"""
command = ['udevadm', 'info', '--query=property', '--name', device]
out, err, rc = process.call(command)
return _udevadm_parser(out)
def _udevadm_parser(output):
"""
Parses the output from a udevadm call, requires output to be
produced using the ``info``, specifying the actual --query type, to remove prefixes in the output.
Expected output::
# udevadm info --query=property --name /dev/sdb
DEVLINKS=/dev/disk/by-path/pci-0000:00:14.0-scsi-0:0:3:0
DEVNAME=/dev/sdb
DEVPATH=/devices/pci0000:00/0000:00:14.0/host3/target3:0:3/3:0:3:0/block/sdb
DEVTYPE=disk
ID_BUS=scsi
ID_MODEL=HARDDISK
ID_MODEL_ENC=HARDDISK\x20\x20\x20\x20\x20\x20\x20\x20
ID_PART_TABLE_TYPE=gpt
ID_PART_TABLE_UUID=bea7a874-c406-46b2-98bd-414f062f2eb5
ID_PATH=pci-0000:00:14.0-scsi-0:0:3:0
ID_PATH_TAG=pci-0000_00_14_0-scsi-0_0_3_0
ID_REVISION=1.0
ID_SCSI=1
ID_TYPE=disk
ID_VENDOR=VBOX
ID_VENDOR_ENC=VBOX\x20\x20\x20\x20
MAJOR=8
...
"""
processed = {}
for line in output:
try:
key, value = line.split('=')
except ValueError:
continue
processed[key] = value
return processed
There was a problem hiding this comment.
I would strongly argue against catching an error here.
- It's highly unlikely the output of udevadm is ever going to change.
- You propose to silently eat an error in the udevadm output. What is the benefit of that? Fail early is imho much better here as this is an error (udevadm giving output that is not dividable at a
=) that seems impossible to recover from. Catching exceptions is meant for error handling, not error hiding.
There was a problem hiding this comment.
The output of udevadm might not change at all for a while, but the parser must be able to withstand lines that don't comply exactly to what it needs.
There is no actual error eating that is silent here, as all system calls that fail are logged to the configured file (would be caused by a non-zero exit status). This is one of the benefits of implementing a parser separately: it doesn't care about error handling from a non-zero exit status, makes testing far easier (just a list of strings is needed).
If we wanted to fail hard when udevadm is not giving us exactly what we need because of incomplete/invalid data or a non-zero exit status we could improve the handling in the function that does the system call.
My example is crufty (work in progress) and you are catching nits that need to be improved for sure! The main idea is to separate the concerns for easier/better unit testing.
There was a problem hiding this comment.
My 2 cents:
- My reading of the code posted by @alfredodeza continuing after
ValueErroris that I would expect that code to silently eat the failure if the output changes, which also seems a bad idea to me. If that is not the actual behavior of the code, then it is obscured from me, and I believe that section can be improved to be more clear about the real-world behavior. - Separation of concerns (i.e., parsing output separate from calling udevadm) does seem like a good thing to do for unit testing, and I like that the expected output is spelled out clearly in Alfredo's example. It may be verbose, but as a newcomer to this codebase, it's incredibly helpful to me.
- Allowing the parser to withstand lines that don't exactly match the expected input seems a double-edged sword to me. It may be able to continue functioning in the case of some output change, and it also may begin to function erroneously in an unexpected way in such a case. In my opinion, matching the line exactly is a better solution, as it will be much easier to debug an error at the location of the parser failure than to debug an error which is caused later in the program's lifetime due to an invalid value being parsed and passed on.
There was a problem hiding this comment.
@BlaineEXE I think we are in agreement :) Like I mentioned, the patch should not be taken as final/production quality because it is a work-in-progress on another branch that hasn't gone out for review.
I agree that parsing shouldn't eat things up (although failed udev calls are logged), and logging them sounds reasonable.
In short, my ask is to separate concerns for better unit testing, just like the other system calls/functions that exist in that module.
There was a problem hiding this comment.
The output of udevadm might not change at all for a while, but the parser must be able to withstand lines that don't comply exactly to what it needs.
There is no actual error eating that is silent here, as all system calls that fail are logged to the configured file (would be caused by a non-zero exit status). This is one of the benefits of implementing a parser separately: it doesn't care about error handling from a non-zero exit status, makes testing far easier (just a list of strings is needed).
@alfredodeza We are starting to mix issues here. A non-zero exit status is (and should not) be handled here. That is what process.call does. The issue here is udevadm returning something other then $KEY=$VALUE with a zero exit status. My code will raise a ValueError (in case split returns a list of length 1...i.e. the output has no =) and I think its bad to catch this and not handle the error. If we could handle the error so the code can continue I'd be all for it, but I don't see a way to handle an unknown udevadm return format.
If we wanted to fail hard when udevadm is not giving us exactly what we need because of incomplete/invalid data or a non-zero exit status we could improve the handling in the function that does the system call.
How and why would process.call care about the output format of udevadm? We could add a function parameter to call so it does call an injected parse method, but that would not change the fact, that we can not recover from udevadm returning an unknown format.
@BlaineEXE sees my point I think:
Allowing the parser to withstand lines that don't exactly match the expected input seems a double-edged sword to me. It may be able to continue functioning in the case of some output change, and it also may begin to function erroneously in an unexpected way in such a case. In my opinion, matching the line exactly is a better solution, as it will be much easier to debug an error at the location of the parser failure than to debug an error which is caused later in the program's lifetime due to an invalid value being parsed and passed on.
There was a problem hiding this comment.
Just split it into two parts: a parser, and a CLI handler, like the rest of the module. This is easier to test. I'll leave the other details on error handling to you, happy to review that.
Seems like I caused more confusion with my wip example than it help clarify intent. Sorry about that
There was a problem hiding this comment.
Oh certainly, a function split makes sense. I was really just arguing about error handling. And no worries, I like a good techinical discussion.
|
@wjwithagen any new info from the FreeBSD side? |
In order to be able to retrieve udev information, we must expose its socket. As per, ceph/ceph#25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com>
In order to be able to retrieve udev information, we must expose its socket. As per, ceph/ceph#25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com>
In order to be able to retrieve udev information, we must expose its socket. As per, ceph/ceph#25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com>
In order to be able to retrieve udev information, we must expose its socket. As per, ceph/ceph#25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com>
|
We'll end up exposing / to get the whole thing working. That starts to be worrying. Containers were made to run simple tasks and we endup with a nearly full host exposing :( |
Yeah, Containers were not designed with OSDs in mind. |
In order to be able to retrieve udev information, we must expose its socket. As per, ceph/ceph#25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com>
This intends to mimic the C++ implementation in src/common/blkdev.cc. Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
56b3186 to
58316e3
Compare
|
@Sage @ErwanAliasr1 @leseb @alfredodeza I think we can go ahead with this if everybody is on board to switch to using ID_MODEL + _ + ID_SERIAL_SHORT instead of only ID_SERIAL. The python implementation returns an empty string on FreeBSD but it seems like ceph-volume isn't fully functional there anyway. |
|
Looks good to me. Thx for your work.
Le ven. 30 nov. 2018 à 16:29, Jan Fajerski <notifications@github.com> a
écrit :
… @Sage <https://github.com/Sage> @ErwanAliasr1
<https://github.com/ErwanAliasr1> @leseb <https://github.com/leseb>
@alfredodeza <https://github.com/alfredodeza> I think we can go ahead
with this if everybody is on board to switch to using ID_MODEL + _ +
ID_SERIAL_SHORT instead of only ID_SERIAL. The python implementation
returns an empty string on FreeBSD but it seems like ceph-volume isn't
fully functional there anyway.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#25201 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AEVEHfmgWkcBL4BUH3eUXKtS5UA0qvGUks5u0U7wgaJpZM4YtD2w>
.
|
|
@jan--f one last question (the PR looks fine), we have been backporting ceph-volume PRs all the way back to Luminous. Is the C++ change able to be backported to Luminous along with this PR? It might get complicated to get this branch in Luminous otherwise |
|
On Fri, 30 Nov 2018, Alfredo Deza wrote:
@jan--f one last question (the PR looks fine), we have been backporting ceph-volume PRs all the way back to Luminous. Is the C++ change able to be backported to Luminous along with this PR?
It might get complicated to get this branch in Luminous otherwise
Nope... the device_id code is new in nautilus. As long as it's in a
separate commit message it should be trivial to skip that patch!
|
In order to be able to retrieve udev information, we must expose its socket. As per, ceph/ceph#25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit 997667a)
In order to be able to retrieve udev information, we must expose its socket. As per, ceph/ceph#25201 ceph-volume will start consuming udev output. Signed-off-by: Sébastien Han <seb@redhat.com> (cherry picked from commit 997667a)
This PR changes the device_id implementation from using
ID_SERIALtoID_MODEL + _ + ID_SERIAL_SHORT(1) and adds an equivalent python implementation to ceph-volume (2).The reasoning is that the new implementation seem more reliable.
ID_SERIALoften contains the model and serial, but sometimes only the serial.The python implementation is not equivalent yet for the FreeBSD case. In C++ this uses sysfs and
ioctlwith a (Free?)BSD specific constant. I'm not sure yet of the same call can be made with python'sioctlimplementation. If so this should also be a RFC if we should move this to python/ceph-volume and call it from C++ to avoid the duplicate implementation.Fixes: https://tracker.ceph.com/issues/37083