Skip to content

ceph-volume add device_id to inventory listing#25201

Merged
alfredodeza merged 3 commits intoceph:masterfrom
jan--f:c-v-inventory-add-device_id
Nov 30, 2018
Merged

ceph-volume add device_id to inventory listing#25201
alfredodeza merged 3 commits intoceph:masterfrom
jan--f:c-v-inventory-add-device_id

Conversation

@jan--f
Copy link
Contributor

@jan--f jan--f commented Nov 21, 2018

This PR changes the device_id implementation from using ID_SERIAL to ID_MODEL + _ + ID_SERIAL_SHORT (1) and adds an equivalent python implementation to ceph-volume (2).

  1. The reasoning is that the new implementation seem more reliable. ID_SERIAL often contains the model and serial, but sometimes only the serial.

  2. The python implementation is not equivalent yet for the FreeBSD case. In C++ this uses sysfs and ioctl with a (Free?)BSD specific constant. I'm not sure yet of the same call can be made with python's ioctl implementation. If so this should also be a RFC if we should move this to python/ceph-volume and call it from C++ to avoid the duplicate implementation.

Fixes: https://tracker.ceph.com/issues/37083

This is hopefully more stable then using ID_SERIAL. While often
ID_SERIAL contains the model name and serial, in some cases it only
contains something close to the actual serial number.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
@jan--f jan--f force-pushed the c-v-inventory-add-device_id branch from d46eec5 to 56b3186 Compare November 21, 2018 12:54
output['lvs'] = [lv.report() for lv in self.lvs]
return output

def _get_device_id(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest adding a comment explaining the fact we mimic the blkdev implementation so IDs collected in the inventory can be used in inside ceph.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right...lets see where this is going in regards to duplicate implementations.

Optionally pass a list of properties to return. A requested property might
not be returned if not present.
"""
cmd = ['udevadm', 'info', '--query=property', device]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something good to keep in mind for containers, this will only work if the container is privileged (--privileged).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually not really, my container setup (highly privileged) returns only a few fields and not the one you're looking for:

[root@60a8b848d66e /]# udevadm info --query=property /dev/sdb
DEVNAME=/dev/sdb
DEVPATH=/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host0/target0:1:0/0:1:0:1/block/sdb
DEVTYPE=disk
MAJOR=8
MINOR=16
SUBSYSTEM=block

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok answering my own question :)
We must bindmount /var/run/udev/ to get what we want, see -v /var/run/udev/:/var/run/udev/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I kept the implementation close to what c++ land was doing. I think something based on /sys/block/$dev/ would also work.

Also afaiu its yet unclear if and when we can run Ceph in unpreviliged containers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can achieve non-privileged containers yet, but if we can retry the number of bindmount to apply by doing a "walkpath" in /sys approach that'd be easier for everyone using containers I believe :).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ceph-volume has to require super user permissions in a ton of calls (look for @needs_root decorator in class methods). Not only because LVM requires it, but also because there are a lot of utilities that need it. I doubt this can be simplified for only a subset of paths

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should rely on udev here. Getting the same info ourselves out of /sys is complex and error-prone--that's why we are using udev in the first place.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have put in a fair bit of thought to running Ceph without privileged containers, and my thoughts have been that in the short-term, privileged containers are the best option for getting main functionality in place initially. Once things are working well, paring down privileges in containers are still extremely valuable optimizations to make.

Thoughts I have around OSDs in Rook are that the relatively short-lived rook-ceph-osd-prepare-... pods in rook are at less risk for security concerns due to privilege than are the constantly-running rook-ceph-osd-<id> pods. If ceph-volume can be limited to the prepare pod (which I believe is the intent), focusing effort on reducing permissions in the osd pods themselves is the more important place to focus on privilege needs. OSD daemons also likely represent the largest percentage of running Ceph daemons by far, so to my reasoning, security enhancements to the OSD daemons have the biggest return on investment for developer time spent.

A related writeup of my now probably outdated investigation as pertains to Rook is here: rook/rook#1944

out, _err, _rc = process.call(cmd)
ret = {}
for line in out:
p, v = line.split('=', 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should catch a TypeError as that can be the case when the split doesn't have the two items being assigned here.

I would recommend splitting the system call with a parser, similar to what we've done with blkid and lsblk because it is easier to test the parsing vs. mixing it with the system call (this PR needs a few of those to ensure that this function is working as expected).

Curious that you have implemented this, I'm working on removing partitions as part of lvm zap which would need udevadm, here is where I left that implementation at:

def udevadm_info(device):
    """
    Capture device information with udevadm, useful in situations where blkid and lsblk aren't
    producing enough useful output.
    """
    command = ['udevadm', 'info', '--query=property', '--name', device]
    out, err, rc = process.call(command)
    return _udevadm_parser(out)


def _udevadm_parser(output):
    """
    Parses the output from a udevadm call, requires output to be
    produced using the ``info``, specifying the actual --query type, to remove prefixes in the output.

    Expected output::

        # udevadm info --query=property --name /dev/sdb
        DEVLINKS=/dev/disk/by-path/pci-0000:00:14.0-scsi-0:0:3:0
        DEVNAME=/dev/sdb
        DEVPATH=/devices/pci0000:00/0000:00:14.0/host3/target3:0:3/3:0:3:0/block/sdb
        DEVTYPE=disk
        ID_BUS=scsi
        ID_MODEL=HARDDISK
        ID_MODEL_ENC=HARDDISK\x20\x20\x20\x20\x20\x20\x20\x20
        ID_PART_TABLE_TYPE=gpt
        ID_PART_TABLE_UUID=bea7a874-c406-46b2-98bd-414f062f2eb5
        ID_PATH=pci-0000:00:14.0-scsi-0:0:3:0
        ID_PATH_TAG=pci-0000_00_14_0-scsi-0_0_3_0
        ID_REVISION=1.0
        ID_SCSI=1
        ID_TYPE=disk
        ID_VENDOR=VBOX
        ID_VENDOR_ENC=VBOX\x20\x20\x20\x20
        MAJOR=8
        ...
    """
    processed = {}
    for line in output:
        try:
            key, value = line.split('=')
        except ValueError:
            continue
        processed[key] = value
    return processed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would strongly argue against catching an error here.

  1. It's highly unlikely the output of udevadm is ever going to change.
  2. You propose to silently eat an error in the udevadm output. What is the benefit of that? Fail early is imho much better here as this is an error (udevadm giving output that is not dividable at a =) that seems impossible to recover from. Catching exceptions is meant for error handling, not error hiding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output of udevadm might not change at all for a while, but the parser must be able to withstand lines that don't comply exactly to what it needs.

There is no actual error eating that is silent here, as all system calls that fail are logged to the configured file (would be caused by a non-zero exit status). This is one of the benefits of implementing a parser separately: it doesn't care about error handling from a non-zero exit status, makes testing far easier (just a list of strings is needed).

If we wanted to fail hard when udevadm is not giving us exactly what we need because of incomplete/invalid data or a non-zero exit status we could improve the handling in the function that does the system call.

My example is crufty (work in progress) and you are catching nits that need to be improved for sure! The main idea is to separate the concerns for easier/better unit testing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My 2 cents:

  • My reading of the code posted by @alfredodeza continuing after ValueError is that I would expect that code to silently eat the failure if the output changes, which also seems a bad idea to me. If that is not the actual behavior of the code, then it is obscured from me, and I believe that section can be improved to be more clear about the real-world behavior.
  • Separation of concerns (i.e., parsing output separate from calling udevadm) does seem like a good thing to do for unit testing, and I like that the expected output is spelled out clearly in Alfredo's example. It may be verbose, but as a newcomer to this codebase, it's incredibly helpful to me.
  • Allowing the parser to withstand lines that don't exactly match the expected input seems a double-edged sword to me. It may be able to continue functioning in the case of some output change, and it also may begin to function erroneously in an unexpected way in such a case. In my opinion, matching the line exactly is a better solution, as it will be much easier to debug an error at the location of the parser failure than to debug an error which is caused later in the program's lifetime due to an invalid value being parsed and passed on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BlaineEXE I think we are in agreement :) Like I mentioned, the patch should not be taken as final/production quality because it is a work-in-progress on another branch that hasn't gone out for review.

I agree that parsing shouldn't eat things up (although failed udev calls are logged), and logging them sounds reasonable.

In short, my ask is to separate concerns for better unit testing, just like the other system calls/functions that exist in that module.

Copy link
Contributor Author

@jan--f jan--f Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The output of udevadm might not change at all for a while, but the parser must be able to withstand lines that don't comply exactly to what it needs.

There is no actual error eating that is silent here, as all system calls that fail are logged to the configured file (would be caused by a non-zero exit status). This is one of the benefits of implementing a parser separately: it doesn't care about error handling from a non-zero exit status, makes testing far easier (just a list of strings is needed).

@alfredodeza We are starting to mix issues here. A non-zero exit status is (and should not) be handled here. That is what process.call does. The issue here is udevadm returning something other then $KEY=$VALUE with a zero exit status. My code will raise a ValueError (in case split returns a list of length 1...i.e. the output has no =) and I think its bad to catch this and not handle the error. If we could handle the error so the code can continue I'd be all for it, but I don't see a way to handle an unknown udevadm return format.

If we wanted to fail hard when udevadm is not giving us exactly what we need because of incomplete/invalid data or a non-zero exit status we could improve the handling in the function that does the system call.

How and why would process.call care about the output format of udevadm? We could add a function parameter to call so it does call an injected parse method, but that would not change the fact, that we can not recover from udevadm returning an unknown format.

@BlaineEXE sees my point I think:

Allowing the parser to withstand lines that don't exactly match the expected input seems a double-edged sword to me. It may be able to continue functioning in the case of some output change, and it also may begin to function erroneously in an unexpected way in such a case. In my opinion, matching the line exactly is a better solution, as it will be much easier to debug an error at the location of the parser failure than to debug an error which is caused later in the program's lifetime due to an invalid value being parsed and passed on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just split it into two parts: a parser, and a CLI handler, like the rest of the module. This is easier to test. I'll leave the other details on error handling to you, happy to review that.

Seems like I caused more confusion with my wip example than it help clarify intent. Sorry about that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh certainly, a function split makes sense. I was really just arguing about error handling. And no worries, I like a good techinical discussion.

@jan--f
Copy link
Contributor Author

jan--f commented Nov 22, 2018

@wjwithagen any new info from the FreeBSD side?

leseb added a commit to ceph/ceph-ansible that referenced this pull request Nov 26, 2018
In order to be able to retrieve udev information, we must expose its
socket. As per, ceph/ceph#25201 ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
leseb added a commit to ceph/ceph-ansible that referenced this pull request Nov 26, 2018
In order to be able to retrieve udev information, we must expose its
socket. As per, ceph/ceph#25201 ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
leseb added a commit to ceph/ceph-ansible that referenced this pull request Nov 26, 2018
In order to be able to retrieve udev information, we must expose its
socket. As per, ceph/ceph#25201 ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
leseb added a commit to ceph/ceph-ansible that referenced this pull request Nov 26, 2018
In order to be able to retrieve udev information, we must expose its
socket. As per, ceph/ceph#25201 ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
@ErwanAliasr1
Copy link
Contributor

We'll end up exposing / to get the whole thing working. That starts to be worrying. Containers were made to run simple tasks and we endup with a nearly full host exposing :(

@sebastian-philipp
Copy link
Contributor

Containers were made to run simple tasks and we endup with a nearly full host exposing :(

Yeah, Containers were not designed with OSDs in mind.

mergify bot pushed a commit to ceph/ceph-ansible that referenced this pull request Nov 26, 2018
In order to be able to retrieve udev information, we must expose its
socket. As per, ceph/ceph#25201 ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
Jan Fajerski added 2 commits November 28, 2018 10:21
This intends to mimic the C++ implementation in src/common/blkdev.cc.

Signed-off-by: Jan Fajerski <jfajerski@suse.com>
Signed-off-by: Jan Fajerski <jfajerski@suse.com>
@jan--f jan--f force-pushed the c-v-inventory-add-device_id branch from 56b3186 to 58316e3 Compare November 28, 2018 09:23
@jan--f jan--f changed the title [DNM] ceph-volume add device_id to inventory listing ceph-volume add device_id to inventory listing Nov 30, 2018
@jan--f jan--f removed the DNM label Nov 30, 2018
@jan--f
Copy link
Contributor Author

jan--f commented Nov 30, 2018

@Sage @ErwanAliasr1 @leseb @alfredodeza I think we can go ahead with this if everybody is on board to switch to using ID_MODEL + _ + ID_SERIAL_SHORT instead of only ID_SERIAL. The python implementation returns an empty string on FreeBSD but it seems like ceph-volume isn't fully functional there anyway.

Copy link
Member

@liewegas liewegas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ErwanAliasr1
Copy link
Contributor

ErwanAliasr1 commented Nov 30, 2018 via email

Copy link
Member

@leseb leseb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ID_ from udev is fine, thanks!

@alfredodeza
Copy link
Contributor

@jan--f one last question (the PR looks fine), we have been backporting ceph-volume PRs all the way back to Luminous. Is the C++ change able to be backported to Luminous along with this PR?

It might get complicated to get this branch in Luminous otherwise

@liewegas
Copy link
Member

liewegas commented Nov 30, 2018 via email

@alfredodeza alfredodeza merged commit b8a64b5 into ceph:master Nov 30, 2018
guits pushed a commit to ceph/ceph-ansible that referenced this pull request Feb 5, 2019
In order to be able to retrieve udev information, we must expose its
socket. As per, ceph/ceph#25201 ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 997667a)
mergify bot pushed a commit to ceph/ceph-ansible that referenced this pull request Feb 6, 2019
In order to be able to retrieve udev information, we must expose its
socket. As per, ceph/ceph#25201 ceph-volume will
start consuming udev output.

Signed-off-by: Sébastien Han <seb@redhat.com>
(cherry picked from commit 997667a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants