
mgr/rook: implement orch device zap in rook orchestrator #43138

Merged — liewegas merged 1 commit into ceph:master from josephsawaya:wip-mgr-rook-device-zap on Oct 20, 2021

Conversation

@josephsawaya

This PR implements orch device zap by creating a Batch Job that creates a pod on the target host; the pod mounts the /dev directory and runs `ceph-volume lvm zap --destroy <path>`.

Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
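The approach in the description can be illustrated with a minimal, hypothetical sketch: building a Kubernetes Job body (as a plain dict) whose pod is pinned to the target host, mounts the host's /dev, and zaps the given device. The specific field values here (job name, namespace, image, node selector) are illustrative assumptions, not the PR's actual code:

```python
# Hypothetical sketch of the zap Job described above (not the PR's code).
def make_zap_job_body(host: str, path: str) -> dict:
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "device-zap", "namespace": "rook-ceph"},
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    # Pin the pod to the host that owns the device.
                    "nodeSelector": {"kubernetes.io/hostname": host},
                    "containers": [{
                        "name": "device-zap",
                        "image": "rook/ceph:master",
                        # Writing to raw block devices requires privilege.
                        "securityContext": {"privileged": True},
                        "command": ["bash"],
                        "args": ["-c", f"ceph-volume lvm zap --destroy {path}"],
                        # Make the host's /dev visible inside the container.
                        "volumeMounts": [{"name": "devices", "mountPath": "/dev"}],
                    }],
                    "volumes": [{"name": "devices",
                                 "hostPath": {"path": "/dev"}}],
                }
            }
        },
    }

body = make_zap_job_body("node1", "/dev/vdb")
```

With a live cluster, such a dict could be submitted via `kubernetes.client.BatchV1Api().create_namespaced_job("rook-ceph", body)`, the call visible in the review hunks further down.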

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


@leseb (Member) left a comment:

You also need to address raw OSDs.

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

self.batchV1_api.create_namespaced_job('rook-ceph', body)


no err/status returned?

@josephsawaya (Author)

> You also need to address raw OSDs.

From what I can tell, `ceph-volume lvm zap` is the best option right now; there is no `ceph-volume raw zap`, and there doesn't seem to be any command in ceph-volume for zapping raw devices specifically.

@liewegas (Member)

> You also need to address raw OSDs.
>
> From what I can tell, `ceph-volume lvm zap` is the best option right now; there is no `ceph-volume raw zap`, and there doesn't seem to be any command in ceph-volume for zapping raw devices specifically.

Sorting out zapping raw devices from ceph-volume is on the todo list!

@leseb (Member) commented Sep 14, 2021

> You also need to address raw OSDs.
>
> From what I can tell, `ceph-volume lvm zap` is the best option right now; there is no `ceph-volume raw zap`, and there doesn't seem to be any command in ceph-volume for zapping raw devices specifically.
>
> Sorting out zapping raw devices from ceph-volume is on the todo list!

Yeah, but for now we could just `dd` the first few MiB of the disk if it's a raw OSD so that the PR is complete. Up to you.

@jmolmo (Member) commented Sep 14, 2021

A few doubts here; probably the roadmap ahead just isn't clear to me. At the moment we can only create OSDs using PVs, we have no information about the physical details of the devices, and we don't know whether a specific device is used by a PV or by several PVCs.
Is it OK to manage PVs (without showing any device information) in commands like `ceph orch device ls` and at the same time provide a command to destroy a physical device? Is it OK to launch the job without any verification of how the device is being used?

@jmolmo (Member) commented Sep 14, 2021

I think I understand now... this is just to make the teuthology test possible:
#43139
...but it is probably highly dangerous outside teuthology.

@josephsawaya force-pushed the wip-mgr-rook-device-zap branch from 4dda0b1 to 30342b3 on September 14, 2021 13:47
@josephsawaya (Author)

> I think I understand now... this is just to make the teuthology test possible:
> #43139
> ...but it is probably highly dangerous outside teuthology.

Yeah for now it looks like this should just be used in teuthology.

> You also need to address raw OSDs.
>
> From what I can tell, `ceph-volume lvm zap` is the best option right now; there is no `ceph-volume raw zap`, and there doesn't seem to be any command in ceph-volume for zapping raw devices specifically.
>
> Sorting out zapping raw devices from ceph-volume is on the todo list!
>
> Yeah, but for now we could just `dd` the first few MiB of the disk if it's a raw OSD so that the PR is complete. Up to you.

I just went ahead and changed it to use `dd` in the case of a raw device.
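The resulting fallback logic can be sketched with a hypothetical helper (the name `zap_command` is not in the PR) that builds the same shell command visible in the review hunks below: if `ceph-volume raw list` reports the device, it is treated as a raw OSD and overwritten with `dd`; otherwise `ceph-volume lvm zap --destroy` handles the LVM case.

```python
# Hypothetical helper mirroring the PR's raw-vs-LVM fallback command.
def zap_command(path: str) -> str:
    return (
        # Raw OSD: ceph-volume raw list knows the device, so dd its header.
        f"ceph-volume raw list | grep {path} && "
        f'dd if=/dev/zero of="{path}" bs=1M count=100 oflag=direct,dsync '
        # LVM OSD: fall back to ceph-volume lvm zap.
        f"|| ceph-volume lvm zap --destroy {path}"
    )

cmd = zap_command("/dev/vdb")
```

Note that the `&& ... ||` chaining means a failed `dd` would also fall through to `lvm zap`, which is harmless here but worth knowing when reading the command.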

@liewegas (Member)

jenkins test make check

self.rook_cluster.create_zap_job(host, path)
except Exception as e:
logging.error(e)
return OrchResult(None, "Unable to zap device" + str(e.with_traceback(None)))

Suggested change:
- return OrchResult(None, "Unable to zap device" + str(e.with_traceback(None)))
+ return OrchResult(None, "Unable to zap device " + str(e.with_traceback(None)))

Or is OrchResult printing nicely already?

name="device-zap",
image="rook/ceph:master",
command=["bash"],
args=["-c", f"ceph-volume raw list | grep {path} && dd if=/dev/zero of=\"{path}\" bs=1M count=100 oflag=direct,dsync || ceph-volume lvm zap --destroy {path}"],

Suggested change:
- args=["-c", f"ceph-volume raw list | grep {path} && dd if=/dev/zero of=\"{path}\" bs=1M count=100 oflag=direct,dsync || ceph-volume lvm zap --destroy {path}"],
+ args=["-c", f"ceph-volume raw list {path} && dd if=/dev/zero of=\"{path}\" bs=1M count=100 oflag=direct,dsync || ceph-volume lvm zap --destroy {path}"],

This should be the same. IIRC the list supports arguments.


Also, I think count=1 is probably more than enough, since the bluestore signature is only 60 bytes.


One more thing: `lvm zap` accepts an OSD FSID and/or OSD ID as a parameter; I wonder if it's safer to use that instead of passing the disk. Just a thought.

@liewegas (Member)

rook/__init__.py:5: note: In module imported here:
rook/module.py: note: In member "zap_device" of class "RookOrchestrator":
rook/module.py:452: error: Argument 2 to "OrchResult" has incompatible type "str"; expected "Optional[Exception]"
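The mypy error above says `OrchResult`'s second argument is annotated `Optional[Exception]`, so the fix is to pass the exception object itself rather than a string. A hypothetical sketch of that fix follows; the `OrchResult` class below is a tiny stand-in for illustration, not the real `orchestrator.OrchResult`, and `zap_device` is simplified accordingly:

```python
# Minimal stand-in for orchestrator.OrchResult (illustration only).
class OrchResult:
    def __init__(self, result, exception=None):
        self.result = result
        self.exception = exception  # annotated Optional[Exception] upstream

def zap_device(path: str) -> OrchResult:
    try:
        # Stand-in for the zap-job creation call failing.
        raise RuntimeError(f"Unable to zap device {path}")
    except Exception as e:
        # Pass the Exception object, not str(e), to satisfy the annotation.
        return OrchResult(None, e)

res = zap_device("/dev/vdb")
```

Keeping the original exception also preserves the traceback for callers, instead of flattening it into a message string.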

@josephsawaya force-pushed the wip-mgr-rook-device-zap branch from 2840b01 to d4107b6 on October 14, 2021 21:35
@liewegas (Member)

jenkins test make check

)
ret.append(spec)
return ret
def create_zap_job(self, host: str, path: str) -> None:

missing newline above

This commit implements orch device zap by creating a pod on the target host that mounts the /dev directory and either overwrites the first few blocks of the device with zeros if it's a raw device, or uses `ceph-volume lvm zap` if it's not.

Signed-off-by: Joseph Sawaya <jsawaya@redhat.com>
@josephsawaya force-pushed the wip-mgr-rook-device-zap branch from d4107b6 to 22631ec on October 18, 2021 19:06
@liewegas (Member)

jenkins test api

@sebastian-philipp (Contributor)

jenkins test api

@liewegas liewegas merged commit 78f8f78 into ceph:master Oct 20, 2021