os/bluestore: Multiple bdev labels on main block device#55374
src/os/bluestore/BlueStore.h
Outdated
| int _check_or_set_bdev_label(std::string path, uint64_t size, std::string desc, | ||
| bool create); | ||
| int _check_or_set_main_bdev_label( | ||
| std::string path, |
| std::string path, | |
| std::string& path, |
| if (bdev_label_valid_locations.empty()) { | ||
| _read_main_bdev_label(cct, p, &bdev_label, | ||
| &bdev_label_valid_locations, &bdev_label_multi, &bdev_label_epoch); | ||
| } | ||
| if (!bdev_label_valid_locations.empty()) { | ||
| bdev_label.meta[key] = value; | ||
| if (bdev_label_multi) { | ||
| bdev_label_epoch++; | ||
| bdev_label.meta["epoch"] = std::to_string(bdev_label_epoch); | ||
| } | ||
| int r = _write_bdev_label(cct, p, bdev_label, bdev_label_valid_locations); | ||
| ceph_assert(r == 0); | ||
| } | ||
| label.meta[key] = value; | ||
| r = _write_bdev_label(cct, p, label); | ||
| ceph_assert(r == 0); | ||
| return ObjectStore::write_meta(key, value); | ||
| } |
What if bdev_label_valid_locations.empty() == true?
We need to skip writing to the bdev label if no bdev label is available, because mkfs writes "type=bluestore" before the bdev label is created.
Granted, it's weird.
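A minimal sketch of the behavior described in this reply, not the actual BlueStore code: `StoreSketch`, `LabelSketch`, `label_writes`, and `meta_file` are hypothetical stand-ins, while the member names `bdev_label_valid_locations`, `bdev_label_multi`, and `bdev_label_epoch` are taken from the quoted hunk. It shows that a key written before any label exists only reaches the fallback metadata store.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the on-disk label; only the meta map matters here.
struct LabelSketch {
  std::map<std::string, std::string> meta;
};

// Hypothetical stand-in for the BlueStore members used in the quoted hunk.
struct StoreSketch {
  LabelSketch bdev_label;
  std::vector<uint64_t> bdev_label_valid_locations;  // empty until a label exists
  bool bdev_label_multi = false;
  uint64_t bdev_label_epoch = 0;
  int label_writes = 0;                           // counts simulated label writes
  std::map<std::string, std::string> meta_file;   // stands in for ObjectStore::write_meta

  int write_meta(const std::string& key, const std::string& value) {
    // Skip the label update when no valid label location is known,
    // e.g. when mkfs writes "type=bluestore" before the label is created.
    if (!bdev_label_valid_locations.empty()) {
      bdev_label.meta[key] = value;
      if (bdev_label_multi) {
        ++bdev_label_epoch;
        bdev_label.meta["epoch"] = std::to_string(bdev_label_epoch);
      }
      ++label_writes;  // the quoted code calls _write_bdev_label() at this point
    }
    // The key is always recorded in the fallback store, label or not.
    meta_file[key] = value;
    return 0;
  }
};
```

Calling `write_meta("type", "bluestore")` on a fresh `StoreSketch` leaves `label_writes` at zero; once `bdev_label_valid_locations` is non-empty, each call also updates the label and, in multi-label mode, bumps the epoch.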
|
jenkins test make check |
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
5d754bc to a0530b1
a0530b1 to b5356ba
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
| } | ||
| } | ||
| // Mark bits or locations of all bdev labels. | ||
| for (size_t i = 0; i < bdev_label_positions.size(); i++) { |
shouldn't this go over bdev_labels_in_repair instead?
No, we can have bdev label locations that hold proper data but collide with some object.
well, the above comment is a bit confusing then - in fact you mark all the possible bdev locations here. Irrespective of their bluefs usage.
I have a completely different comment here. Let's check after the push.
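To make the point of this thread concrete, here is a rough sketch of marking every possible label location as used, whether or not a valid label was read from it. The function name `mark_label_units`, the bitmap representation, and the `label_size` parameter are assumptions for illustration; the real fsck code works on its own allocation structures.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns one flag per allocation unit; a set flag means the unit overlaps
// a possible bdev label location and must not be treated as free space.
// Positions that do not fit on the device are ignored.
std::vector<bool> mark_label_units(uint64_t device_size,
                                   uint64_t alloc_unit,
                                   const std::vector<uint64_t>& label_positions,
                                   uint64_t label_size) {
  assert(alloc_unit > 0);
  std::vector<bool> used(device_size / alloc_unit, false);
  for (uint64_t pos : label_positions) {
    if (pos + label_size > device_size) {
      continue;  // this replica cannot exist on a device this small
    }
    uint64_t first = pos / alloc_unit;
    uint64_t last = (pos + label_size - 1) / alloc_unit;
    for (uint64_t u = first; u <= last && u < used.size(); ++u) {
      used[u] = true;  // marked regardless of label validity or collisions
    }
  }
  return used;
}
```

Because every candidate position is marked, a location that holds good label data but collides with an object is still reserved, which matches the reply above.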
df3e95d to 06bfcba
|
jenkins test make check |
603ec0e to aad1fb8
|
This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved |
4ea3ecb to 811f297
Signed-off-by: Adam Kupczyk <akupczyk@ibm.com>
|
jenkins test this please |
|
@aclamk this was tested and approved see: https://tracker.ceph.com/issues/67266 |
|
jenkins test api |
BlueStore now writes its metadata at multiple offsets on devices [1].
This means `ceph-volume lvm zap` no longer removes the BlueStore signature completely.
That can confuse ceph-volume when redeploying an OSD on a previously zapped device, because old BlueStore metadata is still present on it.
ceph-volume should call `ceph-bluestore-tool zap-device` [2] in addition to the existing calls when wiping a device.
[1] ceph#55374
[2] ceph#59632
Fixes: https://tracker.ceph.com/issues/68035
Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
BlueStore now writes its metadata at multiple offset on devices [1]. It means `ceph-volume lvm zap` doesn't remove BlueStore signature altogether. This can confuse ceph-volume when redeploying an OSD on a previously zapped device because there is still old BlueStore metadata on it. ceph-volume should call `ceph-bluestore-tool zap-device` [2] in addition to the existing calls when wiping a device. [1] ceph#55374 [2] ceph#59632 Fixes: https://tracker.ceph.com/issues/68035 Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com> (cherry picked from commit dcf7439)
|
Hello! What is the proper way to remove these labels for Ceph v20.2.0 and up? I'm especially targeting a Ceph v18 -> v20 migration. I'm currently using this logic:
This needs to work both on VMs (Ceph as a Docker container) and on K8s clusters (Ceph as a pod).
Corruption of the bdev label makes it very hard to recover an OSD.
This PR copies the bdev label to 4 additional replica locations on the device, so a label can exist at 5 offsets:
0 (original), 1GB, 10GB, 100GB, 1000GB.
If the replicas do not all match, the system refuses to start.
Fsck in repair mode is the way to fix it.
This is an alternative version of #53095. It borrows many concepts and some code from it.
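The replica layout described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: it assumes "GB" means GiB (2^30 bytes), assumes a 4096-byte label, and uses hypothetical names (`kLabelPositions`, `positions_for_device`, `ReadResult`, `newest_replicas`, `all_replicas_agree`). Selecting replicas by highest epoch is likewise an assumption, suggested only by the `epoch` field in the quoted hunk.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

constexpr uint64_t GiB = 1ull << 30;   // assumption: "GB" in the description means GiB
constexpr uint64_t kLabelSize = 4096;  // assumption: size of one label block

// Offsets from the PR description: the original plus four replicas.
const std::vector<uint64_t> kLabelPositions = {
    0, 1 * GiB, 10 * GiB, 100 * GiB, 1000 * GiB};

// Only positions where a whole label fits on the device are usable.
std::vector<uint64_t> positions_for_device(uint64_t device_size) {
  std::vector<uint64_t> out;
  for (uint64_t pos : kLabelPositions) {
    if (pos + kLabelSize <= device_size) {
      out.push_back(pos);
    }
  }
  return out;
}

// Result of reading one candidate position; no epoch means unreadable or corrupt.
struct ReadResult {
  uint64_t position;
  std::optional<uint64_t> epoch;
};

// Positions whose label carries the highest epoch seen among readable labels.
std::vector<uint64_t> newest_replicas(const std::vector<ReadResult>& reads) {
  std::optional<uint64_t> best;
  for (const ReadResult& r : reads) {
    if (r.epoch && (!best || *r.epoch > *best)) {
      best = r.epoch;
    }
  }
  std::vector<uint64_t> out;
  if (!best) {
    return out;
  }
  for (const ReadResult& r : reads) {
    if (r.epoch && *r.epoch == *best) {
      out.push_back(r.position);
    }
  }
  return out;
}

// Models the "refuse to start unless all replicas match" rule.
bool all_replicas_agree(const std::vector<ReadResult>& reads) {
  return !reads.empty() && newest_replicas(reads).size() == reads.size();
}
```

Under these assumptions a 20 GiB device holds three labels (0, 1 GiB, 10 GiB), and a device of exactly 10 GiB holds only two, because the third label would not fit.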