ignition-ostree-firstboot-uuid: nuke libblkid cache after UUID restamp by jlebon · Pull Request #2181 · coreos/fedora-coreos-config

jlebon · 2023-01-18T23:47:29Z

We're hitting an issue right now where `coreos-ignition-unique-boot.service` (backed by `rdcore`) is failing on multipath with:

Error: System has 2 devices with a filesystem labeled 'boot': ["/dev/sdb3", "/dev/mapper/mpatha3"]

The unique label detection code in rdcore determines whether multiple lower-level devices actually refer to the same higher-level device (e.g. multipath or RAID1) by looking at the filesystem UUID. It uses blkid to query device UUIDs.

libblkid maintains a cache of devices to avoid reprobing all devices all the time. This cache normally gets updated (I think via udev, but I'm not sure) when changes occur. But something changed recently, at least in the multipath case: the cache is now only updated for the multipathed device, not for the underlying backing paths.

After the UUID restamp, the backing path therefore still shows the stale UUID in the cache, which leads rdcore to think that they're separate devices. We probably should make rdcore smarter in how it handles multipath devices, but we still don't want this stale cache around for the sake of other tools relying on it.
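To make the failure mode concrete, here is a minimal shell sketch of the UUID-based deduplication idea. It is not rdcore's actual code (rdcore is written in Rust); the helper name `count_distinct_uuids` and the `DEVICE UUID` input format are made up for illustration.

```shell
# Hypothetical helper: reads "DEVICE UUID" pairs on stdin and prints how many
# distinct filesystem UUIDs appear among them.
count_distinct_uuids() {
    awk 'NF >= 2 { print $2 }' | sort -u | wc -l | tr -d ' '
}

# Fresh data: both paths report the same UUID, so they count as one filesystem.
printf '%s\n' '/dev/sdb3 1111-aaaa' '/dev/mapper/mpatha3 1111-aaaa' | count_distinct_uuids

# Stale cache: the backing path still reports the pre-restamp UUID, so the
# same filesystem is counted twice.
printf '%s\n' '/dev/sdb3 0000-old' '/dev/mapper/mpatha3 1111-aaaa' | count_distinct_uuids
```

A count greater than one for the 'boot' label is the situation in which a uniqueness check of this kind would report an error.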

We started hitting this more frequently starting with kernel v6.0.17, but the issue triggers just as easily on v6.0.16 when reproduced artificially. So I think we've just been lucky so far that this hasn't bitten us (possibly we raced with another service that helped refresh the cache).

There's likely a bug here in the kernel, multipath, or blkid. This is tracked by https://bugzilla.redhat.com/show_bug.cgi?id=2162151. Until that's resolved, nuke the blkid cache to force a reprobe on the next call.

Closes: coreos/fedora-coreos-tracker#1373

dustymabe · 2023-01-19T01:07:52Z

overlay.d/05core/usr/lib/dracut/modules.d/40ignition-ostree/ignition-ostree-firstboot-uuid

+  # Workaround for https://bugzilla.redhat.com/show_bug.cgi?id=2162151.
+  # We nuke the blkid cache containing stale UUIDs so that future blkid calls
+  # (or tools leveraging libblkid) will be forced to re-probe.
+  rm -rf /run/blkid


The only concern I have here is this racing with something else accessing the data at the same time. I don't think it's enough to block anything, though, as this is clearly solving a problem that we're hitting in the pipeline multiple times a day.

I don't think that's a concern. Checking the code quickly, it just does an `open` on the cache file. If we delete it before that, it will treat it as a cache miss and rebuild. If we delete it after, it'll still have an fd to the file.
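The "delete it after" case relies on ordinary POSIX unlink semantics: a process that already holds an open file descriptor can keep reading after the path is removed. The following is a small sketch of that general behaviour using a temporary file, not libblkid's code; the function name `read_after_unlink` is hypothetical.

```shell
# Demonstrates that a reader holding an open fd keeps access to a deleted file.
read_after_unlink() {
    tmp=$(mktemp) || return 1
    echo 'cached entry' > "$tmp"
    exec 3< "$tmp"      # the reader opens the "cache" first
    rm -f "$tmp"        # the file is deleted while the fd is still open
    read -r line <&3    # the read still succeeds through the open fd
    exec 3<&-           # close the fd
    printf '%s\n' "$line"
}

read_after_unlink
```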

dustymabe

LGTM

dustymabe · 2023-01-19T01:08:20Z

Thank you for working on this @jlebon

karelzak · 2023-01-19T17:56:14Z

Don't use cached blkid at all. If you really cannot use lsblk and you really need to read the device then use blkid -p to avoid the cache.

jlebon · 2023-01-19T18:09:18Z

Don't use cached blkid at all. If you really cannot use lsblk and you really need to read the device then use blkid -p to avoid the cache.

Ack, thanks for the info. I think there were instances in the past where lsblk wouldn't return information that blkid would (e.g. coreos/coreos-installer#813), which is why we changed to use blkid. But now I wonder if what happened there was that the device became inaccessible somehow and blkid worked because it was returning cached data. We'll have to revisit that.

karelzak · 2023-01-19T18:39:44Z

lsblk uses the udev DB as its primary source; if udev is unavailable, it falls back to blkid (without cache). But if you have had a bad experience with lsblk, then using blkid is fine, but always with `-p` ;-)

By default, `blkid` will return cached data, which we don't want because it might be stale. Add `-p` to make sure we always bypass the cache. Some of these callsites could probably be changed to use `lsblk`, which uses the udev database, but it's safer to keep using `blkid`. See also: coreos#2181 (comment)
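As an illustration of what such a callsite might look like, here is a minimal sketch. The wrapper name `probe_uuid` is hypothetical and not taken from the overlay scripts; `-p`, `-o value`, and `-s UUID` are standard `blkid` options.

```shell
# Hypothetical wrapper: print the filesystem UUID of a device by probing it
# directly (-p) instead of consulting the libblkid cache.
probe_uuid() {
    blkid -p -o value -s UUID "$1"
}
```

It would be called as `probe_uuid /dev/mapper/mpatha3`; probing a block device directly typically requires root.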

By default, `blkid` will return cached data, which we don't want because it might be stale. Add `-p` to make sure we always bypass the cache.

We originally used `lsblk` here, which uses the udev database, but this was changed in coreos#813 because it failed to return the filesystem label in some instances. It might be worth revisiting this at some point to find out whether we were just missing a `udevadm settle` somewhere.

See also: coreos/fedora-coreos-config#2181 (comment)

jlebon · 2023-01-19T20:08:03Z

We should be able to revert this once coreos/coreos-installer#1094 gets into FCOS.

…D restamp"

This reverts commit e41fd27. We shouldn't need this anymore now that we don't rely on the cache:
- coreos#2184
- coreos/coreos-installer#1094

See also: coreos#2181

By default, `blkid` will return cached data, which we don't want because it might be stale. We need to use `-p` to make sure it directly probes the block devices and bypasses the cache.

With `-p`, `blkid` requires passing the devices directly. Call it once to gather the list of devices (we trust the cache enough for this) and then again with `-p`.

We originally used `lsblk` here, which uses the udev database, but this was changed in coreos#813 because it failed to return the filesystem label in some instances. It might be worth revisiting this at some point to find out whether we were just missing a `udevadm settle` somewhere.

See also: coreos/fedora-coreos-config#2181 (comment)
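The two-step approach described above can be sketched in shell as follows. This is only an illustration of the idea, not the coreos-installer implementation; the function name `devices_with_label` is hypothetical, and the sketch assumes `blkid -o device` is used for the initial device listing.

```shell
# Hypothetical helper: print every device whose freshly probed filesystem
# label matches the first argument.
devices_with_label() {
    want=$1
    # Step 1: list candidate devices (this call may consult the cache).
    blkid -o device | while read -r dev; do
        # Step 2: probe each device directly, bypassing the cache.
        label=$(blkid -p -o value -s LABEL "$dev" 2>/dev/null) || continue
        if [ "$label" = "$want" ]; then
            printf '%s\n' "$dev"
        fi
    done
}
```

For example, `devices_with_label boot` would list the devices currently carrying the `boot` label according to a direct probe. Devices that fail to probe are skipped rather than treated as errors, which is a design choice of this sketch.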

jlebon · 2023-01-20T18:31:39Z

lsblk uses the udev DB as its primary source; if udev is unavailable, it falls back to blkid (without cache). But if you have had a bad experience with lsblk, then using blkid is fine, but always with `-p` ;-)

Is there a way to force lsblk to not use udev if we suspect that we may be running in a scenario where the udev db is not reliable? I couldn't find any relevant option in lsblk(8). `udevadm settle` is not foolproof, unfortunately, and we've been bitten by its inherent raciness in the past.

karelzak · 2023-01-30T12:42:49Z

Is there a way to force lsblk to not use udev if we suspect that we may be running in a scenario where the udev db is not reliable?

This is currently impossible, added to TODO: util-linux/util-linux#2047

jlebon mentioned this pull request Jan 18, 2023

Test iso-offline-install on multipath on ppc64le and aarch64 failing coreos-ignition-unique-boot.service check coreos/fedora-coreos-tracker#1373

Closed

dustymabe reviewed Jan 19, 2023

dustymabe approved these changes Jan 19, 2023

dustymabe merged commit e41fd27 into coreos:testing-devel Jan 19, 2023

jlebon mentioned this pull request Jan 19, 2023

overlay: always use -p with blkid #2184

Merged

jlebon mentioned this pull request Jan 19, 2023

blockdev: use -p when calling blkid coreos/coreos-installer#1094

Merged

jlebon mentioned this pull request Jan 19, 2023

Revert "ignition-ostree-firstboot-uuid: nuke libblkid cache after UUID restamp" #2187

Merged

karelzak mentioned this pull request Jan 30, 2023

Add --properties-by={udev,blkid,file} to lsblk util-linux/util-linux#2047

Closed

jlebon deleted the pr/blkid-cache branch April 23, 2023 23:28

jlebon mentioned this pull request Jul 16, 2024

install: Use sfdisk, not lsblk bootc-dev/bootc#688

Merged
