Skip to content

ignition-ostree-firstboot-uuid: nuke libblkid cache after UUID restamp#2181

Merged
dustymabe merged 1 commit intocoreos:testing-develfrom
jlebon:pr/blkid-cache
Jan 19, 2023
Merged

ignition-ostree-firstboot-uuid: nuke libblkid cache after UUID restamp#2181
dustymabe merged 1 commit intocoreos:testing-develfrom
jlebon:pr/blkid-cache

Conversation

@jlebon
Copy link
Member

@jlebon jlebon commented Jan 18, 2023

We're hitting an issue right now where
coreos-ignition-unique-boot.service (backed by rdcore) is failing on multipath with:

Error: System has 2 devices with a filesystem labeled 'boot': ["/dev/sdb3", "/dev/mapper/mpatha3"]

The unique label detection code in rdcore determines whether multiple lower-level devices actually refer to the same higher-level device (e.g. multipath or RAID1) by looking at the filesystem UUID. It uses blkid to query device UUIDs.

libblkid maintains a cache of devices to avoid reprobing all devices all the time. This cache normally gets updated (I think via udev, but I'm not sure) when changes occur. But something changed recently at least in the multipath case where the cache is only updated for the multipathed device, but not the underlying backing paths.

This then leads rdcore to think that they're separate devices. We probably should make rdcore smarter here in how it handles multipath devices, but still we don't want to have this stale cache around for the sake of other tools relying on it.

We started hitting this more frequently starting with kernel v6.0.17, but the issue triggers equally as easily on v6.0.16 when reproduced artificially. So I think we've just been lucky so far that this hasn't bit us (possibly we raced with another service that helped refresh the cache).

There's likely a bug here either in the kernel, or multipath or blkid. This is tracked by https://bugzilla.redhat.com/show_bug.cgi?id=2162151. Until then, nuke the blkid cache to force a reprobe on the next call.

Closes: coreos/fedora-coreos-tracker#1373

We're hitting an issue right now where
`coreos-ignition-unique-boot.service` (backed by `rdcore`) is failing
on multipath with:

```
Error: System has 2 devices with a filesystem labeled 'boot': ["/dev/sdb3", "/dev/mapper/mpatha3"]
```

The unique label detection code in `rdcore` determines whether multiple
lower-level devices actually refer to the same higher-level device (e.g.
multipath or RAID1) by looking at the filesystem UUID. It uses blkid to
query device UUIDs.

libblkid maintains a cache of devices to avoid reprobing all devices
all the time. This cache normally gets updated (I *think* via udev,
but I'm not sure) when changes occur. But something changed recently
at least in the multipath case where the cache is only updated for the
multipathed device, but not the underlying backing paths.

This then leads `rdcore` to think that they're separate devices. We
probably should make `rdcore` smarter here in how it handles multipath
devices, but still we don't want to have this stale cache around for
the sake of other tools relying on it.

We started hitting this more frequently starting with kernel v6.0.17,
but the issue triggers equally as easily on v6.0.16 when reproduced
artificially. So I think we've just been lucky so far that this hasn't
bit us (possibly we raced with another service that helped refresh the
cache).

There's likely a bug here either in the kernel, or multipath or blkid.
This is tracked by https://bugzilla.redhat.com/show_bug.cgi?id=2162151.
Until then, nuke the blkid cache to force a reprobe on the next call.

Closes: coreos/fedora-coreos-tracker#1373
# Workaround for https://bugzilla.redhat.com/show_bug.cgi?id=2162151.
# We nuke the blkid cache containing stale UUIDs so that future blkid calls
# (or tools leveraging libblkid) will be forced to re-probe.
rm -rf /run/blkid
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only concern I have here is this racing with something else accessing the data at the same time, but I don't think it's enough to block anything here as this clearly is solving a problem that we're hitting in the pipeline multiple times a day.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's a concern. Checking the code quickly, it just does an open to open the cache file. If we delete it before, it will treat it as a cache miss and rebuild. If we delete it after, it'll still have an fd to the file.

Copy link
Member

@dustymabe dustymabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dustymabe
Copy link
Member

Thank you for working on this @jlebon

@dustymabe dustymabe merged commit e41fd27 into coreos:testing-devel Jan 19, 2023
@karelzak
Copy link

Don't use cached blkid at all. If you really cannot use lsblk and you really need to read the device then use blkid -p to avoid the cache.

@jlebon
Copy link
Member Author

jlebon commented Jan 19, 2023

Don't use cached blkid at all. If you really cannot use lsblk and you really need to read the device then use blkid -p to avoid the cache.

Ack, thanks for the info. I think there were instances in the past where lsblk wouldn't return information that blkid would (e.g. coreos/coreos-installer#813), which is why we changed to use blkid. But now I wonder if what happened there was that the device became inaccessible somehow and blkid worked because it was returning cached data. We'll have to revisit that.

@karelzak
Copy link

lsblk uses udev DB as a primary source; if udev is unavailable, it falls back to blkid (without cache), but if have a bad experience with lsblk then using blkid is fine, but always with `-p' ;-)

jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jan 19, 2023
By default, `blkid` will return cached data, which we don't want because
it might be stale. Add `-p` to make sure we always bypass the cache.
Some of these callsites could probably be changed to use `lsblk`, which
uses the udev database, but it's safer to keep using `blkid`.

See also: coreos#2181 (comment)
jlebon added a commit to jlebon/coreos-installer that referenced this pull request Jan 19, 2023
By default, `blkid` will return cached data, which we don't want because
it might be stale. Add `-p` to make sure we always bypass the cache. We
originally used `lsblk` here which uses the udev database, but this was
changed in coreos#813 because
it failed to return the filesystem label in some instances. It might be
worth revisiting this at some point and find out if we were just missing
a `udevadm settle` somewhere.

See also: coreos/fedora-coreos-config#2181 (comment)
jlebon added a commit to jlebon/coreos-installer that referenced this pull request Jan 19, 2023
By default, `blkid` will return cached data, which we don't want because
it might be stale. Add `-p` to make sure we always bypass the cache. We
originally used `lsblk` here which uses the udev database, but this was
changed in coreos#813 because
it failed to return the filesystem label in some instances. It might be
worth revisiting this at some point and find out if we were just missing
a `udevadm settle` somewhere.

See also: coreos/fedora-coreos-config#2181 (comment)
@jlebon
Copy link
Member Author

jlebon commented Jan 19, 2023

We should be able to revert this once coreos/coreos-installer#1094 gets into FCOS.

jlebon added a commit to jlebon/fedora-coreos-config that referenced this pull request Jan 19, 2023
…D restamp"

This reverts commit e41fd27.

We shouldn't need this anymore now that we don't rely on the cache:
- coreos#2184
- coreos/coreos-installer#1094

See also: coreos#2181
dustymabe pushed a commit that referenced this pull request Jan 19, 2023
By default, `blkid` will return cached data, which we don't want because
it might be stale. Add `-p` to make sure we always bypass the cache.
Some of these callsites could probably be changed to use `lsblk`, which
uses the udev database, but it's safer to keep using `blkid`.

See also: #2181 (comment)
jlebon added a commit to jlebon/coreos-installer that referenced this pull request Jan 20, 2023
By default, `blkid` will return cached data, which we don't want because
it might be stale. We need to use`-p` to make sure it directly probes
the block devices and bypasses the cache.

With `-p`, `blkid` requires passing the devices directly. Call it once
to gather the list of devices (we trust the cache enough for this) and
then again with `-p`.

We originally used `lsblk` here which uses the udev database, but this
was changed in coreos#813
because it failed to return the filesystem label in some instances. It
might be worth revisiting this at some point and find out if we were
just missing a `udevadm settle` somewhere.

See also: coreos/fedora-coreos-config#2181 (comment)
@jlebon
Copy link
Member Author

jlebon commented Jan 20, 2023

lsblk uses udev DB as a primary source; if udev is unavailable, it falls back to blkid (without cache), but if have a bad experience with lsblk then using blkid is fine, but always with `-p' ;-)

Is there a way to force lsblk to not use udev if we suspect that we may be running in a scenario where the udev db is not reliable? I couldn't find any relevant option in lsblk(8). udevadm settle is not foolproof unfortunately and we've been bit by its inherent raciness in the past.

@karelzak
Copy link

Is there a way to force lsblk to not use udev if we suspect that we may be running in a scenario where the udev db is not reliable?

This is currently impossible, added to TODO: util-linux/util-linux#2047

jlebon added a commit that referenced this pull request Mar 31, 2023
…D restamp"

This reverts commit e41fd27.

We shouldn't need this anymore now that we don't rely on the cache:
- #2184
- coreos/coreos-installer#1094

See also: #2181
@jlebon jlebon deleted the pr/blkid-cache branch April 23, 2023 23:28
c4rt0 pushed a commit to c4rt0/fedora-coreos-config that referenced this pull request May 17, 2023
…D restamp"

This reverts commit e41fd27.

We shouldn't need this anymore now that we don't rely on the cache:
- coreos#2184
- coreos/coreos-installer#1094

See also: coreos#2181
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
By default, `blkid` will return cached data, which we don't want because
it might be stale. Add `-p` to make sure we always bypass the cache.
Some of these callsites could probably be changed to use `lsblk`, which
uses the udev database, but it's safer to keep using `blkid`.

See also: coreos#2181 (comment)
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
…D restamp"

This reverts commit e41fd27.

We shouldn't need this anymore now that we don't rely on the cache:
- coreos#2184
- coreos/coreos-installer#1094

See also: coreos#2181
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
By default, `blkid` will return cached data, which we don't want because
it might be stale. Add `-p` to make sure we always bypass the cache.
Some of these callsites could probably be changed to use `lsblk`, which
uses the udev database, but it's safer to keep using `blkid`.

See also: coreos#2181 (comment)
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this pull request Oct 10, 2023
…D restamp"

This reverts commit e41fd27.

We shouldn't need this anymore now that we don't rely on the cache:
- coreos#2184
- coreos/coreos-installer#1094

See also: coreos#2181
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Test iso-offline-install on multipath on ppc64le and aarch64 failing coreos-ignition-unique-boot.service check

3 participants