core: on switching root do not emit device state change based on enumeration results #12013

Merged
keszybz merged 4 commits into systemd:master from yuwata:fix-switchroot-11997 on Apr 2, 2019

@yuwata
Member

@yuwata yuwata commented Mar 15, 2019

Fixes #11997.

yuwata added 3 commits March 15, 2019 18:59
When the system manager is started for the first time or after switching root,
udev's device tag data does not exist yet.
So, let's not honor the enumeration results.

Fixes systemd#11997.
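
The idea in the commit message above can be illustrated with a toy model. This is only a sketch, not the code from the patch: `DeviceState`, `coldplug_state`, and the `udev_db_ready` flag are hypothetical names invented here, and the real logic in systemd is written in C and is more involved. The model assumes a device unit has a state carried over from before the switch root and a state derived from enumeration, and that enumeration reports nothing while udev's tag data is missing.

```python
from enum import Enum


class DeviceState(Enum):
    DEAD = "dead"
    PLUGGED = "plugged"


def coldplug_state(deserialized, enumerated, udev_db_ready):
    """Pick the state a device unit takes when the manager (re)starts.

    deserialized  -- state carried over from before the switch root, or None
    enumerated    -- state derived from enumerating udev-tagged devices
    udev_db_ready -- hypothetical flag: False on first start or right after
                     switching root, when udev's tag data does not exist yet
    """
    if not udev_db_ready and deserialized is not None:
        # Without tag data, "device not found" carries no information,
        # so keep the carried-over state instead of announcing a change.
        return deserialized
    return enumerated


# Right after switching root: enumeration sees nothing, state is kept.
print(coldplug_state(DeviceState.PLUGGED, DeviceState.DEAD, udev_db_ready=False))
# Once udev data is available again, enumeration is trusted.
print(coldplug_state(DeviceState.PLUGGED, DeviceState.DEAD, udev_db_ready=True))
```

In this model, trusting the enumeration while the tag data is missing is what would produce a spurious plugged-to-dead change, followed by a second change once udev reports the device again.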
@yuwata
Member Author

yuwata commented Mar 15, 2019

A test for the issue is added. I confirm that the test fails without the third commit.

@keszybz
Member

keszybz commented Mar 19, 2019

LGTM. @poettering?

@owtaylor
Contributor

Thanks for working on this @yuwata!

I rebuilt the Fedora 30 v241 packages with this patch set on top and:

  • I no longer see 'plugged -> dead -> plugged' for devices in the logs
  • It fixes the intermittent boot failures I was seeing before (worked 5 times in a row - was previously failing around 50% of the time)
  • My system apparently works fine otherwise
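
A check like the first bullet can be automated by scanning the journal for devices that flap. The sketch below assumes debug log lines of the shape `<unit>.device: Changed <old> -> <new>`. That shape is an assumption about the log format rather than something shown in this thread, and the sample lines are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape: "<unit>.device: Changed <old> -> <new>"
TRANSITION = re.compile(
    r"(?P<unit>\S+\.device): Changed (?P<old>\w+) -> (?P<new>\w+)"
)


def find_flaps(lines):
    """Return units that went plugged -> dead and then dead -> plugged."""
    history = defaultdict(list)
    for line in lines:
        match = TRANSITION.search(line)
        if match:
            history[match["unit"]].append((match["old"], match["new"]))

    flapped = []
    for unit, steps in history.items():
        # Look for the two transitions back to back for the same unit.
        for first, second in zip(steps, steps[1:]):
            if first == ("plugged", "dead") and second == ("dead", "plugged"):
                flapped.append(unit)
                break
    return flapped


# Made-up sample input, for illustration only.
sample = [
    "systemd[1]: dev-sda1.device: Changed plugged -> dead",
    "systemd[1]: dev-sda2.device: Changed dead -> plugged",
    "systemd[1]: dev-sda1.device: Changed dead -> plugged",
]
print(find_flaps(sample))
```

On a real system the input would come from the journal with debug logging enabled for the manager. An empty result after applying the patch set would match the observation in the first bullet.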

@yuwata
Member Author

yuwata commented Mar 21, 2019

@owtaylor Thank you for testing this PR.

@keszybz keszybz requested a review from poettering March 26, 2019 09:14
@poettering
Member

So I am not sure about this one. I think we actually do the right thing here currently in systemd, as we make sure the .device unit state stays in sync with the udev db. The problem though is that the udev db is flushed out during the transition (initrd-udevadm-cleanup-db.service does that), though I really wonder why we do that, we really shouldn't...

@keszybz
Member

keszybz commented Apr 2, 2019

After a process of elimination, it seems that this PR is the one to merge. I grok the fact that we might get some stale device information if the device is unplugged at the wrong time during boot, but this seems to be mostly a theoretical problem. Let's merge this, and maybe we can come up with some solution for this potential problem later.

@owtaylor reports that this fixes his case, and my laptop with encrypted LVM also boots successfully.

@keszybz keszybz merged commit 237ebf6 into systemd:master Apr 2, 2019
@poettering
Member

urks, I wish this hadn't been merged. This is racy... if pid1 is reloaded between the switch root and udev being started up everything is still fucked...

@yuwata yuwata deleted the fix-switchroot-11997 branch April 2, 2019 15:32
@keszybz
Member

keszybz commented Apr 3, 2019

Well, the state before this patch was broken (failed boots on Fedora Silverblue), and there were three proposed PRs, two of which broke boot with LVM. With this patch, we at least boot to the login prompt. I'd be very happy to merge a more "proper" solution, once it's proposed.

Dunno, maybe we should revert both this, and the preceding changes that caused the problem, and start from scratch.

@fbuihuu
Contributor

fbuihuu commented Jul 3, 2019

> Dunno, maybe we should revert both this, and the preceding changes that caused the problem, and start from scratch.

Definitely, especially since it's still broken: see https://bugzilla.suse.com/show_bug.cgi?id=1137373.

And I'm not sure why it was reworked in the first place (in commit 66f3fdb), but it has been broken since then :-/

Development

Successfully merging this pull request may close these issues.

During switchroot, devices transition plugged -> dead -> plugged
