Skip to content

Device fix plug dead plug again#23436

Closed
fbuihuu wants to merge 4 commits intosystemd:mainfrom
fbuihuu:device-fix-plug-dead-plug-again
Closed

Device fix plug dead plug again#23436
fbuihuu wants to merge 4 commits intosystemd:mainfrom
fbuihuu:device-fix-plug-dead-plug-again

Conversation

@fbuihuu
Copy link
Copy Markdown
Contributor

@fbuihuu fbuihuu commented May 19, 2022

No description provided.

fbuihuu added 4 commits May 18, 2022 17:54
Will be used by the following commit.
This is a slightly different approach than the one taken by commit
75d7b59 to fix issue systemd#12953 and systemd#23208.

This patch forces PID1 to forget all devices (except those with the
"db_persist" option see below) that were known by PID1 before switching root by
pretending that the devices were in DEAD state before being serialized. Hence
no artificial "plugged->dead" state transitions happen when PID1 is reexecuting
from a switch root followed by "dead->plugged" state transitions when all
devices are coldplugged with the new set of udev rule from the host.

As mentioned previously, devices with the "db_persistent" option are exceptions
of the previously described mechanism. Since these devices remain in the udev
DB even after the DB has been cleared, they still continue to be deserialized
in plugged state and remain in this state hence following the description of
the option. This should fix the regression introduced by
75d7b59.

Fixes: systemd#23429
Replaces: systemd#23218
…ice_coldplug()

It appears that device_coldplug() isn't really necessary: both the device
"state" and "found" properties can be restored when deserializing the device
unit. While enumerating and during the coldplug phase only "enumerated_found"
is used to calculate the new "found" flags, "state" and "found" being not
considered during this phase. Ultimately "enumerated_found" and the previous
state of the device are used to calculate the new state (if it's changed) in
the "catchup" phase, which is the phase where unit state change events are
generated.
@fbuihuu
Copy link
Copy Markdown
Contributor Author

fbuihuu commented May 19, 2022

@yuwata here's the PR you asked me to create for discussing a possible and simpler way to fix issues addressed by #23218. The second commit contains the fix whereas the last one is a tentative to drop device_coldplug() as it seems to not be needed after all.

Copy link
Copy Markdown
Member

@yuwata yuwata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first and third commits LGTM. I misunderstood them previously, sorry.

However, the second and fourth commits look not correct. Please see the comments below.

* for devices that were unplugged during the initrd transition. */

found = DEVICE_NOT_FOUND;
state = DEVICE_DEAD;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This re-introduces the previous plugged -> dead -> plugged issue for devices without persistent database.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this sounds really wrong.

When this happens (due to PID1switching root), udev is inactive and its DB has been cleared

So this sounds like the issue is caused by this clearing of the database. I know this used to be done, but let's not try to hardcode this (IMHO completely broken) behaviour in even more places.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry but I don't get this. Can you please elaborate ?

Copy link
Copy Markdown
Member

@yuwata yuwata May 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, if a device does NOT have persistent database, then the device is going to dead state here, and later possibly goes to plugged state on uevent. Hence, if the device is already mounted on somewhere, it will be once unmounted. Thus, the original issue is re-introduced.

If a device HAS persistent databse, then the up-to-date flags (and state) will be applied in device_catchup(). Hence, it is not necessary to downgrade the flag and state here. So, the condition added by the second commit is correct.

So, a correct way to fix #23429 is simply add the condition.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the case that a device does NOT have persistent database, and it is mounted on somewhere.
With this PR, the device will enter the dead state, as enumerated_found does not have DEVICE_FOUND_UDEV flag when device_coldplug() is called. Hence, the corresponding mount unit will be stopped. After systemd-udevd processes the device, then the device unit again goes to the plugged state, and the mount unit is restarted.

This is mostly equivalent to the original issue, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

If the device is mounted, then d->enumerated_found=DEVICE_FOUND_MOUNT and the device enters in "tentative state".

Then later udevd is started and the events are replayed. From this point, the device enters in "plugged" state through tentative->plugged transition, which doesn't stop the mount unit.

I think you should give it a try.

assert(d);

/* Second, let's update the state with the enumerated state */
device_update_found_one(d, d->enumerated_found, DEVICE_FOUND_MASK);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does update Device.found and Device.state, and may not trigger required transactions.
E.g. on reload, if d->state == DEVICE_PLUGGED and d->found == d->enumerated_found and they have DEVICE_FOUND_UDEV flag, then device_update_found_one() does not call device_set_state()...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again I'm not seeing which transition that would be needed in your example ? plugged->plugged ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to call device_set_state() at least once in device_coldplug() or device_catchup(), even there is no transition like plugged -> plugged.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ?

@keszybz
Copy link
Copy Markdown
Member

keszybz commented May 19, 2022

2022-05-19T07:47:33.3416115Z ../src/core/device.c: In function ‘device_init’:
2022-05-19T07:47:33.3417051Z ../src/core/device.c:95:17: error: unused variable ‘d’ [-Werror=unused-variable]
2022-05-19T07:47:33.3417572Z    95 |         Device *d = DEVICE(u);
2022-05-19T07:47:33.3417904Z       |                 ^
2022-05-19T07:47:33.3418267Z cc1: all warnings being treated as errors

@yuwata yuwata added pid1 reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels May 19, 2022
@fbuihuu
Copy link
Copy Markdown
Contributor Author

fbuihuu commented May 19, 2022

2022-05-19T07:47:33.3416115Z ../src/core/device.c: In function ‘device_init’:
2022-05-19T07:47:33.3417051Z ../src/core/device.c:95:17: error: unused variable ‘d’ [-Werror=unused-variable]
2022-05-19T07:47:33.3417572Z    95 |         Device *d = DEVICE(u);
2022-05-19T07:47:33.3417904Z       |                 ^
2022-05-19T07:47:33.3418267Z cc1: all warnings being treated as errors

That's strange because d is referenced by the 2 assertions just below the statement of d.

@yuwata
Copy link
Copy Markdown
Member

yuwata commented May 26, 2022

Superseded by #23517. Closing.

@yuwata yuwata closed this May 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pid1 reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks

Development

Successfully merging this pull request may close these issues.

3 participants