Allow non-system users/groups in networkd and udevd again#40612
Allow non-system users/groups in networkd and udevd again#40612yuwata merged 4 commits intosystemd:mainfrom
Conversation
|
So I think we have to do this for compat, because it apparently tripped up a lot of people. I'd tighten the screws a bit, though: indicate that support for this is deprecated and will eventually go away (when and if precisely is tbd, but the msg should not mention that). I think the delegation model is really broken if we pass ownership of device node inodes, and hence I really think we shold get rid of the concept eventually. however, not right-away. (one of those days we should probably add a udev field that explicitly adds access to device nodes via ACLs to users/groups without giving anything else. i.e. |
d87e6a1 to
3d42f7a
Compare
People will not check their logs as long as everything works. If this is allowed again, they will just do nothing and wait it out until it is removed again and scream anew.
This "one of these days" should be "before we disallow this again" The main problem here is/was the lack of communication why this was changed in the first place. Without a good reason users have no understanding and are upset about a change that was done seemingly without any justification. |
3d42f7a to
49304c3
Compare
|
Thank you for working on this. We really need to do something for the compat break. That's true. But, the reason I did is for security concern, that is, a udev .rules file and sysusers conf file may be installed later after conflicting regular user is created. In such case, a device node may be unexpectedly owned by a regular user. Say, if a regular user or group So, I think we should still not allow device nodes or network interfaces being owned by regular users/groups by default. |
|
Of course, that's my bad that such the crucial change is made without any pre-announcements. But, anyway, even it was done little bit brutally, the action was already taken. So, I think we should not step back. I think we should only do that here is providing a way to optionally relax the check, i.e. introducing an environment variable. Of course, that requires people who uses an 'exotic' setup needs to explicitly set the variable. But I think that's fine, as that makes people recognize he/she is doing something potentially dangerous. |
I don't understand this reasoning. The problem is that a group existed and we now try to define a new group with this name and there is confusion. Whether the group is a system group or not is mostly irrelevant. There is no clear distinction between user and system groups, users are routinely members of "system groups".
No, that's not a solution. Requiring an environment variable to be set means that we break the system and then allow users to unbreak it. We should reserve that for extreme cases. By default, things should just continue to work. If we want to deprecate something, then we should go through a cycle of announcements, have the code emit warnings, etc. Not yank something out from underneath our users. |
This reverts (in sprit) commit f5cdf95, "udev-rules: ignore non-system user/group in OWNER=/GROUP=". The original change was done to clean up a situation where we added a new group, but the group could already have been used for some other purposes, and now the some unexpected entity would own the device. Unfortunately, this check doesn't really address the issue, since the existing account might as well be a system account, which might be equally bad. In addition, this change is a big compatiblity break, causing existing rules to stop working. Since quite a lot of systems have local configuration to assign devices to users for various purposes, this is very noticable to users. In a way, the original change to add a new group was the compat break, and follow-up patch to cahnge the rule parsing evolved a small compat break into a much bigger one. There is merit to the change though, since device nodes shouldn't be owned by users and groups and different mechanisms should be used instead. To avoid breaking users systems, and since the original goal cannot be achieved by this patch, let's downgrade this to a warning to guide users towards different solutions.
This reverts (in spirit) "network/tuntap: deny from owning Tun/Tap interfaces", commit 940441b. Justification similar as in the previous commit. The check is only partially connected to the intended purpose and breaks backwards compat without a sufficient reason. Alternative fix for systemd#37279.
49304c3 to
efb7717
Compare
|
Updated. |
|
After looking at f5cdf95, what I understood as the (sole) intent of the change was preventing new groups that a system/distribution might add (in that case This change does not achieve that goal; this is why I introduced This guard can also be achieved in other ways, particularly since new groups would be added by some distro/system action to begin with (e.g. package script's post-install calling groupadd), which can also be a location to check for non-system groups with the same name. But either way I felt I should note this regression in safety checks. |
According to reports, the change is actively breaking systems to the degree that input just straight up doesn't work anymore. If the fix is to add an option to relax the check, that option would certainly need to be set by default. By the discussion in #40597, there is nothing potentially dangerous here; it's primarily just racy on systems that have external dependencies for user/group lookups. This would already fail closed due to not resolving some user/group in almost any circumstance; if needed something like defaulting to uid/gid 0 for names that couldn't be resolved could further tighten that fail-closed behavior. |
I do not pretend to understand everything here, but I would like to say that my 'exotic' system is a 6 month old clean tumbleweed install with KDE plasma wayland, a monitor, a USB keyboard, USB mouse and USB printer and everything USB simply stopped working. It might be a combo of KDE, plasma, systemd, wayland specific but it's definitely not 'exotic' |
The previous change does not avoid the risk when a system group existed and the same system group is now introduced for a different purpose. But for other situations that a regular user and corresponding regular group existed, and now a system group with the same name is introduced on update or so.
As said in the above, that's intentional. To make the users recognize what they do. Or, if a distro wants to strict compat (may be for RHEL), then they can introduce a drop-in config to set the environment variable. If the change was done in this development cycle, then that's fine to revert the change. But the change was done in v258, even not in v259. That's simply too late. Reverting now, and reintroduce the change later is terrible decision. People who use 'exotic' setup will never change there way even we revert the change now, and will complain again when we reenable it, even if we announce loudly. You know, announcement does not change their setup. They will continue to use their setups. We should not step back the change when the change is good at least for long time span and we should anyway enable that in a future. (Though, as I said in the previous comment, that does not mean the previous process that made the change is good. Actually we should once announce before the change was made. But, that's too late, unfortunately.) In short, my opinion is,
|
I think this is where we view things differently. I want to follow the policy described in f4dd927, i.e. revert changes which are reported to cause breakage for users. And I really mean to undo them, i.e. keep things functional for users without any immediate action on their part. The problem with the approach with an environment variable is that it requires the user to take specific complicated action. This simply doesn't scale — an installation is composed of thousands of packages, incl. hundreds of running daemons, and every time something stops working without a slow deprecation period it is a cost for users who have to scramble to fix something after an upgrade. This is the model I really want to move away from, for systemd and for every other component of the system. Please note that the reports that this is causing problem were made in #39056, four days after the release of v258. And it's not just one person, it's a bunch of different people. So it's not "exotic setups", its clearly a fairly common setup.
What I want to happen is:
|
|
OK. Let's take that way. It is of course compat friendly. |
well, i am aware already of two cases in which people were chmoding to 777 hell out of |
It broke USB keyboard handling in stock KDE Neon install too, with its anti-exotic stock configuration. KDE Neon is Ubuntu with latest KDE Plasma desktop on top. The breaking change solves some theoretical problem in rare exotic obscure use-cases by breaking all other use-cases. The new behaviour breaks many existing applications. It is a regression and bug in systemd which should not have passed through CI/CD unit-tests and code reviews -- a development process systems failure. The deluge of bug reports flags this bug the in-house development processes failed to detect. The right course of action is to thank users for free of charge QA and bug reporting (many eye-balls make bugs in open-source software shallow), write unit-tests reproducing the bug and fix the bug. Rejecting bug reports, not admitting the error, and blaming previously perfectly working applications and configurations is acting in bad faith to cover-up mistakes. Mistakes are inevitable. Admitting mistakes and correcting them builds character, trust and respect. Denying and covering up mistakes destroys trust and respect. |
No description provided.