Skip to content

Allow non-system users/groups in networkd and udevd again#40612

Merged
yuwata merged 4 commits intosystemd:mainfrom
keszybz:udevd-networkd-allow-users-again
Feb 13, 2026
Merged

Allow non-system users/groups in networkd and udevd again#40612
yuwata merged 4 commits intosystemd:mainfrom
keszybz:udevd-networkd-allow-users-again

Conversation

@keszybz
Copy link
Member

@keszybz keszybz commented Feb 9, 2026

No description provided.

@keszybz keszybz requested review from poettering and yuwata February 9, 2026 11:02
@github-actions github-actions bot added udev network documentation please-review PR is ready for (re-)review by a maintainer labels Feb 9, 2026
@poettering
Copy link
Member

So I think we have to do this for compat, because it apparently tripped up a lot of people. I'd tighten the screws a bit, though: indicate that support for this is deprecated and will eventually go away (when and if precisely is tbd, but the msg should not mention that).

I think the delegation model is really broken if we pass ownership of device node inodes, and hence I really think we shold get rid of the concept eventually. however, not right-away.

(one of those days we should probably add a udev field that explicitly adds access to device nodes via ACLs to users/groups without giving anything else. i.e. ACL_USER=foobar:rw and ACL_GROUP=waldo:r or so

@poettering poettering added reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks and removed please-review PR is ready for (re-)review by a maintainer labels Feb 9, 2026
@bluca bluca linked an issue Feb 9, 2026 that may be closed by this pull request
@keszybz keszybz force-pushed the udevd-networkd-allow-users-again branch from d87e6a1 to 3d42f7a Compare February 9, 2026 22:51
@github-actions github-actions bot added catalog please-review PR is ready for (re-)review by a maintainer and removed reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Feb 9, 2026
@septatrix
Copy link
Contributor

So I think we have to do this for compat, because it apparently tripped up a lot of people. I'd tighten the screws a bit, though: indicate that support for this is deprecated and will eventually go away (when and if precisely is tbd, but the msg should not mention that).

People will not check their logs as long as everything works. If this is allowed again, they will just do nothing and wait it out until it is removed again and scream anew.

(one of those days we should probably add a udev field that explicitly adds access to device nodes via ACLs to users/groups without giving anything else. i.e. ACL_USER=foobar:rw and ACL_GROUP=waldo:r or so

This "one of these days" should be "before we disallow this again"


The main problem here is/was the lack of communication why this was changed in the first place. Without a good reason users have no understanding and are upset about a change that was done seemingly without any justification.

@keszybz keszybz force-pushed the udevd-networkd-allow-users-again branch from 3d42f7a to 49304c3 Compare February 10, 2026 07:38
@github-actions github-actions bot added the tests label Feb 10, 2026
@yuwata
Copy link
Member

yuwata commented Feb 10, 2026

Thank you for working on this.

We really need to do something for the compat break. That's true.

But, the reason I did is for security concern, that is, a udev .rules file and sysusers conf file may be installed later after conflicting regular user is created. In such case, a device node may be unexpectedly owned by a regular user. Say, if a regular user or group clock may be already exist, then now clock devices may be controlled by the user or users in the group.
When it is allowed, people needs to check all accounts set in OWNER=/GROUP= are actually system accounts on every package upgrade. Otherwise, spurious ownership may be sneaked on package upgrade or system upgrade.

So, I think we should still not allow device nodes or network interfaces being owned by regular users/groups by default.
To keep backward compat, maybe better to introduce a environment variable to relax the restriction, and warn about the check is relaxed and that may cause security issues.

@yuwata
Copy link
Member

yuwata commented Feb 10, 2026

Of course, that's my bad that such the crucial change is made without any pre-announcements. But, anyway, even it was done little bit brutally, the action was already taken. So, I think we should not step back.

I think we should only do that here is providing a way to optionally relax the check, i.e. introducing an environment variable. Of course, that requires people who uses an 'exotic' setup needs to explicitly set the variable. But I think that's fine, as that makes people recognize he/she is doing something potentially dangerous.

@keszybz
Copy link
Member Author

keszybz commented Feb 10, 2026

Say, if a regular user or group clock may be already exist, then now clock devices may be controlled by the user or users in the group. When it is allowed, people needs to check all accounts set in OWNER=/GROUP= are actually system accounts on every package upgrade. Otherwise, spurious ownership may be sneaked on package upgrade or system upgrade.

I don't understand this reasoning. The problem is that a group existed and we now try to define a new group with this name and there is confusion. Whether the group is a system group or not is mostly irrelevant. There is no clear distinction between user and system groups, users are routinely members of "system groups".

To keep backward compat, maybe better to introduce a environment variable to relax the restriction, and warn about the check is relaxed and that may cause security issues.

No, that's not a solution. Requiring an environment variable to be set means that we break the system and then allow users to unbreak it. We should reserve that for extreme cases. By default, things should just continue to work. If we want to deprecate something, then we should go through a cycle of announcements, have the code emit warnings, etc. Not yank something out from underneath our users.

@poettering poettering added reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks and removed please-review PR is ready for (re-)review by a maintainer labels Feb 10, 2026
This reverts (in sprit) commit f5cdf95,
"udev-rules: ignore non-system user/group in OWNER=/GROUP=".

The original change was done to clean up a situation where we added a
new group, but the group could already have been used for some other
purposes, and now the some unexpected entity would own the device.
Unfortunately, this check doesn't really address the issue, since the
existing account might as well be a system account, which might be
equally bad. In addition, this change is a big compatiblity break,
causing existing rules to stop working. Since quite a lot of systems
have local configuration to assign devices to users for various
purposes, this is very noticable to users. In a way, the original change
to add a new group was the compat break, and follow-up patch to cahnge
the rule parsing evolved a small compat break into a much bigger one.
There is merit to the change though, since device nodes shouldn't be
owned by users and groups and different mechanisms should be used
instead. To avoid breaking users systems, and since the original goal
cannot be achieved by this patch, let's downgrade this to a warning
to guide users towards different solutions.
This reverts (in spirit) "network/tuntap: deny from owning Tun/Tap interfaces",
commit 940441b.

Justification similar as in the previous commit. The check is only
partially connected to the intended purpose and breaks backwards compat
without a sufficient reason.

Alternative fix for systemd#37279.
@keszybz keszybz force-pushed the udevd-networkd-allow-users-again branch from 49304c3 to efb7717 Compare February 11, 2026 10:39
@github-actions github-actions bot added please-review PR is ready for (re-)review by a maintainer and removed reviewed/needs-rework 🔨 PR has been reviewed and needs another round of reworks labels Feb 11, 2026
@keszybz
Copy link
Member Author

keszybz commented Feb 11, 2026

Updated.

@eqvinox
Copy link

eqvinox commented Feb 11, 2026

After looking at f5cdf95, what I understood as the (sole) intent of the change was preventing new groups that a system/distribution might add (in that case clock) from accidentally clashing with an existing user-setup group. Which could give users unwanted access to some devices.

This change does not achieve that goal; this is why I introduced SYSOWNER SYSGROUP in #40597 (and made it automatically consider OWNER as SYSOWNER and GROUP as SYSGROUP for all rules in /lib/udev/rules.d).

This guard can also be achieved in other ways, particularly since new groups would be added by some distro/system action to begin with (e.g. package script's post-install calling groupadd), which can also be a location to check for non-system groups with the same name. But either way I felt I should note this regression in safety checks.

@eqvinox
Copy link

eqvinox commented Feb 11, 2026

Of course, that's my bad that such the crucial change is made without any pre-announcements. But, anyway, even it was done little bit brutally, the action was already taken. So, I think we should not step back.

I think we should only do that here is providing a way to optionally relax the check, i.e. introducing an environment variable. Of course, that requires people who uses an 'exotic' setup needs to explicitly set the variable. But I think that's fine, as that makes people recognize he/she is doing something potentially dangerous.

According to reports, the change is actively breaking systems to the degree that input just straight up doesn't work anymore. If the fix is to add an option to relax the check, that option would certainly need to be set by default.

By the discussion in #40597, there is nothing potentially dangerous here; it's primarily just racy on systems that have external dependencies for user/group lookups. This would already fail closed due to not resolving some user/group in almost any circumstance; if needed something like defaulting to uid/gid 0 for names that couldn't be resolved could further tighten that fail-closed behavior.

@alexvkaam
Copy link

I think we should only do that here is providing a way to optionally relax the check, i.e. introducing an environment variable. Of course, that requires people who uses an 'exotic' setup needs to explicitly set the variable. But I think that's fine, as that makes people recognize he/she is doing something potentially dangerous.

I do not pretend to understand everything here, but I would like to say that my 'exotic' system is a 6 month old clean tumbleweed install with KDE plasma wayland, a monitor, a USB keyboard, USB mouse and USB printer and everything USB simply stopped working. It might be a combo of KDE, plasma, systemd, wayland specific but it's definitely not 'exotic'

@yuwata
Copy link
Member

yuwata commented Feb 13, 2026

@keszybz

Say, if a regular user or group clock may be already exist, then now clock devices may be controlled by the user or users in the group. When it is allowed, people needs to check all accounts set in OWNER=/GROUP= are actually system accounts on every package upgrade. Otherwise, spurious ownership may be sneaked on package upgrade or system upgrade.

I don't understand this reasoning. The problem is that a group existed and we now try to define a new group with this name and there is confusion. Whether the group is a system group or not is mostly irrelevant. There is no clear distinction between user and system groups, users are routinely members of "system groups".

The previous change does not avoid the risk when a system group existed and the same system group is now introduced for a different purpose. But for other situations that a regular user and corresponding regular group existed, and now a system group with the same name is introduced on update or so.

To keep backward compat, maybe better to introduce a environment variable to relax the restriction, and warn about the check is relaxed and that may cause security issues.

No, that's not a solution. Requiring an environment variable to be set means that we break the system and then allow users to unbreak it. We should reserve that for extreme cases. By default, things should just continue to work. If we want to deprecate something, then we should go through a cycle of announcements, have the code emit warnings, etc. Not yank something out from underneath our users.

As said in the above, that's intentional. To make the users recognize what they do. Or, if a distro wants to strict compat (may be for RHEL), then they can introduce a drop-in config to set the environment variable.

If the change was done in this development cycle, then that's fine to revert the change. But the change was done in v258, even not in v259. That's simply too late. Reverting now, and reintroduce the change later is terrible decision. People who use 'exotic' setup will never change there way even we revert the change now, and will complain again when we reenable it, even if we announce loudly. You know, announcement does not change their setup. They will continue to use their setups. We should not step back the change when the change is good at least for long time span and we should anyway enable that in a future.

(Though, as I said in the previous comment, that does not mean the previous process that made the change is good. Actually we should once announce before the change was made. But, that's too late, unfortunately.)

In short, my opinion is,

  • we should provide a way to 'support' such 'exotic' setups through env var,
  • we should not change the default behavior,
  • if necessary, distro can provide a drop-in to set env var, but upstream should not do.

@yuwata yuwata added the rc-blocker 🚧 PRs and Issues tagged this way are blocking the upcoming rc release! label Feb 13, 2026
@keszybz
Copy link
Member Author

keszybz commented Feb 13, 2026

As said in the above, that's intentional. To make the users recognize what they do. Or, if a distro wants to strict compat (may be for RHEL), then they can introduce a drop-in config to set the environment variable.

I think this is where we view things differently. I want to follow the policy described in f4dd927, i.e. revert changes which are reported to cause breakage for users. And I really mean to undo them, i.e. keep things functional for users without any immediate action on their part. The problem with the approach with an environment variable is that it requires the user to take specific complicated action. This simply doesn't scale — an installation is composed of thousands of packages, incl. hundreds of running daemons, and every time something stops working without a slow deprecation period it is a cost for users who have to scramble to fix something after an upgrade. This is the model I really want to move away from, for systemd and for every other component of the system.

Please note that the reports that this is causing problem were made in #39056, four days after the release of v258. And it's not just one person, it's a bunch of different people. So it's not "exotic setups", its clearly a fairly common setup.

  • we should provide a way to 'support' such 'exotic' setups through env var,
  • we should not change the default behavior,
  • if necessary, distro can provide a drop-in to set env var, but upstream should not do.

What I want to happen is:

  • Merge this PR so that users' systems continue to work
  • We added info in NEWS and are now emitting deprecation notices so users who look at logs will learn about the issue.
  • Somebody should implement the proposed functionality to set ACLs through a declarative rule, without shelling out. We can then document this as the appropriate way to delegate privileges.
  • Then we decide what to do. One option would be extend the warning to suggest using the new delegation mechanism, another option might be to simply automatically degrade OWNER=/GROUP= to an ACL for non-system accounts. The overall goal is to give the users a clear off-ramp how to change their system that keeps the desired functionality without hacks.

@yuwata
Copy link
Member

yuwata commented Feb 13, 2026

OK. Let's take that way. It is of course compat friendly.

@yuwata yuwata merged commit a1c8e5e into systemd:main Feb 13, 2026
52 of 56 checks passed
@github-actions github-actions bot removed the please-review PR is ready for (re-)review by a maintainer label Feb 13, 2026
@yuwata yuwata removed ci-failure-appears-unrelated rc-blocker 🚧 PRs and Issues tagged this way are blocking the upcoming rc release! labels Feb 13, 2026
@filipmnowak
Copy link

Otherwise, spurious ownership may be sneaked on package upgrade or system upgrade.

well, i am aware already of two cases in which people were chmoding to 777 hell out of /dev to "make it work" again.
breaking change is well-intentioned, sure, but effect might be exactly opposite of what we understand by "security" (and also "user experience").

@keszybz keszybz deleted the udevd-networkd-allow-users-again branch February 16, 2026 09:47
@max0x7ba
Copy link

max0x7ba commented Mar 7, 2026

I think we should only do that here is providing a way to optionally relax the check, i.e. introducing an environment variable. Of course, that requires people who uses an 'exotic' setup needs to explicitly set the variable. But I think that's fine, as that makes people recognize he/she is doing something potentially dangerous.

I do not pretend to understand everything here, but I would like to say that my 'exotic' system is a 6 month old clean tumbleweed install with KDE plasma wayland, a monitor, a USB keyboard, USB mouse and USB printer and everything USB simply stopped working. It might be a combo of KDE, plasma, systemd, wayland specific but it's definitely not 'exotic'

It broke USB keyboard handling in stock KDE Neon install too, with its anti-exotic stock configuration. KDE Neon is Ubuntu with latest KDE Plasma desktop on top.

The breaking change solves some theoretical problem in rare exotic obscure use-cases by breaking all other use-cases. The new behaviour breaks many existing applications. It is a regression and bug in systemd which should not have passed through CI/CD unit-tests and code reviews -- a development process systems failure.

The deluge of bug reports flags this bug the in-house development processes failed to detect. The right course of action is to thank users for free of charge QA and bug reporting (many eye-balls make bugs in open-source software shallow), write unit-tests reproducing the bug and fix the bug.

Rejecting bug reports, not admitting the error, and blaming previously perfectly working applications and configurations is acting in bad faith to cover-up mistakes.

Mistakes are inevitable. Admitting mistakes and correcting them builds character, trust and respect. Denying and covering up mistakes destroys trust and respect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

Allow regular users and groups own device nodes

9 participants