Skip to content

credentials logic to pass privileged data to services#16568

Merged
keszybz merged 11 commits intosystemd:masterfrom
poettering:creds-store
Aug 26, 2020
Merged

credentials logic to pass privileged data to services#16568
keszybz merged 11 commits intosystemd:masterfrom
poettering:creds-store

Conversation

@poettering
Copy link
Copy Markdown
Member

An initial implementation of what I distilled from the discussions in #15778 #16060.

Adds LoadCredential=/SetCredential= to services and --load-credential=/--set-credential= to nspawn, so that we can pass credential data from host service manager to nspawn (running as service) down to the service manager inside the container and down to a service inside it.

tries hard to place the data in an immutable, non-swappable file system only accessble to the unit's UID (and root). This means it tries to use "ramfs" (which opposed to tmpfs is not swappable), but has fallbacks to use tmpfs as fallback and even no fs at all (and ACLs instead).

Services can pick up the creds via the file system: $CREDENTIALS_DIRECTORY contains the path where the data is placed, with one file for each credential, names are picked by the user.

I opted against using the kernel keyring (since it is incompatible with containers and not widely supported, and requires patching daemons to pick things up) or memfds (since this requires patching of servicesto pick things up), and decided to just use the plain old filesystem since it should just work for all cases. That said, we might want to add support for memfds/keyring later on, either for picking up stuff from there or for placing stuff there, or maybe both.

This is not ready to merge yet, comes without any test cases. But let's get the discussion started.

@boucman
Copy link
Copy Markdown
Contributor

boucman commented Jul 23, 2020

I think this should be documented in the CONTAINER_INTERFACE too...

@poettering
Copy link
Copy Markdown
Member Author

I think this should be documented in the CONTAINER_INTERFACE too...

There's nothing contianer specific in this btw, the API is the same when passing stuff from container mgr to payload and from service manager to service: set the env var to the dir with the creds. But I guess you are right, the fact that this is honoured should be mentioned in the container interface doc

@arianvp
Copy link
Copy Markdown
Contributor

arianvp commented Jul 28, 2020

tries hard to place the data in an immutable, non-swappable file system only accessble to the unit's UID (and root). This means it tries to use "ramfs" (which opposed to tmpfs is not swappable), but has fallbacks to use tmpfs as fallback and even no fs at all (and ACLs instead).

Would it be possible to make it configurable whether this fallback happens or instead complain loudly? I can imagine .e.g a portable systemd unit or package wanting to describe the fact that it should only really really work on nodes that have ramfs enabled. Perhaps some ConditionX= ?

@poettering
Copy link
Copy Markdown
Member Author

Would it be possible to make it configurable whether this fallback happens or instead complain loudly? I can imagine .e.g a portable systemd unit or package wanting to describe the fact that it should only really really work on nodes that have ramfs enabled. Perhaps some ConditionX= ?

Dunno. Maybe we can warn once or so, but I figured that many servers don't have swap anyway, and many laptops have encrypted swap, hence actually using tmpfs over ramfs isn't too bad. In fact, it might even be preferable to use tmpfs over ramfs if there's no swap or only encrypted swap, since it has size limits and working "df" and so on.

@poettering
Copy link
Copy Markdown
Member Author

I pushed a new version. Comes with test case and everything else tht was missing. Should be good fore review/merge now. Dropping the "dont-merge" label hence.

@poettering poettering changed the title RFC: credentials logic to pass privileged data to services credentials logic to pass privileged data to services Aug 14, 2020
@aanderse
Copy link
Copy Markdown

Does this resolve symlinks while pulling in credential files?

@poettering
Copy link
Copy Markdown
Member Author

Does this resolve symlinks while pulling in credential files?

Yes. Why?

@aanderse
Copy link
Copy Markdown

Awesome 👍 While scanning through the docs (quickly) I didn't notice if this resolved symlinks or not. certbot was the example that came first to mind.

Thanks for your great work @poettering 🎉

Ideally we would like to hide all other service's credentials for all
services. That would imply for us to enable mount namespacing for all
services, which is something we cannot do, both due to compatibility
with the status quo ante, and because a number of services legitimately
should be able to install mounts in the host hierarchy.

Hence we do the second best thing, we hide the credentials automatically
for all services that opt into mount namespacing otherwise. This is
quite different from other mount sandboxing options: usually you have to
explicitly opt into each. However, given that the credentials logic is a
brand new concept we invented right here and now, and particularly
security sensitive it's OK to reverse this, and by default hide
credentials whenever we can (i.e. whenever mount namespacing is
otherwise opt-ed in to).

Long story short: if you want to hide other service's credentials, the
most basic options is to just turn on PrivateMounts= and there you go,
they should all be gone.
Let's allow passing in creds to containers, so that PID 1 inside the
container can pick them up.
@poettering
Copy link
Copy Markdown
Member Author

Force pushed a new version. Only changes are the suggested ones, but see above. Taking liberity to upgrade green label since this was pretty much all doc stuff.

@poettering poettering added good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed and removed good-to-merge/with-minor-suggestions labels Aug 25, 2020
@keszybz
Copy link
Copy Markdown
Member

keszybz commented Aug 26, 2020

bionic-i386 timed out, and bionic-ppc64el failed in TEST-36-NUMAPOLICY. Looks unrelated.


<orderedlist>
<listitem><para>When <command>systemd-nspawn</command> runs as systemd system service it can make
use and propagate credentials it received via
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can systemd use credentials for anything by itself? I think this could be changed to "it can propagate" without loss of meaning.

@keszybz keszybz added ci-failure-appears-unrelated and removed good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed labels Aug 26, 2020
@keszybz keszybz merged commit b6abc2a into systemd:master Aug 26, 2020
@arianvp
Copy link
Copy Markdown
Contributor

arianvp commented Aug 29, 2020

Awesome! This probably warrants an entry to NEWS right?

@keszybz
Copy link
Copy Markdown
Member

keszybz commented Aug 30, 2020

NEWS will be updated before the release.

@flokli
Copy link
Copy Markdown
Contributor

flokli commented Oct 25, 2020

This has been added to NEWS during 3220cf3.

@Mic92
Copy link
Copy Markdown
Contributor

Mic92 commented Nov 1, 2020

I want to implement the socket protocol into https://github.com/Mic92/sops-nix. Do I need one unix socket per secret than?

@poettering
Copy link
Copy Markdown
Member Author

Yes, in the currently merged version of the logic you to have one bound and listening AF_UNIX socket per credential if you want to serve multiple different creds, since the connection coming in is minimalistic, comes with zero metadata.

That said, I just posted #17510 which beefs this logic up a bit. Once merged, the AF_UNIX connection will come in from an AF_UNIX abstract namespace socket name that includes info on the unit name and the credential ID requested. Thus, with that you can have some credential service that listens on a single socket, and for each incoming connection you can do getpeername() and it tells you which unit this is for, and which credential is requested.

I kinda like this appraoch, since it keeps things absolutely minimalist, but still transfers a bit of metadata along that is truly useful to have.

(And of course you can use SO_PEERCRED to get some more meta info of the client requesting things, for example the PID it returns is going to be the main process of the service, which is already pretty useful).

@Mic92
Copy link
Copy Markdown
Contributor

Mic92 commented Nov 3, 2020

@poettering Thanks! This make my implementation a lot easier.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

7 participants