@thaJeztah
Member

This syncs the seccomp-profile with the latest changes in containerd's profile, applying the same changes as containerd/containerd@17a9324

Some background from the associated ticket:

We want to use vsock for guest-host communication on KubeVirt
(https://github.com/kubevirt/kubevirt). In KubeVirt we run VMs in pods.

However since anyone can just connect from any pod to any VM with the
default seccomp settings, we cannot limit connection attempts to our
privileged node-agent.

Describe the solution you'd like

We want to deny the socket syscall for the AF_VSOCK family by default.

I see in [1] and [2] that AF_VSOCK was actually already blocked for some
time, but that got reverted since some architectures support the socketcall
syscall which can't be restricted properly. However we are mostly interested
in arm64 and amd64 where limiting socket would probably be enough.
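
For illustration, here is a minimal sketch of what such a rule could look like and how it would be evaluated. The rule shape (allow `socket` unless argument 0 equals `AF_VSOCK`, using an `SCMP_CMP_NE` comparison) is an assumption about how a profile entry might express this, not a copy of the actual profile change. `SOCKET_RULE` and `rule_allows` are hypothetical names, and the evaluator only models two comparison operators.

```python
import json

AF_VSOCK = 40  # value of AF_VSOCK in the Linux headers

# Hypothetical profile entry (assumption): allow socket(2) unless the first
# argument, the address family, equals AF_VSOCK.
SOCKET_RULE = {
    "names": ["socket"],
    "action": "SCMP_ACT_ALLOW",
    "args": [{"index": 0, "value": AF_VSOCK, "op": "SCMP_CMP_NE"}],
}


def rule_allows(rule, syscall, args):
    """Return True if the allow-rule matches the given syscall and arguments.

    Simplified model: all argument conditions must hold, and only the
    SCMP_CMP_NE and SCMP_CMP_EQ comparisons are handled.
    """
    if syscall not in rule["names"]:
        return False
    for cond in rule.get("args", []):
        actual = args[cond["index"]]
        if cond["op"] == "SCMP_CMP_NE" and actual == cond["value"]:
            return False
        if cond["op"] == "SCMP_CMP_EQ" and actual != cond["value"]:
            return False
    return True


if __name__ == "__main__":
    print(json.dumps(SOCKET_RULE, indent=2))
    # AF_INET (2) matches the rule; AF_VSOCK does not and would fall through
    # to the profile's default action.
    print(rule_allows(SOCKET_RULE, "socket", [2, 1, 0]))
    print(rule_allows(SOCKET_RULE, "socket", [AF_VSOCK, 1, 0]))
```

A call that matches no allow-rule falls through to the profile's default action. This approach only works where the address family is passed as a direct syscall argument, which is why `socketcall` cannot be restricted the same way.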

Additional context

I know that in theory we could use our own seccomp profiles, but we would want
to provide security for as many users as possible which use KubeVirt, and there
it would be very helpful if this protection could be added by being part of the
DefaultRuntime profile to easily ensure that it is active for all pods [3].

Impact on existing workloads: It is unlikely that this will disturb any existing
workload, because VSOCK is almost exclusively used for host-guest communication.
However if someone would still use it: Privileged pods would still be able to
use socket for AF_VSOCK, custom seccomp policies could be applied too.
Further it was already blocked for quite some time and the blockade got lifted
due to reasons not related to AF_VSOCK.
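
To check whether a given workload is affected, one option is to probe socket creation from inside the container. The sketch below is a rough illustration, not an official test: `probe_family` is a hypothetical helper, and it assumes that a seccomp denial surfaces as `EPERM` (raised by Python as `PermissionError`), which may differ for custom profiles.

```python
import errno
import socket

AF_VSOCK = 40  # value of AF_VSOCK in the Linux headers


def probe_family(family, factory=socket.socket):
    """Try to create a stream socket for `family` and classify the result.

    Returns "allowed", "blocked" (EPERM/EACCES, assumed to indicate a seccomp
    denial), or "unsupported" (the kernel lacks the address family).
    `factory` exists only so the probe can be exercised without a real socket.
    """
    try:
        sock = factory(family, socket.SOCK_STREAM)
    except PermissionError:
        return "blocked"
    except OSError as exc:
        if exc.errno == errno.EAFNOSUPPORT:
            return "unsupported"
        raise
    sock.close()
    return "allowed"


if __name__ == "__main__":
    print("AF_VSOCK:", probe_family(AF_VSOCK))
```

Running this in a container with the default profile and again in a privileged container (or one with a custom seccomp policy) should show whether the restriction applies to that workload.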

The PR in KubeVirt which adds VSOCK support for additional context: [4]

- Description for the changelog

- seccomp: AF_VSOCK is now blocked by default in the default profile

- A picture of a cute animal (not mandatory but encouraged)

This syncs the seccomp-profile with the latest changes in containerd's
profile, applying the same changes as containerd/containerd@17a9324

Some background from the associated ticket:

> We want to use vsock for guest-host communication on KubeVirt
> (https://github.com/kubevirt/kubevirt). In KubeVirt we run VMs in pods.
>
> However since anyone can just connect from any pod to any VM with the
> default seccomp settings, we cannot limit connection attempts to our
> privileged node-agent.
>
> ### Describe the solution you'd like
> We want to deny the `socket` syscall for the `AF_VSOCK` family by default.
>
> I see in [1] and [2] that AF_VSOCK was actually already blocked for some
> time, but that got reverted since some architectures support the `socketcall`
> syscall which can't be restricted properly. However we are mostly interested
> in `arm64` and `amd64` where limiting `socket` would probably be enough.
>
> ### Additional context
> I know that in theory we could use our own seccomp profiles, but we would want
> to provide security for as many users as possible which use KubeVirt, and there
> it would be very helpful if this protection could be added by being part of the
> DefaultRuntime profile to easily ensure that it is active for all pods [3].
>
> Impact on existing workloads: It is unlikely that this will disturb any existing
> workload, because VSOCK is almost exclusively used for host-guest communication.
> However if someone would still use it: Privileged pods would still be able to
> use `socket` for `AF_VSOCK`, custom seccomp policies could be applied too.
> Further it was already blocked for quite some time and the blockade got lifted
> due to reasons not related to AF_VSOCK.
>
> The PR in KubeVirt which adds VSOCK support for additional context: [4]
>
> [1]: moby#29076 (comment)
> [2]: moby@dcf2632
> [3]: https://kubernetes.io/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads
> [4]: kubevirt/kubevirt#8546

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah
Member Author

/cc @gabriellavengeo

@thaJeztah
Member Author

Let's get these in

@thaJeztah thaJeztah merged commit 40408d1 into moby:master Dec 1, 2022
@thaJeztah thaJeztah deleted the seccomp_block_af_vsock branch December 1, 2022 20:39
eugkoira added a commit to aws/aws-nitro-enclaves-sdk-c that referenced this pull request Mar 6, 2024
Since Docker 24.x, the `socket` syscall with a `vsock` argument is restricted through seccomp rules by default:
moby/moby#44562

It should be safe to lift those seccomp restrictions completely with a dedicated flag when launching the container on the parent instance. The container is not used for isolation here, but rather to provide a reproducible environment.
meerd pushed a commit to aws/aws-nitro-enclaves-sdk-c that referenced this pull request Mar 6, 2024