Add WireGuard host2host and LB encryption#19401

Merged
ldelossa merged 13 commits into master from pr/gandro+brb/wg-host-encryption-v2
Jan 24, 2023

@brb
Member

@brb brb commented Apr 11, 2022

This PR adds support for node-to-node encryption to WireGuard. To achieve this, we've completely changed the WireGuard integration in the datapath. Previously, WireGuard support was implemented by marking packets to be encrypted in "from-container" and redirecting them to the WireGuard tunnel via a hostns IP rule. This worked fine for traffic originating in pods, but for node-to-node traffic we need to redirect the packets on the outgoing network interface. Thus, the new implementation attaches bpf_host to the outgoing device and redirects packets to the WireGuard tunnel from there. See commit descriptions for more details.

On the agent side, there are also changes to the implementation. Previously, the datapath assumed that any IPCache entry with an associated tunnel endpoint would need encryption. To determine whether we need to encrypt traffic to a remote endpoint, we now rely on the encrypt_key field instead. This allows us to track more precisely whether traffic to a particular destination needs to be encrypted, and allows certain nodes to opt out of encryption (see below). The agent code has been updated to populate the CiliumEndpoint and CiliumNode CRDs with a static non-zero EncryptKey value if encryption for those resources is enabled.
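
The difference between the old and the new decision can be illustrated with a minimal sketch. This is a simplified Python model, not the actual datapath code: the IPCacheEntry class and its field names are illustrative assumptions, and only the rule "non-zero encrypt_key means encrypt" is taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IPCacheEntry:
    """Simplified stand-in for an IPCache entry (names are illustrative)."""
    tunnel_endpoint: Optional[str]  # IP of the node hosting the destination, if remote
    encrypt_key: int                # 0 means "do not encrypt"


def needs_encryption_old(entry: IPCacheEntry) -> bool:
    # Old assumption: any entry with a tunnel endpoint gets encrypted.
    return entry.tunnel_endpoint is not None


def needs_encryption_new(entry: IPCacheEntry) -> bool:
    # New rule: only the encrypt_key field decides.
    return entry.encrypt_key != 0


# A remote destination on a node that opted out of encryption.
opted_out = IPCacheEntry(tunnel_endpoint="192.168.1.2", encrypt_key=0)
# A remote destination with encryption enabled (static non-zero key).
enabled = IPCacheEntry(tunnel_endpoint="192.168.1.3", encrypt_key=1)
```

With the old rule both entries would be encrypted; with the new rule only the second one is, which is what makes the per-node opt-out possible.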

Additional points worth noting:

ℹ️ Please see commit messages for many more details

Joint work between @gandro and @brb

Follow-ups, to be done in separate PRs:

  • (follow-up) Ability to store private key persistently (as an alternative to opting out of node-to-node encryption for control-plane nodes)

@brb brb added area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/major This PR introduces major new functionality to Cilium. area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. release-blocker/1.12 labels Apr 11, 2022
@brb
Member Author

brb commented Apr 11, 2022

/test-1.23-net-next

@brb
Member Author

brb commented Apr 12, 2022

test-1.23-net-next

@brb brb force-pushed the pr/gandro+brb/wg-host-encryption-v2 branch from f7e67e9 to 6224144 Compare April 14, 2022 09:03
@brb brb changed the title Add WireGuard host2host encryption Add WireGuard host2host and LB encryption Apr 14, 2022
@brb brb mentioned this pull request Apr 14, 2022
@gandro gandro force-pushed the pr/gandro+brb/wg-host-encryption-v2 branch from 6224144 to 1ee8cd2 Compare May 3, 2022 14:27
@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jun 3, 2022
@brb brb removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jun 3, 2022
@github-actions

github-actions bot commented Jul 7, 2022

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jul 7, 2022
@github-actions github-actions bot removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Jul 16, 2022
@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Aug 15, 2022
@brb brb added pinned These issues are not marked stale by our issue bot. and removed stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. labels Aug 15, 2022
@brb brb added this to the 1.13 milestone Aug 15, 2022
@brb brb force-pushed the pr/gandro+brb/wg-host-encryption-v2 branch 2 times, most recently from 37d5c58 to f321011 Compare September 8, 2022 07:58
@brb
Member Author

brb commented Sep 8, 2022

/test

gandro and others added 12 commits January 23, 2023 11:18
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
This commit completely changes the WireGuard integration in the
datapath to enable the host2host encryption (also, pod2host and
host2pod).

Previously, we supported only the pod2pod case. This was implemented by
marking a to-be-encrypted packet, and then letting the IP rule installed
in the host netns forward the packet to the WireGuard tunnel device
"cilium_wg0" for encryption, as shown below:

┌─────────────────┐
│   Pod A netns   │
│   ┌────────┐    │
│   │  eth0  │    │
└───┴────┬───┴────┘
    ┌────┴──────────┐
    │ bpf_lxc@veth0 │               (host netns)
    └────┬──────────┘
         │1."from-container" in bpf_lxc sets MARK_MAGIC_ENCRYPT
         │2. ip rule matches the mark and routes packet to WG netdev
         │       ┌───────────┐
         └──────►│cilium_wg0 │
                 └────┬──────┘
                      │
                  ┌───▼───┐
                  │ eth0  │
                  └───────┘

This worked fine for the pod2pod case (with one danger: a sysadmin
could remove the rule, causing packets to bypass the WG dev).

However, with this approach it was not possible to enable the host2host
case, as a packet originating from the host netns was never handled by
bpf_lxc. Thus, we needed to change the datapath.

To encrypt a host2host packet we need to attach bpf_host to the outgoing
device connecting cluster nodes, which in the picture is "eth0". Then the
program "to-netdev" from bpf_host can forward the packet to the WG dev.
Once encrypted, the packet hits the same bpf_host program again. To
avoid the packet looping forever, we configure the WG netdev to set the
skb mark after the encryption. Then, in the program we skip the
redirection to the WG netdev if the mark is set.

The flow below shows the new integration.

┌─────────────────┐
│   Pod A netns   │
│   ┌────────┐    │
│   │  eth0  │    │
└───┴────┬───┴────┘
    ┌────┴──────────┐
    │ bpf_lxc@veth0 │    (host netns)
    └────┬──────────┘
         │
    ┌────▼───────────┐  1. "to-netdev" does redirect    ┌───────────┐
    │ bpf_host@eth0  │─────────────────────────────────►│cilium_wg0 │
    └─┬─────────▲────┘                                  └──────┬────┘
      │         │                                              │
      │         │  2. encrypt and set MARK_MAGIC_ENCRYPTED     │
      │         └──────────────────────────────────────────────┘
      │
      │ 3. output the encrypted packet
      │
      ▼

The same flow is used for the host2host, host2pod and pod2host cases.
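
The loop-avoidance logic in the flow above can be modelled roughly as follows. This is a Python simulation under stated assumptions, not the BPF code: the Packet class, the verdict strings, and the numeric value of MARK_MAGIC_ENCRYPTED are placeholders chosen for illustration.

```python
from dataclasses import dataclass

MARK_MAGIC_ENCRYPTED = 0x1  # placeholder value, not the real constant


@dataclass
class Packet:
    dst_needs_encryption: bool
    mark: int = 0
    encrypted: bool = False


def wg_xmit(pkt: Packet) -> Packet:
    """Model of cilium_wg0: encrypt, then set the skb mark (step 2)."""
    pkt.encrypted = True
    pkt.mark = MARK_MAGIC_ENCRYPTED
    return pkt


def to_netdev(pkt: Packet) -> str:
    """Model of "to-netdev" on eth0: returns "redirect_wg" or "pass"."""
    if pkt.mark == MARK_MAGIC_ENCRYPTED:
        return "pass"          # already encrypted: skip the redirect (step 3)
    if pkt.dst_needs_encryption:
        return "redirect_wg"   # step 1
    return "pass"


def send(pkt: Packet, max_hops: int = 4) -> list:
    """Run the packet through to-netdev until it leaves the node."""
    trace = []
    for _ in range(max_hops):
        verdict = to_netdev(pkt)
        trace.append(verdict)
        if verdict == "pass":
            return trace
        wg_xmit(pkt)
    raise RuntimeError("packet is looping")
```

In this model, a packet that needs encryption passes through "to-netdev" exactly twice: once to be redirected and once, marked, to be sent out. Without the mark check the second pass would redirect it again and the loop would never end.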

Another advantage of this change is that WireGuard can be used together
with the L7 proxy (the mutual exclusion check is going to be removed in
a subsequent commit).

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
This commit changes the agent code to support the new WireGuard
integration described in the previous commit.

The most important changes:

1. Configure the WG netdev to add the skb mark.
2. Add NodeIP to allowed-ips when --encrypt-node=true.
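
The second point can be sketched as below. This is a hypothetical Python helper, not the agent's Go code: the function name is invented, and the assumption that the baseline allowed-ips consist of the peer's pod CIDRs is made only for illustration. The part taken from the commit message is that the NodeIP is added when --encrypt-node=true.

```python
import ipaddress


def allowed_ips(pod_cidrs, node_ip, encrypt_node):
    """Return the allowed-ips list for one WireGuard peer (illustrative)."""
    ips = list(pod_cidrs)  # assumed baseline: the peer's pod CIDRs
    if encrypt_node:
        # Add the node IP as a host route (/32 for IPv4, /128 for IPv6).
        addr = ipaddress.ip_address(node_ip)
        ips.append(f"{addr}/{addr.max_prefixlen}")
    return ips
```

Without the NodeIP in allowed-ips, WireGuard would not accept or route traffic for the peer's node address through the tunnel, so host2host packets could not be encrypted.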

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
We need to detect a direct routing dev (= one which is used to connect
K8s Nodes) in order to attach bpf_host when WG is enabled, as bpf_host
is responsible for redirecting packets to the WG netdev for encryption.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
There is no longer an skb mark conflict with the L7 proxy, so we can
drop the check. This means that the L7 proxy can work together with the
WG integration.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Before this change to the WG integration, when running in tunneling
mode, pod2pod traffic to a remote node bypassed bpf_overlay's tunneling
and was encapsulated only once, by the WG tunnel netdev.

To be compatible with this < v1.13 behavior, this commit adds the
redirect to the WG tunnel to the __encap_and_redirect_with_nodeid()
function which is eventually called in the pod2pod packet path.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
This commit attaches the bpf_host's "from-netdev" section to the
Cilium's WireGuard tunnel netdev ("cilium_wg0").

This is needed to enable the encryption of the KPR traffic. In
particular, we encrypt the N/S KPR requests which will be forwarded to
a remote node running a selected service endpoint.

IMPORTANT: this encrypts KPR traffic only when running in the
non-tunneling mode.

For the request path no changes are required. The existing datapath
configuration already handles it, as shown in the following:

1. The "from-netdev" attached to eth0 is invoked for the NodePort
   request.
2. A remote service endpoint is selected, the DNAT and SNAT translations
   are performed.
3. The translated request is redirected to eth0.
4. The "to-netdev" section on eth0 is invoked. It detects that the
   packet needs to be encrypted, so it redirects it to cilium_wg0.

For the reply path, only minimal changes were required. After the WG netdev
has decrypted the reply packet, the packet is returned to the networking
stack. Because the networking stack is not aware of the connection, the
reply packet is dropped. To avoid that, we attach the "from-netdev"
section to the WG netdev, so that the following can be performed:

1. Reverse SNAT and DNAT translations are applied to the reply.
2. The reply packet is redirected to the outgoing interface.
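
Why the reply must be handled by "from-netdev" rather than by the networking stack can be illustrated with a toy NAT model. This is a Python sketch, not the datapath code: the Flow and NodePortNat names, the port allocation, and the addresses are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    src: tuple  # (ip, port)
    dst: tuple  # (ip, port)


class NodePortNat:
    """Toy DNAT+SNAT state kept by the node that received the request."""

    def __init__(self, node_ip):
        self.node_ip = node_ip
        self.table = {}         # expected reply flow -> original request flow
        self.next_port = 40000  # arbitrary SNAT port range for the sketch

    def translate_request(self, flow, backend):
        # DNAT: frontend -> backend; SNAT: client -> this node.
        snat_src = (self.node_ip, self.next_port)
        self.next_port += 1
        self.table[Flow(src=backend, dst=snat_src)] = flow
        return Flow(src=snat_src, dst=backend)

    def translate_reply(self, reply):
        # Reverse SNAT and DNAT; unknown replies cannot be translated.
        orig = self.table.get(reply)
        if orig is None:
            return None
        return Flow(src=orig.dst, dst=orig.src)
```

Only the component holding this translation state can map the decrypted reply back to the client. In the sketch, a reply for an unknown connection yields None, which mirrors the drop that happens when the decrypted packet is handed to a stack that has no record of the connection.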

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Currently, the bpf_overlay prog doesn't redirect a packet to the WG
netdev for encryption (will be addressed in a follow-up PR). So, in
order for the tests to pass, we need to enable the host2host encryption.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Martynas Pumputis <m@lambda.lt>
This commit introduces a new command-line option to specify a label
selector to make nodes opt-out of node-to-node encryption. The default
label selector set will match kubeadm control-plane nodes (i.e. the
nodes hosting kube-apiserver). This ensures that all Cilium-managed
nodes will be able to reach the kube-apiserver running on that node
regardless of encryption status. This is important, because we want to
ensure that nodes can change their public keys when they re-join the
cluster.

Nodes that opt out of node-to-node encryption will still perform
encryption for pod-to-pod traffic.
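
The opt-out behaviour can be sketched as follows. This is a simplified Python model under stated assumptions: the label key used as the default is assumed to be the kubeadm control-plane label, the function names are hypothetical, and the selector is reduced to a key-existence check, whereas real Kubernetes label selectors are more expressive.

```python
# Assumed default: the label kubeadm puts on control-plane nodes.
OPT_OUT_LABEL = "node-role.kubernetes.io/control-plane"


def opts_out(node_labels, selector_keys=(OPT_OUT_LABEL,)):
    """True if the node carries any of the selector's label keys."""
    return any(key in node_labels for key in selector_keys)


def encryption_plan(node_labels, encrypt_node=True):
    """Which traffic classes are encrypted for a given node (illustrative)."""
    node_to_node = encrypt_node and not opts_out(node_labels)
    # Pod-to-pod encryption is unaffected by the opt-out.
    return {"pod_to_pod": True, "node_to_node": node_to_node}
```

For example, a worker node with no matching label gets both traffic classes encrypted, while a node labelled as control-plane keeps pod-to-pod encryption only.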

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
This adds a new section about node-to-node encryption and removes some
obsolete limitations.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
It was hidden because it's currently not supported by IPSec. But with
the previous commits, we do now support node-to-node encryption via
WireGuard.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
The encryption tests were introduced in
cilium/cilium-cli#1308.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
@gandro gandro force-pushed the pr/gandro+brb/wg-host-encryption-v2 branch from 4a5f0b6 to 0e6e703 Compare January 23, 2023 10:28
@gandro
Member

gandro commented Jan 23, 2023

Rebased on master to resolve merge conflict. Re-running CI

@gandro
Member

gandro commented Jan 23, 2023

/test

@brb
Member Author

brb commented Jan 24, 2023

Got reviews from majority of folks. Marking as ready-to-merge.

Labels

area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. pinned These issues are not marked stale by our issue bot. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/major This PR introduces major new functionality to Cilium.
