bpf: Rework ipcache to support LPM lookups.#3885
Conversation
raybejjani
left a comment
There was a problem hiding this comment.
approved for the one endpoint file
ianvernon
left a comment
There was a problem hiding this comment.
Approving for CLI. ipcache BPF changes look fine to me, but will defer to bpf codeowners for final decision on that.
There was a problem hiding this comment.
Out of curiously , Do we need to worry about these IPCACHE_STATIC_PREFIX value varying due to structs ipcache_key and bpf_lpm_trie_key being packed or padded differently on different systems?
Or does BPF compiler takes care of the struct padding to be consistent?
There was a problem hiding this comment.
Interesting point, I think that if we want to be careful then we should probably add the packed attributes to the structure definitions for these. /cc @borkmann
There was a problem hiding this comment.
I packed the struct ipcache_key.
There was a problem hiding this comment.
@manalibhutiyani you mean different archs? We're using -target bpf, so it's 64bit everywhere no matter the underlying arch. See also the difference compared to invoking clang with native/default target (which we're not doing in Cilium). https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/bpf/bpf_devel_QA.txt#n519
|
I need to triage the failure and see whether it may be related. |
This shuffles the ipcache keys around to be friendly to LPM-based lookup
in future. For now, we only generate /32 or /128 (IPv6) prefixes, so
map keys will look something like (IPv4 example):
{
Prefixlen = 32
Pad1 = 0
Pad2 = 0
Family = 1
IP = {192, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
}
The "Prefixlen" of 64 indicates that it should always match on the 32
bits of {Pad1, Pad2, Family}, as well as an IPv4 host prefix of /32. The
length of the Prefixlen field itself is not included in the Prefixlen.
For now, we only generate /32s (or /128s) from userspace, so the
lxc_config.h generation always writes the IPCACHE[64] PREFIXES lists as
only the prefixes required to match host IPs.
We also don't support LPM for this map type yet, this can be ported over
in a similar way to how CIDR works currently.
Upcoming commits will modify the userspace generation further to allow
CIDR prefixes to be generated and populated into the IPCache.
Signed-off-by: Joe Stringer <joe@covalent.io>
dac346e to
cfeae99
Compare
|
test-me-please |
[Split out from #3835 for easier review]
This shuffles the ipcache keys around to be friendly to LPM-based lookup
in future. For now, we only generate /32 or /128 (IPv6) prefixes, so
map keys will look something like (IPv4 example):
The "Prefixlen" of 64 indicates that it should always match on the 32
bits of {Pad1, Pad2, Family}, as well as an IPv4 host prefix of /32. The
length of the Prefixlen field itself is not included in the Prefixlen.
For now, we only generate /32s (or /128s) from userspace, so the
lxc_config.h generation always writes the IPCACHE[64] PREFIXES lists as
only the prefixes required to match host IPs.
We also don't support LPM for this map type yet, this can be ported over
in a similar way to how CIDR works currently. I have code in the works for
this, just need to test it out on newer kernels.
PR #1685 will modify the userspace generation further to allow
CIDR prefixes to be generated and populated into the IPCache.
LPM support will be forthcoming, it's a fairly small patch but I haven't had a chance to test it properly yet. I don't think that this feature needs to be gated on that.
This change is