Skip to content

bpf: Rework ipcache to support LPM lookups.#3885

Merged
ianvernon merged 1 commit intocilium:masterfrom
joestringer:submit/cidr-l4-bpf
Apr 26, 2018
Merged

bpf: Rework ipcache to support LPM lookups.#3885
ianvernon merged 1 commit intocilium:masterfrom
joestringer:submit/cidr-l4-bpf

Conversation

@joestringer
Copy link
Copy Markdown
Member

@joestringer joestringer commented Apr 25, 2018

[Split out from #3835 for easier review]

This shuffles the ipcache keys around to be friendly to LPM-based lookup
in future. For now, we only generate /32 or /128 (IPv6) prefixes, so
map keys will look something like (IPv4 example):

{
    Prefixlen = 64
    Pad1      = 0
    Pad2      = 0
    Family    = 1
    IP        = {192, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
}

The "Prefixlen" of 64 indicates that it should always match on the 32
bits of {Pad1, Pad2, Family}, as well as an IPv4 host prefix of /32. The
length of the Prefixlen field itself is not included in the Prefixlen.

For now, we only generate /32s (or /128s) from userspace, so the
lxc_config.h generation always writes the IPCACHE[64] PREFIXES lists as
only the prefixes required to match host IPs.

We also don't support LPM for this map type yet, this can be ported over
in a similar way to how CIDR works currently. I have code in the works for
this, just need to test it out on newer kernels.

PR #1685 will modify the userspace generation further to allow
CIDR prefixes to be generated and populated into the IPCache.

LPM support will be forthcoming, it's a fairly small patch but I haven't had a chance to test it properly yet. I don't think that this feature needs to be gated on that.


This change is Reviewable

@joestringer joestringer added pending-review area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. labels Apr 25, 2018
@joestringer joestringer requested a review from a team April 25, 2018 06:35
@joestringer joestringer requested review from a team as code owners April 25, 2018 06:35
Copy link
Copy Markdown
Contributor

@raybejjani raybejjani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved for the one endpoint file

Copy link
Copy Markdown
Member

@ianvernon ianvernon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving for CLI. ipcache BPF changes look fine to me, but will defer to bpf codeowners for final decision on that.

Comment thread bpf/lib/eps.h Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiously , Do we need to worry about these IPCACHE_STATIC_PREFIX value varying due to structs ipcache_key and bpf_lpm_trie_key being packed or padded differently on different systems?
Or does BPF compiler takes care of the struct padding to be consistent?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting point, I think that if we want to be careful then we should probably add the packed attributes to the structure definitions for these. /cc @borkmann

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I packed the struct ipcache_key.

Copy link
Copy Markdown
Member

@borkmann borkmann Apr 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manalibhutiyani you mean different archs? We're using -target bpf, so it's 64bit everywhere no matter the underlying arch. See also the difference compared to invoking clang with native/default target (which we're not doing in Cilium). https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/bpf/bpf_devel_QA.txt#n519

@joestringer
Copy link
Copy Markdown
Member Author

I need to triage the failure and see whether it may be related.

This shuffles the ipcache keys around to be friendly to LPM-based lookup
in future. For now, we only generate /32 or /128 (IPv6) prefixes, so
map keys will look something like (IPv4 example):

{
    Prefixlen = 32
    Pad1      = 0
    Pad2      = 0
    Family    = 1
    IP        = {192, 0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
}

The "Prefixlen" of 64 indicates that it should always match on the 32
bits of {Pad1, Pad2, Family}, as well as an IPv4 host prefix of /32. The
length of the Prefixlen field itself is not included in the Prefixlen.

For now, we only generate /32s (or /128s) from userspace, so the
lxc_config.h generation always writes the IPCACHE[64] PREFIXES lists as
only the prefixes required to match host IPs.

We also don't support LPM for this map type yet, this can be ported over
in a similar way to how CIDR works currently.

Upcoming commits will modify the userspace generation further to allow
CIDR prefixes to be generated and populated into the IPCache.

Signed-off-by: Joe Stringer <joe@covalent.io>
@joestringer
Copy link
Copy Markdown
Member Author

test-me-please

@joestringer joestringer requested a review from borkmann April 26, 2018 07:08
Copy link
Copy Markdown
Member

@borkmann borkmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@joestringer joestringer added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. and removed pending-review labels Apr 26, 2018
@ianvernon ianvernon merged commit a9066fd into cilium:master Apr 26, 2018
@joestringer joestringer deleted the submit/cidr-l4-bpf branch April 26, 2018 23:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. ready-to-merge This PR has passed all tests and received consensus from code owners to merge.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants