This repository was archived by the owner on May 12, 2021. It is now read-only.

kernel: Add support to trust the CPU's hwrng #482

Closed

mcastelino wants to merge 1 commit into kata-containers:master from mcastelino:topic/hwrng

Conversation

@mcastelino
Contributor

To accelerate Firecracker boot-up, enable support for hwrng:

https://lwn.net/Articles/760121/

Users will still be able to override this via the Kata configuration.toml if desired: random.trust_cpu=on|off

Fixes: #481

Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>

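As a sketch, the override mentioned above would go through the guest kernel parameters in Kata's configuration.toml (the kernel_params knob is standard; the hypervisor section name here is illustrative and depends on which hypervisor is configured):

```toml
# configuration.toml fragment (section name illustrative)
[hypervisor.fc]
# Append to the guest kernel command line to disable trusting the CPU hwrng
kernel_params = "random.trust_cpu=off"
```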
@mcastelino mcastelino requested review from egernst and grahamwhaley May 2, 2019 19:26
@devimc

devimc commented May 2, 2019

cc @grahamwhaley

@jcvenegas
Member

/test

Contributor

@grahamwhaley grahamwhaley left a comment


lgtm
I'll add this to the fragment PR as well.
@Pennyzct @alicefr @tuan-hoang1 @nitkon - just FYI, for consideration on other arches

@grahamwhaley
Contributor

Interestingly, looks like the entropy test failed for 18.04 CI - which feels like it could be related @mcastelino @GabyCT

cd integration/entropy && \
bats entropy_test.bats
1..1
not ok 1 check entropy level
# (in test file entropy_test.bats, line 59)
#   `[ "$output" -ge ${ENTROPY_LEVEL} ]' failed
# 
# 958
# 
Makefile:140: recipe for target 'entropy' failed
make: *** [entropy] Error 1
Build step 'Execute shell' marked build as failure
Performing Post build task...
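For context, a minimal sketch of the kind of check the bats test above performs (assumption: the real entropy_test.bats reads the kernel's entropy_avail interface; the threshold of 1000 is inferred from the numbers discussed in this thread):

```shell
#!/bin/sh
# Sketch only: the real entropy_test.bats may differ in detail.
ENTROPY_LEVEL=1000

# The kernel exposes the current entropy estimate here.
avail=$(cat /proc/sys/kernel/random/entropy_avail)
echo "available entropy: $avail"

if [ "$avail" -ge "$ENTROPY_LEVEL" ]; then
    echo "entropy check: ok"
else
    echo "entropy check: failed"
fi
```

Note that on recent kernels the unified entropy pool makes entropy_avail report a fixed value, so an absolute threshold like this is fragile.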


@jodh-intel jodh-intel left a comment


Hi @mcastelino - the patch itself looks fine, but this feature is also available for ppc64le and s390x so shouldn't we also update those configs?

/cc @nitkon, @alicefr.

@nitkon
Contributor

nitkon commented May 3, 2019

@jodh-intel: I can see # CONFIG_RANDOM_TRUST_CPU is not set in the powerpc kernel config 4.19.x. However, I don't think firecracker is supported on Power. Should it still be enabled?

@jodh-intel

I'd forgotten that ;)

@mcastelino
Contributor Author

Interestingly, looks like the entropy test failed for 18.04 CI - which feels like it could be related @mcastelino @GabyCT

@GabyCT @grahamwhaley this is a nice CI test. Also, I need to test this at scale, to see whether launching a large number of Kata containers is actually impacted, as this will get entropy from hardware. It may actually end up causing a slowdown at scale.

@grahamwhaley do we test entropy as part of scale testing? I mean, launch say 100 containers in parallel to detect any entropy-related lockup delaying container launch.

/cc @egernst

@grahamwhaley
Contributor

@mcastelino We don't explicitly do any at-scale entropy tests that I am aware of, no. Notes:

  • Yes, in the past I wondered if doing large-scale deployments would drain the entropy pool and impact container boots. I would expect to see that in our metrics boot report/tests - the average boot time should go up, so the metrics CI should fail, and you would see the boot time graph slope in the metrics report. I had Gaby check this when the entropy test was written, but we didn't see the effect I expected to maybe see.
  • We don't have any at-scale tests in the CI afaik (yet?).
  • But we were discussing if we could set up at least per-release-candidate at-scale tests (the key thing being that at-scale tests may require substantial hardware resources and time to run, so per-CI-run may not be practical, but per-release-candidate might).

@grahamwhaley
Contributor

I've had a look at the test, the kernel code for CONFIG_RANDOM_TRUST_CPU, and a peek around haveged, which we use to maintain the entropy pool on our CI machines - I can't see an obvious way that CONFIG_RANDOM_TRUST_CPU would affect the available entropy, and thus make the test fail. However, this whole area is quite tangled :-)

I have one idea that may be a 'fix', and I'll see if I can re-create and test that locally. I can't help feeling our test is a bit 'fast' at checking the entropy inside the VM, and maybe it just needs to be given a few seconds for haveged to get up and running and do its job...
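That "give it a few seconds" idea could look something like the following (a hypothetical sketch, not the actual fix; the real test would likely wait longer than the short bound used here):

```shell
#!/bin/sh
# Hypothetical fix sketch: give haveged a few seconds to fill the
# pool instead of checking entropy immediately after boot.
ENTROPY_LEVEL=1000
retries=5

avail=$(cat /proc/sys/kernel/random/entropy_avail)
while [ "$avail" -lt "$ENTROPY_LEVEL" ] && [ "$retries" -gt 0 ]; do
    sleep 1
    avail=$(cat /proc/sys/kernel/random/entropy_avail)
    retries=$((retries - 1))
done
echo "entropy after wait: $avail"
```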

Any extra thoughts welcome here :-)

@grahamwhaley
Contributor

@mcastelino I did a bit more testing, and thinking...
I have a feeling this test, and why it is failing, might all be down to timing... and entropy gathering in the kernel boot...
Originally the test was added when we added the virtio-rng driver: kata-containers/runtime#445
I think there was a wrong assumption made there though: that the addition of virtio-rng would add more entropy to the container kernel. I'm not sure that is correct per se. That is, I don't think the VM/container inherits the entropy from the host - it inherits the ability to access the random numbers, which then gather up into entropy over time. That time, I think, is the key. With the addition of the RANDOM_TRUST config in the kernel, I believe we now boot faster, so by the time we get to the workload, we have less entropy available.

I also have a feeling the fact that in the container we were getting ~1000 units of entropy, and on the host haveged tries to maintain something ~1000 units of entropy - are just coincidence.

I'm not convinced checking for available entropy immediately after boot inside the container is a useful test; the entropy may not have gathered by then. We may have to write a smarter test that does something like draining all the entropy and then measuring how long it takes to come back. That might be a more realistic test of 'how good is our entropy pool source in the container'?
wdyt?
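A rough sketch of that drain-and-measure idea (sizes are illustrative, not from any real test; on kernels where /dev/random can block, the wall time of the second read approximates the refill rate - on kernels 5.6+ /dev/random no longer blocks once the CRNG is initialized, so it measures much less there):

```shell
#!/bin/sh
# Hypothetical sketch: drain the pool, then time a fresh read.
# timeout guards keep this from hanging on older blocking kernels.
timeout 30 dd if=/dev/random of=/dev/null bs=512 count=64 2>/dev/null  # drain

start=$(date +%s)
timeout 30 dd if=/dev/random of=/dev/null bs=4 count=1000 2>/dev/null  # timed refill read
end=$(date +%s)
echo "refill read took $((end - start))s"
```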

I measured the entropy on my host and what the test was seeing on my local system, both with and without this RANDOM_TRUST setting.
Without the setting I get just over 1000 units of entropy; with it, just under. If I add a sleep 30 in the container before doing the check, I get nearly 1000; with a sleep 60, over 1000. But, as per above, I suspect that expecting 1000 units of entropy to be immediately available in the container on boot is a red herring 🐟

@mcastelino
Contributor Author

That time, I think, is the key. With the addition of the RANDOM_TRUST config in the kernel, I believe we now boot faster, so by the time we get to the workload, we have less entropy available.

@grahamwhaley so the faster boot has the side effect of reducing available entropy. I agree we need a test that drains the pool and checks how long it takes for entropy to build back up to an acceptable level on an otherwise idle container.

@ganeshmaharaj also noticed drops in entropy, even with QEMU, with some of the kernel config changes.

@grahamwhaley
Contributor

I think @ganeshmaharaj noticed it over on my fragments PR, where I added your RANDOM_TRUST fragment in my last update ;-) But, yes, I think we need to re-assess how we measure our entropy source.
I had a quick look around, and I suspect we don't need to measure how 'good' the entropy source is (that in fact seems quite hard to do from userland), but rather how long it takes to 'fill up'. A simple time dd if=/dev/[u]random of=/dev/null bs=4b count=1000000 or similar might be good enough, to see how long it takes to fulfil the blocking reads from the kernel CRNG/PRNG generator that relies on the entropy pool - you think?

GabyCT added a commit to GabyCT/tests-1 that referenced this pull request May 8, 2019
We need to wait a couple of seconds in order to get the
entropy level kata-containers/packaging#482.

Fixes kata-containers#1546

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
@grahamwhaley
Contributor

@mcastelino @GabyCT @jcvenegas - I thought I'd have a look-see at how fast random numbers can be supplied by either /dev/random or /dev/urandom, inside containers or on the bare-metal host, so we get a view of the scene...

command:

$ time dd if=/dev/[u]random of=/dev/null bs=4b count=100000

Note: host has haveged running replenishing the random/entropy pool.

Note: kata uses the virtio-rng device, fed from the host urandom.

what    random   urandom
metal   3.6s     18.25s
runc    3.6s     18.3s
kata    6.3s     1.19s

Notes then:

  • runc performs pretty much the same as the host, which is pretty much what I'd expect.
  • I don't understand how/why urandom is so much faster inside Kata than on the host?
  • As we feed the Kata virtio-rng from the host /dev/urandom, afaict, accessing /dev/random (which can block...) inside a Kata container does not deplete the host-side entropy (as it is not draining from the host /dev/random), and thus may be faster than the host/runc (with all the caveats that do or do not come with sourcing randomness from /dev/urandom).
  • If we've not clearly documented that Kata hands out /dev/[u]random data partly sourced from the host /dev/urandom, then we probably should.
  • I guess the /dev/random inside Kata is slower as it has to hop out across the virtio-rng to source data, and I think somebody noted that virtio-rng may also be rate limited. It's thus maybe no surprise it is slower, but it is not terrible :-)

@GabyCT - I suspect we can thus write our 'entropy' (or rather, probably better named something like 'random source') test by doing a time test on getting random numbers from the (potentially blocking) /dev/random inside the container - and probably presume (we'll find out) that if we cannot source 100000 (100k) 4byte elements in <20s, then we have failed. We might want to see if there is a nice way to put some timeout on the read, and we might want to test on a kata that either has no virtio-rng or sources virtio-rng from the host /dev/random, and check what happens in that case....
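That proposed check could be sketched as follows (the PASS/FAIL convention and script structure are mine; the thresholds of 100000 4-byte reads in under 20s are the ones suggested above):

```shell
#!/bin/sh
# Sketch of the proposed 'random source' test: fail if we cannot
# read 100000 4-byte elements from /dev/random within 20 seconds.
# timeout(1) provides the read deadline in case /dev/random blocks.
if timeout 20 dd if=/dev/random of=/dev/null bs=4 count=100000 2>/dev/null; then
    echo "random source test: PASS"
else
    echo "random source test: FAIL"
fi
```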

@jodh-intel

Branch is conflicted (version file).

@egernst
Member

egernst commented Jun 11, 2019

Eric-stale-bot: This hasn't had much activity. AFAICT, this should be marked do-not-merge until we have appropriate test results in place for running on a single node with a scaled number of Kata pods, in order to measure the impact.

I don't think @mcastelino will have cycles to do this -- I'll also add the help-wanted label.

@egernst egernst added needs-help Request for extra help (technical, resource, etc) do-not-merge PR has problems or depends on another needs-rebase PR contains conflicts which need resolving labels Jun 11, 2019
@jodh-intel

Ping @kata-containers/packaging - anyone have cycles to look at this?

@grahamwhaley
Contributor

From reading the history and browsing the code, I think the correct way (for x86 at least) is to add the RANDOM_TRUST_CPU option to the kernel fragments. We may want to decide whether to default hwrng trust to on or off on the kernel command line, though - that config option was added to the kernel for a reason...

@fidencio
Member

fidencio commented May 3, 2021

As this PR is marked do-not-merge, we're assuming it won't make it into the final 1.13.0 release, so we're closing it.
If you feel this has been wrongly closed, please feel free to re-open it.

@fidencio fidencio closed this May 3, 2021


8 participants