Add initial support for rsvd accounting hugetlb cgroup 

The previous non-rsvd max/limit_in_bytes does not account for reserved
huge page memory, making it possible for a process to reserve all the
huge page memory without being able to allocate it (due to cgroup
restrictions).

In practice this makes it possible to successfully mmap more huge page
memory than allowed via the cgroup settings, but when the process touches
that memory it gets a SIGBUS and crashes. This is bad for applications
that mmap at startup (which succeeds) and then crash as soon as they
start to use the memory. For example, postgres does this by default.

This has led to strange segfaults like these: https://github.com/zalando/patroni/issues/1393

More info can be found here: https://lkml.org/lkml/2020/2/3/1153

In order to solve this, I think we have two main ways to do it:
- Add writes (when supported) to rsvd for the current `hugepageLimits` found [here](https://github.com/opencontainers/runtime-spec/blob/master/config-linux.md#huge-page-limits). Silently ignore when rsvd is not supported.
- Add another element called something like `hugepageLimitsRsvd` to enforce the rsvd value, and either silently fail or return an error when rsvd is not supported.


I lean toward the first approach, since adding a new item makes the spec harder to understand and may lead to "bad" implementations, but I am not sure at all. The pro of the second approach, with a separate entity, is that it is then up to the user of the runtime to decide, giving the "user" a full choice, even though I see no real reason to enforce the "old" value and not the reserved one. The current behavior makes a cgroup-limited process able to reserve all the huge page memory available on a node, making it inaccessible to others.

No matter the decision, we should then update the config-linux.md docs to clarify how it should work.

Any thoughts?

Simple WIP in runc to add support for enforcing it using the `hugepageLimits` is here: https://github.com/opencontainers/runc/pull/2360/files
