There are several issues related to "who is using CPU / RAM / Disk space" on a sled, and to ensuring this usage is accounted for properly. This issue attempts to summarize them at a high level and track them in one place.
If we do not account for these resources, anything consuming them in an unbounded fashion can starve other consumers of the same resource on the sled, which could result in unexpected failures or performance degradation.
Definitions
Resources
Resources are finite quantities that exist on Sleds and are necessary for zones to operate. They are used by zones, but may be used by other non-zone entities as well.
These include:
CPUs
RAM (both reservoir and non-reservoir usage)
Disk Space on datasets (used by both durable datasets and transient zone filesystems). The focus of this issue is disk space usage within U.2s specifically. However, there are also many M.2 datasets, which can run out of space too. We've observed this before, with core files filling up the crash dataset.
Resource Consumers
Consumers are entities on Sleds that utilize resources, and draw from the "shared pool" of them.
These include:
The Host OS + Global Zone
All control-plane zones that are allocated from the blueprint (Nexus, Crucible, CockroachDB, DNS, NTP, etc)
All control-plane zones that are allocated from the sled (Switch Zone)
All control-plane zones that are allocated on-demand (Propolis, Probe Zones)
Why Use This Categorization
If you pick a consumer (e.g., "Switch Zone") and a resource (e.g., "RAM"), and there exists no upper bound on usage, then that consumer can negatively impact other occupants of the sled by preventing them from using resources that they expect to exist.
To definitively resolve this issue, we must define "buckets" from which consumers can access resources. One such example: For disks, the Debug dataset has a reservation and a quota. Although we must still account for space usage within this dataset, usage within it cannot cause problems in other datasets. Similarly, usage of space by other datasets cannot starve the Debug dataset of space.
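The isolation property of such a bucket can be illustrated with a toy model. This is a simplified sketch, not how ZFS actually computes free space: the `Dataset` class, the `writable_bytes` helper, and all sizes are hypothetical, and the model assumes only that a reservation holds space away from neighbors while a quota caps a dataset's own usage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Dataset:
    name: str
    used: int                    # bytes currently written
    reservation: int = 0         # space guaranteed to this dataset
    quota: Optional[int] = None  # ceiling on this dataset (None = unbounded)


def writable_bytes(pool_size: int, datasets: List[Dataset], name: str) -> int:
    """Bytes `name` can still write before hitting its quota or an empty pool."""
    target = next(d for d in datasets if d.name == name)
    # Neighbors hold whichever is larger: what they use or what they reserved.
    held_by_others = sum(
        max(d.used, d.reservation) for d in datasets if d.name != name
    )
    room = pool_size - held_by_others - target.used
    if target.quota is not None:
        room = min(room, target.quota - target.used)
    return max(0, room)


# A bucketed dataset (reservation == quota) next to an unbounded neighbor
# that has already filled the rest of a 1000-byte pool.
debug = Dataset("debug", used=10, reservation=100, quota=100)
noisy = Dataset("noisy", used=900)
print(writable_bytes(1000, [debug, noisy], "debug"))  # 90: the reservation survives
print(writable_bytes(1000, [debug, noisy], "noisy"))  # 0: the neighbor is out of space
```

Dropping the reservation from `debug` in this model lets the unbounded neighbor consume the space it was counting on, which is the failure mode this issue is tracking.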
Tools to limit resource usage
illumos gives us tools for providing upper bounds on the usage of resources by consumers:
CPU and RAM usage can be controlled by the capped-cpu and capped-memory properties of zonecfg
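As a rough illustration of what applying these caps might look like, the sketch below only builds `zonecfg` command lines and does not run them. The `zonecfg_cap_commands` helper, the zone name, and the cap values are hypothetical; the `ncpus` and `physical` settings are the ones documented for `capped-cpu` and `capped-memory`, but this is not necessarily how sled-agent would configure zones.

```python
from typing import List, Union


def zonecfg_cap_commands(
    zone: str, ncpus: Union[int, float], physical: str
) -> List[List[str]]:
    """Build (but do not execute) zonecfg invocations that cap CPU and RAM."""
    return [
        # capped-cpu: ncpus is the maximum number of CPUs' worth of time.
        ["zonecfg", "-z", zone, f"add capped-cpu; set ncpus={ncpus}; end"],
        # capped-memory: physical caps resident memory (e.g. "4g").
        ["zonecfg", "-z", zone, f"add capped-memory; set physical={physical}; end"],
    ]


# Hypothetical zone name and limits, for illustration only.
for argv in zonecfg_cap_commands("oxz_switch", ncpus=2, physical="4g"):
    print(" ".join(argv[:3]), repr(argv[3]))
```

Returning argument lists rather than a shell string keeps the subcommand text as a single argument, so it could be passed to `subprocess.run` without shell quoting concerns.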
Resource Limits by Consumer
Host OS + Global Zone
Disk Usage: The host OS/GZ generally use the rpool ramdisk, as well as the M.2s. They do not really have any unbounded space usage on U.2 disks.
CPU + RAM usage: Unbounded. It may be difficult to set an explicit bound here, and easier to "put a capacity on all other zones" instead. Whatever pool we allocate from, the remainder would then be dedicated to the host OS.
Crucible
Disk Usage from durable filesystem: Nexus tracks and updates a "size used" column for each Crucible dataset, and tries to keep this usage below the size of the entire zpool. However, this size is measured against the entire zpool (where other datasets may exist!), and nothing accounts for the metadata used by Crucible itself (in addition to the user-requested storage). Furthermore, there is no bound set on the durable dataset itself. However, Crucible does have an internal dataset called regions, where it does apply quotas and reservations internally.
CPU/RAM usage: Unbounded
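The gap in the Crucible disk accounting can be made concrete with a small comparison. This is a sketch under stated assumptions, not the actual Nexus allocation query: both function names, the `other_datasets_used` figure, and the 5% `metadata_overhead` are hypothetical values chosen for illustration.

```python
def fits_naive(zpool_size: int, size_used: int, request: int) -> bool:
    """Check usage only against the whole zpool, as described above."""
    return size_used + request <= zpool_size


def fits_with_accounting(
    zpool_size: int,
    size_used: int,
    request: int,
    other_datasets_used: int,
    metadata_overhead: float,
) -> bool:
    """Stricter check: subtract neighbors' usage and pad for Crucible metadata."""
    budget = zpool_size - other_datasets_used
    needed = (size_used + request) * (1 + metadata_overhead)
    return needed <= budget


# Hypothetical sizes in GiB: 700 already allocated, 200 more requested.
print(fits_naive(1000, 700, 200))  # True
print(fits_with_accounting(1000, 700, 200,
                           other_datasets_used=150,
                           metadata_overhead=0.05))  # False
```

The same request passes the zpool-wide check but fails once space held by other datasets and a metadata allowance are considered, which is the kind of over-commit an explicit bound on the durable dataset would prevent.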
Crucible Pantry
Disk Usage from transient filesystem: Unbounded
CPU/RAM usage: Unbounded
Internal + External DNS
Disk Usage from transient filesystem: Unbounded
Disk Usage from durable filesystem: Unbounded
CPU/RAM usage: Unbounded
Nexus
Disk Usage from transient filesystem: Unbounded
CPU/RAM usage: Unbounded
Oximeter
Disk Usage from transient filesystem: Unbounded
CPU/RAM usage: Unbounded
Sled-Agent Controlled Zones
Switch Zone
RAM Usage from transient filesystem: Unbounded. Take note! This is one of the few zones to occupy the ramdisk, so the transient filesystem is a consumer of RAM, not U.2 space.
CPU/RAM usage: Unbounded
Dynamically-Provisioned Zones
Propolis (Note that the Propolis zone must consume resources in addition to the bounds on the underlying instance)
Disk Usage from transient filesystem: Unbounded
CPU usage: We supply a value of vcpus for the instance, but impose no bound on the Propolis Zone.
Reservoir RAM usage: We supply a memory value, which is allocated within Nexus and used by Propolis to cooperatively use a portion of an instance-provisioned "memory reservoir". Reservoir capacity is shared cooperatively by the Propolis zones: the sled agent trusts each zone to consume only as much reservoir as it was given. A "greedy" Propolis could starve other instances, although this seems unlikely.
Non-Reservoir RAM usage: We do not account for this amount (the allocation treats it as zero, which is not true), and we do not bound it either.
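To make the Propolis memory gap concrete, the sketch below separates the reservoir memory that is accounted for from the per-zone overhead that is currently treated as zero. It is an illustration only: the `Instance` class, both helpers, and the 256 MiB `per_zone_overhead` are hypothetical and not measured values.

```python
from dataclasses import dataclass
from typing import List

GIB = 1 << 30
MIB = 1 << 20


@dataclass(frozen=True)
class Instance:
    name: str
    vcpus: int
    memory: int  # guest memory drawn from the reservoir, in bytes


def reservoir_remaining(reservoir_size: int, instances: List[Instance]) -> int:
    """Reservoir bytes left after every instance takes its requested memory."""
    return reservoir_size - sum(i.memory for i in instances)


def unaccounted_ram(instances: List[Instance], per_zone_overhead: int) -> int:
    """Non-reservoir RAM used by Propolis zones that the allocation ignores."""
    return len(instances) * per_zone_overhead


instances = [Instance("a", vcpus=2, memory=4 * GIB),
             Instance("b", vcpus=4, memory=8 * GIB)]
print(reservoir_remaining(16 * GIB, instances) // GIB)  # 4
print(unaccounted_ram(instances, 256 * MIB) // MIB)     # 512
```

In this model the reservoir arithmetic looks healthy, yet the zones collectively use RAM that no budget covers, and that amount grows with the number of instances on the sled.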
Issues
(Note that the "consumer" here does not necessarily imply blame - if a neighboring service has consumed excessive resources, a consumer may fail prematurely)
saga + saga_node_event tables are never garbage collected, but they should be #6635