# <a name="linuxContainerConfiguration" />Linux Container Configuration

This document describes the schema for the [Linux-specific section](config.md#platform-specific-configuration) of the [container configuration](config.md).
The Linux container specification uses various kernel features like namespaces, cgroups, capabilities, LSM, and filesystem jails to fulfill the spec.

## <a name="configLinuxDefaultFilesystems" />Default Filesystems

The Linux ABI includes both syscalls and several special file paths.
Applications expecting a Linux environment will very likely expect these file paths to be set up correctly.

The following filesystems SHOULD be made available in each container's filesystem:

| Path     | Type   |
| -------- | ------ |
| /proc    | [procfs][] |
| /sys     | [sysfs][]  |
| /dev/pts | [devpts][] |
| /dev/shm | [tmpfs][]  |

## <a name="configLinuxNamespaces" />Namespaces

A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource.
Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes.
For more information, see the [namespaces(7)][namespaces.7_2] man page.

Namespaces are specified as an array of entries inside the `namespaces` root field.
The following parameters can be specified to set up namespaces:

* **`type`** *(string, REQUIRED)* - namespace type. The following namespace types are supported:
    * **`pid`** processes inside the container will only be able to see other processes inside the same container.
    * **`network`** the container will have its own network stack.
    * **`mount`** the container will have an isolated mount table.
    * **`ipc`** processes inside the container will only be able to communicate to other processes inside the same container via system level IPC.
    * **`uts`** the container will be able to have its own hostname and domain name.
    * **`user`** the container will be able to remap user and group IDs from the host to local users and groups within the container.
    * **`cgroup`** the container will have an isolated view of the cgroup hierarchy.

* **`path`** *(string, OPTIONAL)* - an absolute path to a namespace file in the [runtime mount namespace](glossary.md#runtime-namespace).
    The runtime MUST place the container process in the namespace associated with that `path`.
    The runtime MUST [generate an error](runtime.md#errors) if `path` is not associated with a namespace of type `type`.

    If `path` is not specified, the runtime MUST create a new [container namespace](glossary.md#container-namespace) of type `type`.

If a namespace type is not specified in the `namespaces` array, the container MUST inherit the [runtime namespace](glossary.md#runtime-namespace) of that type.
If a `namespaces` field contains duplicated namespaces with the same `type`, the runtime MUST [generate an error](runtime.md#errors).

### Example

```json
    "namespaces": [
        {
            "type": "pid",
            "path": "/proc/1234/ns/pid"
        },
        {
            "type": "network",
            "path": "/var/run/netns/neta"
        },
        {
            "type": "mount"
        },
        {
            "type": "ipc"
        },
        {
            "type": "uts"
        },
        {
            "type": "user"
        },
        {
            "type": "cgroup"
        }
    ]
```
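
As an illustration only, the structural rules above (supported `type` values, absolute `path`, no duplicated `type`) could be checked before a runtime acts on the array. The following Python sketch is not part of the specification; the function name `check_namespaces` and the wording of the messages are hypothetical, and it does not verify that `path` actually refers to a namespace of the given type.

```python
import posixpath

# Namespace types listed in this section.
NAMESPACE_TYPES = {"pid", "network", "mount", "ipc", "uts", "user", "cgroup"}


def check_namespaces(namespaces):
    """Return a list of problems found in a `namespaces` array (hypothetical helper)."""
    problems = []
    seen = set()
    for entry in namespaces:
        ns_type = entry.get("type")
        if ns_type not in NAMESPACE_TYPES:
            problems.append(f"unsupported type: {ns_type!r}")
        elif ns_type in seen:
            # Duplicated types are an error according to the text above.
            problems.append(f"duplicate type: {ns_type!r}")
        seen.add(ns_type)
        path = entry.get("path")
        if path is not None and not posixpath.isabs(path):
            problems.append(f"path is not absolute: {path!r}")
    return problems
```

An empty result means the array passed these checks; a runtime would still need to open each `path` and confirm the namespace type.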

## <a name="configLinuxUserNamespaceMappings" />User namespace mappings

**`uidMappings`** (array of objects, OPTIONAL) describes the user namespace uid mappings from the host to the container.
**`gidMappings`** (array of objects, OPTIONAL) describes the user namespace gid mappings from the host to the container.

Each entry has the following structure:

* **`hostID`** *(uint32, REQUIRED)* - is the starting uid/gid on the host to be mapped to *containerID*.
* **`containerID`** *(uint32, REQUIRED)* - is the starting uid/gid in the container.
* **`size`** *(uint32, REQUIRED)* - is the number of ids to be mapped.

The runtime SHOULD NOT modify the ownership of referenced filesystems to realize the mapping.
Note that the number of mapping entries MAY be limited by the [kernel][user-namespaces].

### Example

```json
    "uidMappings": [
        {
            "hostID": 1000,
            "containerID": 0,
            "size": 32000
        }
    ],
    "gidMappings": [
        {
            "hostID": 1000,
            "containerID": 0,
            "size": 32000
        }
    ]
```
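
To make the arithmetic of a mapping entry concrete, the sketch below translates a container id into the corresponding host id. It is an illustration under the assumption that each entry maps the contiguous range `containerID .. containerID + size - 1` onto `hostID .. hostID + size - 1`; the helper name `to_host_id` is hypothetical.

```python
def to_host_id(container_id, mappings):
    """Translate a container uid/gid to the host id, or None if it is unmapped."""
    for mapping in mappings:
        offset = container_id - mapping["containerID"]
        if 0 <= offset < mapping["size"]:
            return mapping["hostID"] + offset
    return None


# The mapping from the example above.
uid_mappings = [{"hostID": 1000, "containerID": 0, "size": 32000}]
```

With the example mapping, container uid 0 corresponds to host uid 1000, and container uid 32000 falls outside the mapped range.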

## <a name="configLinuxDevices" />Devices

**`devices`** (array of objects, OPTIONAL) lists devices that MUST be available in the container.
The runtime MAY supply them however it likes (with [`mknod`][mknod.2], by bind mounting from the runtime mount namespace, using symlinks, etc.).

Each entry has the following structure:

* **`type`** *(string, REQUIRED)* - type of device: `c`, `b`, `u` or `p`.
    More info in [mknod(1)][mknod.1].
* **`path`** *(string, REQUIRED)* - full path to device inside container.
    If a [file][] already exists at `path` that does not match the requested device, the runtime MUST generate an error.
* **`major, minor`** *(int64, REQUIRED unless `type` is `p`)* - [major, minor numbers][devices] for the device.
* **`fileMode`** *(uint32, OPTIONAL)* - file mode for the device.
    You can also control access to devices [with cgroups](#device-whitelist).
* **`uid`** *(uint32, OPTIONAL)* - id of device owner.
* **`gid`** *(uint32, OPTIONAL)* - id of device group.

The same `type`, `major` and `minor` SHOULD NOT be used for multiple devices.

### Example

```json
    "devices": [
        {
            "path": "/dev/fuse",
            "type": "c",
            "major": 10,
            "minor": 229,
            "fileMode": 438,
            "uid": 0,
            "gid": 0
        },
        {
            "path": "/dev/sda",
            "type": "b",
            "major": 8,
            "minor": 0,
            "fileMode": 432,
            "uid": 0,
            "gid": 0
        }
    ]
```
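
JSON has no octal literals, so `fileMode` appears in decimal: 438 is `0o666` and 432 is `0o660`. The sketch below, which is only an illustration, renders an entry's mode in the familiar `ls -l` style using Python's `stat` module. The helper name `describe_mode` is hypothetical, and treating `u` like `c` follows mknod(1), where both create character devices.

```python
import stat

# File-type bits for each device `type` value.
TYPE_BITS = {
    "c": stat.S_IFCHR,
    "u": stat.S_IFCHR,  # unbuffered character device, same file type as "c"
    "b": stat.S_IFBLK,
    "p": stat.S_IFIFO,
}


def describe_mode(device):
    """Return (octal permission string, ls-style mode string) for a device entry."""
    permissions = device.get("fileMode", 0)
    mode = TYPE_BITS[device["type"]] | permissions
    return oct(permissions), stat.filemode(mode)
```

For the `/dev/fuse` entry above this gives `0o666` and `crw-rw-rw-`.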

### <a name="configLinuxDefaultDevices" />Default Devices

In addition to any devices configured with this setting, the runtime MUST also supply:

* [`/dev/null`][null.4]
* [`/dev/zero`][zero.4]
* [`/dev/full`][full.4]
* [`/dev/random`][random.4]
* [`/dev/urandom`][random.4]
* [`/dev/tty`][tty.4]
* [`/dev/console`][console.4] is set up if terminal is enabled in the config by bind mounting the pseudoterminal slave to /dev/console.
* [`/dev/ptmx`][pts.4].
    A [bind-mount or symlink of the container's `/dev/pts/ptmx`][devpts].

## <a name="configLinuxControlGroups" />Control groups

Also known as cgroups, they are used to restrict resource usage for a container and handle device access.
cgroups provide controls (through controllers) to restrict cpu, memory, IO, pids and network for the container.
For more information, see the [kernel cgroups documentation][cgroup-v1].

### <a name="configLinuxCgroupsPath" />Cgroups Path

**`cgroupsPath`** (string, OPTIONAL) path to the cgroups.
It can be used to either control the cgroups hierarchy for containers or to run a new process in an existing container.

The value of `cgroupsPath` MUST be either an absolute path or a relative path.
* In the case of an absolute path (starting with `/`), the runtime MUST take the path to be relative to the cgroups mount point.
* In the case of a relative path (not starting with `/`), the runtime MAY interpret the path relative to a runtime-determined location in the cgroups hierarchy.

If the value is specified, the runtime MUST consistently attach to the same place in the cgroups hierarchy given the same value of `cgroupsPath`.
If the value is not specified, the runtime MAY define the default cgroups path.
Runtimes MAY consider certain `cgroupsPath` values to be invalid, and MUST generate an error if this is the case.

Implementations of the Spec can choose to name cgroups in any manner.
The Spec does not include naming schema for cgroups.
The Spec does not support per-controller paths for the reasons discussed in the [cgroupv2 documentation][cgroup-v2].
The cgroups will be created if they don't exist.

You can configure a container's cgroups via the `resources` field of the Linux configuration.
Do not specify `resources` unless limits have to be updated.
For example, to run a new process in an existing container without updating limits, `resources` need not be specified.

Runtimes MAY attach the container process to additional cgroup controllers beyond those necessary to fulfill the `resources` settings.

### Example

```json
    "cgroupsPath": "/myRuntime/myContainer",
    "resources": {
        "memory": {
            "limit": 100000,
            "reservation": 200000
        },
        "devices": [
            {
                "allow": false,
                "access": "rwm"
            }
        ]
    }
```
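
One possible reading of the absolute and relative `cgroupsPath` rules is sketched below. It is an assumption-laden illustration rather than required behavior: the mount point and the `runtime_base` location are hypothetical inputs chosen by the runtime, and the specification only says a runtime MAY interpret relative paths this way.

```python
import posixpath


def resolve_cgroups_path(cgroups_path, mount_point, runtime_base):
    """Resolve `cgroupsPath` to a directory below a cgroups mount point.

    `mount_point` and `runtime_base` are hypothetical, runtime-chosen values.
    """
    if posixpath.isabs(cgroups_path):
        # Absolute values are taken relative to the cgroups mount point.
        return posixpath.join(mount_point, cgroups_path.lstrip("/"))
    # Relative values are placed under a runtime-determined location.
    return posixpath.join(mount_point, runtime_base.strip("/"), cgroups_path)
```

Because the result depends only on its inputs, the same `cgroupsPath` always resolves to the same place, which is the consistency the text above requires.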

### <a name="configLinuxDeviceWhitelist" />Device whitelist

**`devices`** (array of objects, OPTIONAL) configures the [device whitelist][cgroup-v1-devices].
The runtime MUST apply entries in the listed order.

Each entry has the following structure:

* **`allow`** *(boolean, REQUIRED)* - whether the entry is allowed or denied.
* **`type`** *(string, OPTIONAL)* - type of device: `a` (all), `c` (char), or `b` (block).
    Unset values mean "all", mapping to `a`.
* **`major, minor`** *(int64, OPTIONAL)* - [major, minor numbers][devices] for the device.
    Unset values mean "all", mapping to [`*` in the filesystem API][cgroup-v1-devices].
* **`access`** *(string, OPTIONAL)* - cgroup permissions for device.
    A composition of `r` (read), `w` (write), and `m` (mknod).

#### Example

```json
    "devices": [
        {
            "allow": false,
            "access": "rwm"
        },
        {
            "allow": true,
            "type": "c",
            "major": 10,
            "minor": 229,
            "access": "rw"
        },
        {
            "allow": true,
            "type": "b",
            "major": 8,
            "minor": 0,
            "access": "r"
        }
    ]
```
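
The cgroup v1 devices controller accepts rules of the form `type major:minor access` written to `devices.allow` or `devices.deny`. The sketch below shows how an entry could be turned into such a rule, applying the "unset means all" defaults described above. The helper name is hypothetical, and the default of `rwm` for an unset `access` is an assumption, since this section does not state one.

```python
def to_cgroup_v1_rule(entry):
    """Return (control file name, rule string) for one whitelist entry."""
    dev_type = entry.get("type", "a")   # unset type maps to "a" (all)
    major = entry.get("major", "*")     # unset major maps to "*"
    minor = entry.get("minor", "*")     # unset minor maps to "*"
    access = entry.get("access", "rwm")  # assumption: unset access means all permissions
    control_file = "devices.allow" if entry["allow"] else "devices.deny"
    return control_file, f"{dev_type} {major}:{minor} {access}"
```

Applied in the listed order, the example above would first deny everything (`a *:* rwm`) and then allow `c 10:229 rw` and `b 8:0 r`.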

### <a name="configLinuxMemory" />Memory

**`memory`** (object, OPTIONAL) represents the cgroup subsystem `memory` and it's used to set limits on the container's memory usage.
For more information, see the kernel cgroups documentation about [memory][cgroup-v1-memory].

Values for memory specify the limit in bytes, or `-1` for unlimited memory.

* **`limit`** *(int64, OPTIONAL)* - sets limit of memory usage
* **`reservation`** *(int64, OPTIONAL)* - sets soft limit of memory usage
* **`swap`** *(int64, OPTIONAL)* - sets limit of memory+swap usage
* **`kernel`** *(int64, OPTIONAL)* - sets hard limit for kernel memory
* **`kernelTCP`** *(int64, OPTIONAL)* - sets hard limit for kernel TCP buffer memory

The following properties do not specify memory limits, but are covered by the `memory` controller:

* **`swappiness`** *(uint64, OPTIONAL)* - sets swappiness parameter of vmscan (See sysctl's vm.swappiness)
    The values are from 0 to 100. Higher values make the kernel swap more aggressively.
* **`disableOOMKiller`** *(bool, OPTIONAL)* - enables or disables the OOM killer.
    If enabled (`false`), tasks that attempt to consume more memory than they are allowed are immediately killed by the OOM killer.
    The OOM killer is enabled by default in every cgroup using the `memory` subsystem.
    To disable it, specify a value of `true`.

#### Example

```json
    "memory": {
        "limit": 536870912,
        "reservation": 536870912,
        "swap": 536870912,
        "kernel": -1,
        "kernelTCP": -1,
        "swappiness": 0,
        "disableOOMKiller": false
    }
```

### <a name="configLinuxCPU" />CPU

**`cpu`** (object, OPTIONAL) represents the cgroup subsystems `cpu` and `cpusets`.
For more information, see the kernel cgroups documentation about [cpusets][cgroup-v1-cpusets].

The following parameters can be specified to set up the controller:

* **`shares`** *(uint64, OPTIONAL)* - specifies a relative share of CPU time available to the tasks in a cgroup
* **`quota`** *(int64, OPTIONAL)* - specifies the total amount of time in microseconds for which all tasks in a cgroup can run during one period (as defined by **`period`** below)
* **`period`** *(uint64, OPTIONAL)* - specifies a period of time in microseconds for how regularly a cgroup's access to CPU resources should be reallocated (CFS scheduler only)
* **`realtimeRuntime`** *(int64, OPTIONAL)* - specifies a period of time in microseconds for the longest continuous period in which the tasks in a cgroup have access to CPU resources
* **`realtimePeriod`** *(uint64, OPTIONAL)* - same as **`period`** but applies to realtime scheduler only
* **`cpus`** *(string, OPTIONAL)* - list of CPUs the container will run in
* **`mems`** *(string, OPTIONAL)* - list of Memory Nodes the container will run in

#### Example

```json
    "cpu": {
        "shares": 1024,
        "quota": 1000000,
        "period": 500000,
        "realtimeRuntime": 950000,
        "realtimePeriod": 1000000,
        "cpus": "2-3",
        "mems": "0-7"
    }
```
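
Two of these values are easier to read after a little arithmetic: `quota` divided by `period` gives the number of CPUs' worth of time the cgroup may use per period, and `cpus`/`mems` use the cpuset list format of comma-separated ids and ranges. The helpers below are a hypothetical sketch for inspecting a configuration, not something a runtime is required to do.

```python
def parse_cpu_list(spec):
    """Expand a cpuset-style list such as "0,2-3" into [0, 2, 3]."""
    ids = []
    for part in spec.split(","):
        first, _, last = part.partition("-")
        ids.extend(range(int(first), int(last or first) + 1))
    return ids


def cfs_cpu_budget(cpu):
    """Return quota/period in CPUs, or None when either value is missing or zero."""
    quota, period = cpu.get("quota"), cpu.get("period")
    if quota is None or not period:
        return None
    return quota / period
```

For the example above, the quota of 1000000 µs per 500000 µs period corresponds to two CPUs' worth of time, and `"2-3"` expands to CPUs 2 and 3.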

### <a name="configLinuxBlockIO" />Block IO

**`blockIO`** (object, OPTIONAL) represents the cgroup subsystem `blkio` which implements the block IO controller.
For more information, see the kernel cgroups documentation about [blkio][cgroup-v1-blkio].

The following parameters can be specified to set up the controller:

* **`weight`** *(uint16, OPTIONAL)* - specifies per-cgroup weight. This is the default weight of the group on all devices unless overridden by per-device rules.
* **`leafWeight`** *(uint16, OPTIONAL)* - equivalent of `weight` for the purpose of deciding how much weight tasks in the given cgroup have while competing with the cgroup's child cgroups.
* **`weightDevice`** *(array of objects, OPTIONAL)* - specifies the list of devices which will be bandwidth rate limited. The following parameters can be specified per-device:
    * **`major, minor`** *(int64, REQUIRED)* - major, minor numbers for device. More info in [mknod(1)][mknod.1] man page.
    * **`weight`** *(uint16, OPTIONAL)* - bandwidth rate for the device.
    * **`leafWeight`** *(uint16, OPTIONAL)* - bandwidth rate for the device while competing with the cgroup's child cgroups, CFQ scheduler only

    You MUST specify at least one of `weight` or `leafWeight` in a given entry, and MAY specify both.

* **`throttleReadBpsDevice`**, **`throttleWriteBpsDevice`**, **`throttleReadIOPSDevice`**, **`throttleWriteIOPSDevice`** *(array of objects, OPTIONAL)* - specify the list of devices which will be IO rate limited.
    The following parameters can be specified per-device:
    * **`major, minor`** *(int64, REQUIRED)* - major, minor numbers for device. More info in [mknod(1)][mknod.1] man page.
    * **`rate`** *(uint64, REQUIRED)* - IO rate limit for the device

#### Example

```json
    "blockIO": {
        "weight": 10,
        "leafWeight": 10,
        "weightDevice": [
            {
                "major": 8,
                "minor": 0,
                "weight": 500,
                "leafWeight": 300
            },
            {
                "major": 8,
                "minor": 16,
                "weight": 500
            }
        ],
        "throttleReadBpsDevice": [
            {
                "major": 8,
                "minor": 0,
                "rate": 600
            }
        ],
        "throttleWriteIOPSDevice": [
            {
                "major": 8,
                "minor": 16,
                "rate": 300
            }
        ]
    }
```

### <a name="configLinuxHugePageLimits" />Huge page limits

**`hugepageLimits`** (array of objects, OPTIONAL) represents the `hugetlb` controller, which limits HugeTLB usage per control group and enforces the controller limit during page fault.
For more information, see the kernel cgroups documentation about [HugeTLB][cgroup-v1-hugetlb].

Each entry has the following structure:

* **`pageSize`** *(string, REQUIRED)* - hugepage size
* **`limit`** *(uint64, REQUIRED)* - limit in bytes of *hugepagesize* HugeTLB usage

#### Example

```json
    "hugepageLimits": [
        {
            "pageSize": "2MB",
            "limit": 209715200
        }
    ]
```
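
Since `limit` is expressed in bytes, it can help to convert it into a number of huge pages. The sketch below assumes that `pageSize` is a decimal number followed by `KB`, `MB` or `GB` and that these units are powers of 1024; both are assumptions based on the example value `2MB`, and the helper name is hypothetical.

```python
import re

# Assumed unit multipliers (binary units).
UNITS = {"KB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30}


def page_count(entry):
    """Return how many whole huge pages fit within an entry's byte limit."""
    match = re.fullmatch(r"(\d+)(KB|MB|GB)", entry["pageSize"])
    if match is None:
        raise ValueError(f"unrecognized pageSize: {entry['pageSize']!r}")
    page_bytes = int(match.group(1)) * UNITS[match.group(2)]
    return entry["limit"] // page_bytes
```

Under these assumptions, the example limit of 209715200 bytes corresponds to 100 pages of 2MB each.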

### <a name="configLinuxNetwork" />Network

**`network`** (object, OPTIONAL) represents the cgroup subsystems `net_cls` and `net_prio`.
For more information, see the kernel cgroups documentation about [net\_cls cgroup][cgroup-v1-net-cls] and [net\_prio cgroup][cgroup-v1-net-prio].

The following parameters can be specified to set up the controller:

* **`classID`** *(uint32, OPTIONAL)* - is the network class identifier the cgroup's network packets will be tagged with
* **`priorities`** *(array of objects, OPTIONAL)* - specifies a list of objects of the priorities assigned to traffic originating from processes in the group and egressing the system on various interfaces.
    The following parameters can be specified per-priority:
    * **`name`** *(string, REQUIRED)* - interface name in [runtime network namespace](glossary.md#runtime-namespace)
    * **`priority`** *(uint32, REQUIRED)* - priority applied to the interface

#### Example

```json
    "network": {
        "classID": 1048577,
        "priorities": [
            {
                "name": "eth0",
                "priority": 500
            },
            {
                "name": "eth1",
                "priority": 1000
            }
        ]
    }
```
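
The `net_cls` documentation describes the class identifier as `0xAAAABBBB`, where `AAAA` is the major handle number and `BBBB` the minor one, and `net_prio` takes `<interface> <priority>` pairs. The following sketch, with hypothetical helper names, decodes the example values in those terms.

```python
def split_class_id(class_id):
    """Split a net_cls class id into (major, minor) traffic-control handle numbers."""
    return class_id >> 16, class_id & 0xFFFF


def ifpriomap_lines(priorities):
    """Render `priorities` entries as "<interface> <priority>" strings."""
    return [f"{p['name']} {p['priority']}" for p in priorities]


major, minor = split_class_id(1048577)
handle = f"{major:x}:{minor:x}"
```

The example `classID` of 1048577 is `0x100001`, so `handle` is the `10:1` traffic-control handle.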

### <a name="configLinuxPIDS" />PIDs

**`pids`** (object, OPTIONAL) represents the cgroup subsystem `pids`.
For more information, see the kernel cgroups documentation about [pids][cgroup-v1-pids].

The following parameters can be specified to set up the controller:

* **`limit`** *(int64, REQUIRED)* - specifies the maximum number of tasks in the cgroup

#### Example

```json
    "pids": {
        "limit": 32771
    }
```

## <a name="configLinuxIntelRdt" />IntelRdt

**`intelRdt`** (object, OPTIONAL) represents the [Intel Resource Director Technology][intel-rdt-cat-kernel-interface].
If `intelRdt` is set, the runtime MUST write the container process ID to the `<container-id>/tasks` file in a mounted `resctrl` pseudo-filesystem, using the container ID from [`start`](runtime.md#start) and creating the `<container-id>` directory if necessary.
If no mounted `resctrl` pseudo-filesystem is available in the [runtime mount namespace](glossary.md#runtime-namespace), the runtime MUST [generate an error](runtime.md#errors).

If `intelRdt` is not set, the runtime MUST NOT manipulate any `resctrl` pseudo-filesystems.

The following parameters can be specified for the container:

* **`l3CacheSchema`** *(string, OPTIONAL)* - specifies the schema for L3 cache id and capacity bitmask (CBM).
    If `l3CacheSchema` is set, runtimes MUST write the value to the `schemata` file in the `<container-id>` directory discussed in `intelRdt`.

    If `l3CacheSchema` is not set, runtimes MUST NOT write to `schemata` files in any `resctrl` pseudo-filesystems.

### Example

Consider a two-socket machine with two L3 caches where the default CBM is 0xfffff and the max CBM length is 20 bits.
Tasks inside the container only have access to the "upper" 80% of L3 cache id 0 and the "lower" 50% of L3 cache id 1:

```json
"linux": {
    "intelRdt": {
        "l3CacheSchema": "L3:0=ffff0;1=3ff"
    }
}
```
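
The percentages in this example follow from counting the bits set in each capacity bitmask: `ffff0` has 16 of 20 bits set (80%) and `3ff` has 10 of 20 (50%). The sketch below parses a schema string of the form shown above and computes that share. The function names are hypothetical, and the 20-bit CBM length is taken from the example machine rather than being a general constant.

```python
def parse_l3_schema(schema):
    """Parse "L3:<id>=<hex mask>;<id>=<hex mask>" into {cache id: mask}."""
    prefix, _, body = schema.partition(":")
    if prefix != "L3":
        raise ValueError(f"not an L3 schema: {schema!r}")
    masks = {}
    for item in body.split(";"):
        cache_id, _, mask = item.partition("=")
        masks[int(cache_id)] = int(mask, 16)
    return masks


def cache_share(mask, cbm_len=20):
    """Return the fraction of the cache covered by `mask` for a given CBM length."""
    return bin(mask).count("1") / cbm_len
```

For `"L3:0=ffff0;1=3ff"` this yields shares of 0.8 for cache id 0 and 0.5 for cache id 1.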

## <a name="configLinuxSysctl" />Sysctl

**`sysctl`** (object, OPTIONAL) allows kernel parameters to be modified at runtime for the container.
For more information, see the [sysctl(8)][sysctl.8] man page.

### Example

```json
    "sysctl": {
        "net.ipv4.ip_forward": "1",
        "net.core.somaxconn": "256"
    }
```
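
Kernel parameters set through sysctl are exposed as files under `/proc/sys`, with each dot-separated component of the key becoming a path component. A minimal sketch of that mapping follows; the helper name is hypothetical, and it assumes no individual component itself contains a dot (interface names such as `eth0.100` would need special handling).

```python
import posixpath


def sysctl_path(key):
    """Map a dotted sysctl key to its file under /proc/sys."""
    return posixpath.join("/proc/sys", *key.split("."))
```

For example, `net.ipv4.ip_forward` maps to `/proc/sys/net/ipv4/ip_forward`.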

## <a name="configLinuxSeccomp" />Seccomp

Seccomp provides an application sandboxing mechanism in the Linux kernel.
Seccomp configuration allows one to configure actions to take for matched syscalls and also allows matching on values passed as arguments to syscalls.
For more information about Seccomp, see the [Seccomp][seccomp] kernel documentation.
The actions, architectures, and operators are strings that match the definitions in seccomp.h from [libseccomp][] and are translated to corresponding values.

**`seccomp`** (object, OPTIONAL)

The following parameters can be specified to set up seccomp:

* **`defaultAction`** *(string, REQUIRED)* - the default action for seccomp. Allowed values are the same as `syscalls[].action`.

* **`architectures`** *(array of strings, OPTIONAL)* - the architecture used for system calls.
    A valid list of constants as of libseccomp v2.3.2 is shown below.

    * `SCMP_ARCH_X86`
    * `SCMP_ARCH_X86_64`
    * `SCMP_ARCH_X32`
    * `SCMP_ARCH_ARM`
    * `SCMP_ARCH_AARCH64`
    * `SCMP_ARCH_MIPS`
    * `SCMP_ARCH_MIPS64`
    * `SCMP_ARCH_MIPS64N32`
    * `SCMP_ARCH_MIPSEL`
    * `SCMP_ARCH_MIPSEL64`
    * `SCMP_ARCH_MIPSEL64N32`
    * `SCMP_ARCH_PPC`
    * `SCMP_ARCH_PPC64`
    * `SCMP_ARCH_PPC64LE`
    * `SCMP_ARCH_S390`
    * `SCMP_ARCH_S390X`
    * `SCMP_ARCH_PARISC`
    * `SCMP_ARCH_PARISC64`

* **`syscalls`** *(array of objects, OPTIONAL)* - match a syscall in seccomp.

    While this property is OPTIONAL, some values of `defaultAction` are not useful without `syscalls` entries.
    For example, if `defaultAction` is `SCMP_ACT_KILL` and `syscalls` is empty or unset, the kernel will kill the container process on its first syscall.

    Each entry has the following structure:

    * **`names`** *(array of strings, REQUIRED)* - the names of the syscalls.
        `names` MUST contain at least one entry.
    * **`action`** *(string, REQUIRED)* - the action for seccomp rules.
        A valid list of constants as of libseccomp v2.3.2 is shown below.

        * `SCMP_ACT_KILL`
        * `SCMP_ACT_TRAP`
        * `SCMP_ACT_ERRNO`
        * `SCMP_ACT_TRACE`
        * `SCMP_ACT_ALLOW`

    * **`args`** *(array of objects, OPTIONAL)* - the specific syscall in seccomp.

        Each entry has the following structure:

        * **`index`** *(uint, REQUIRED)* - the index for syscall arguments in seccomp.
        * **`value`** *(uint64, REQUIRED)* - the value for syscall arguments in seccomp.
        * **`valueTwo`** *(uint64, OPTIONAL)* - the value for syscall arguments in seccomp.
        * **`op`** *(string, REQUIRED)* - the operator for syscall arguments in seccomp.
            A valid list of constants as of libseccomp v2.3.2 is shown below.

            * `SCMP_CMP_NE`
            * `SCMP_CMP_LT`
            * `SCMP_CMP_LE`
            * `SCMP_CMP_EQ`
            * `SCMP_CMP_GE`
            * `SCMP_CMP_GT`
            * `SCMP_CMP_MASKED_EQ`

### Example

```json
    "seccomp": {
        "defaultAction": "SCMP_ACT_ALLOW",
        "architectures": [
            "SCMP_ARCH_X86",
            "SCMP_ARCH_X32"
        ],
        "syscalls": [
            {
                "names": [
                    "getcwd",
                    "chmod"
                ],
                "action": "SCMP_ACT_ERRNO"
            }
        ]
    }
```
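
To illustrate how `defaultAction` and `syscalls` interact, the sketch below looks up the action a profile assigns to a syscall name. It is a deliberate simplification and not how libseccomp evaluates filters: rules that carry `args` are skipped, the first matching rule wins, and the helper name `action_for` is hypothetical.

```python
def action_for(seccomp, syscall_name):
    """Return the action a profile assigns to a syscall, ignoring `args` rules."""
    for rule in seccomp.get("syscalls", []):
        if rule.get("args"):
            continue  # simplification: argument-conditional rules are not evaluated
        if syscall_name in rule["names"]:
            return rule["action"]
    # No rule matched, so the default action applies.
    return seccomp["defaultAction"]
```

With the example profile, `chmod` and `getcwd` resolve to `SCMP_ACT_ERRNO` and every other syscall to `SCMP_ACT_ALLOW`. A profile with only `"defaultAction": "SCMP_ACT_KILL"` resolves every syscall to `SCMP_ACT_KILL`, which is the situation the text above warns about.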

## <a name="configLinuxRootfsMountPropagation" />Rootfs Mount Propagation

**`rootfsPropagation`** (string, OPTIONAL) sets the rootfs's mount propagation.
Its value is either `slave`, `private`, `shared` or `unbindable`.
The [Shared Subtrees][sharedsubtree] article in the kernel documentation has more information about mount propagation.

### Example

```json
    "rootfsPropagation": "slave",
```

## <a name="configLinuxMaskedPaths" />Masked Paths

**`maskedPaths`** (array of strings, OPTIONAL) will mask over the provided paths inside the container so that they cannot be read.
The values MUST be absolute paths in the [container namespace](glossary.md#container-namespace).

### Example

```json
    "maskedPaths": [
        "/proc/kcore"
    ]
```

## <a name="configLinuxReadonlyPaths" />Readonly Paths

**`readonlyPaths`** (array of strings, OPTIONAL) will set the provided paths as readonly inside the container.
The values MUST be absolute paths in the [container namespace](glossary.md#container-namespace).

### Example

```json
    "readonlyPaths": [
        "/proc/sys"
    ]
```

## <a name="configLinuxMountLabel" />Mount Label

**`mountLabel`** (string, OPTIONAL) will set the SELinux context for the mounts in the container.

### Example

```json
    "mountLabel": "system_u:object_r:svirt_sandbox_file_t:s0:c715,c811"
```


[cgroup-v1]: https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
[cgroup-v1-blkio]: https://www.kernel.org/doc/Documentation/cgroup-v1/blkio-controller.txt
[cgroup-v1-cpusets]: https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt
[cgroup-v1-devices]: https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt
[cgroup-v1-hugetlb]: https://www.kernel.org/doc/Documentation/cgroup-v1/hugetlb.txt
[cgroup-v1-memory]: https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
[cgroup-v1-net-cls]: https://www.kernel.org/doc/Documentation/cgroup-v1/net_cls.txt
[cgroup-v1-net-prio]: https://www.kernel.org/doc/Documentation/cgroup-v1/net_prio.txt
[cgroup-v1-pids]: https://www.kernel.org/doc/Documentation/cgroup-v1/pids.txt
[cgroup-v2]: https://www.kernel.org/doc/Documentation/cgroup-v2.txt
[devices]: https://www.kernel.org/doc/Documentation/admin-guide/devices.txt
[devpts]: https://www.kernel.org/doc/Documentation/filesystems/devpts.txt
[file]: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap03.html#tag_03_164
[libseccomp]: https://github.com/seccomp/libseccomp
[procfs]: https://www.kernel.org/doc/Documentation/filesystems/proc.txt
[seccomp]: https://www.kernel.org/doc/Documentation/prctl/seccomp_filter.txt
[sharedsubtree]: https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt
[sysfs]: https://www.kernel.org/doc/Documentation/filesystems/sysfs.txt
[tmpfs]: https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt

[console.4]: http://man7.org/linux/man-pages/man4/console.4.html
[full.4]: http://man7.org/linux/man-pages/man4/full.4.html
[mknod.1]: http://man7.org/linux/man-pages/man1/mknod.1.html
[mknod.2]: http://man7.org/linux/man-pages/man2/mknod.2.html
[namespaces.7_2]: http://man7.org/linux/man-pages/man7/namespaces.7.html
[null.4]: http://man7.org/linux/man-pages/man4/null.4.html
[pts.4]: http://man7.org/linux/man-pages/man4/pts.4.html
[random.4]: http://man7.org/linux/man-pages/man4/random.4.html
[sysctl.8]: http://man7.org/linux/man-pages/man8/sysctl.8.html
[tty.4]: http://man7.org/linux/man-pages/man4/tty.4.html
[zero.4]: http://man7.org/linux/man-pages/man4/zero.4.html
[user-namespaces]: http://man7.org/linux/man-pages/man7/user_namespaces.7.html
[intel-rdt-cat-kernel-interface]: https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt