virtcontainers: hotplug memory with kata-runtime update command #624
caoruidong merged 3 commits into kata-containers:master from cedriccchen:update_memory
Conversation
Build failed (third-party-check pipeline) integration testing with
virtcontainers/container.go
Outdated
oldMem := oldResources.Mem
newMem := newResources.Mem

if newMem == 0 || oldMem == newMem {
It wouldn't be an error. The newMem value comes from resources.Memory.Limit. When we use the update command to update vCPUs without updating memory, for example "docker update --cpus 5 [container-id]", newMem will be zero, which just means the user doesn't want to update memory.
Ah - right. I find combining the handling of two different resources rather confusing tbh ;)
virtcontainers/container.go
Outdated
if oldMem < newMem {
	// hot add memory
	addMemMB := newMem - oldMem
	// hot add memory must be aligned to 128MB
Can you point to a reference on why this is? Also, there are three occurrences of 128 here - could you define a variable and use that to create the error in fmt.Errorf() below?
Also, it might be a good idea to create a function like the following to encapsulate the memory checking. That would mean you could also create a unit test for it too:
func memChangeValid(oldMem, newMem uint32) bool {
// Handle the 128MB checks, etc.
}
This is a limitation in the kernel: https://elixir.bootlin.com/linux/v4.18.4/source/include/linux/mmzone.h#L1082 (PAGE_SECTION_MASK). We can deal with this limitation, but we need support for memory alignment in QEMU; I think those patches haven't been merged yet (not sure).
Yes, I didn't find any memory-section alignment limitation in QEMU, so I do the check here. I will define a variable for 128 and create a function to encapsulate the memory checking, thanks.
@devimc @jodh-intel The reason hot-added memory must be aligned to 128MB is that the guest OS checks whether the hotplugged memory range is aligned to a memory section when we hot add memory to the guest; see https://elixir.bootlin.com/linux/v4.18.4/source/mm/memory_hotplug.c#L1075.
On x86_64 the section size is 128MB, but on other platforms the section size may differ. If we want to stay compatible with other platforms, we need to read the guest's section size rather than hardcoding 128MB.
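The arch-dependent section-size check discussed above could be encapsulated in a small helper along these lines (a hypothetical sketch, not the PR's code; the memSectionSizeMB map and memHotAddValid name are illustrative, with the 128MB/256MB values taken from this discussion):

```go
package main

import "fmt"

// Illustrative per-architecture memory section sizes, in MB.
// x86_64 uses 128MB sections; ppc64le uses 256MB (per the discussion above).
var memSectionSizeMB = map[string]uint32{
	"amd64":   128,
	"ppc64le": 256,
}

// memHotAddValid reports whether hot adding addMemMB is aligned to the
// guest's memory section size for the given architecture.
func memHotAddValid(arch string, addMemMB uint32) error {
	sectionMB, ok := memSectionSizeMB[arch]
	if !ok {
		return fmt.Errorf("unknown architecture %q", arch)
	}
	if addMemMB%sectionMB != 0 {
		return fmt.Errorf("memory %dMB must be aligned to %dMB for hot add", addMemMB, sectionMB)
	}
	return nil
}

func main() {
	fmt.Println(memHotAddValid("amd64", 256))   // aligned: no error
	fmt.Println(memHotAddValid("ppc64le", 384)) // 384 % 256 != 0: error
}
```

Reading the real section size from the guest (as proposed later in this thread via the agent) would replace the hardcoded map.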
@clarecch @jodh-intel : You are right. For ppc64le, the hot plugged memory must be aligned to 256MB.
@nitkon @devimc @jodh-intel There is some discussion about hot-add memory alignment in #580
@clarecch - rather than kata-containers/agent#354, is there a reason that we can't just update the agent to align to the correct boundary when it's given a memory adjustment by the runtime? That way we don't need a new gRPC call and all the logic is encapsulated in the agent.
@jodh-intel The QMP command that hotplugs memory needs a memory size aligned to the correct boundary, so the runtime has to know the memory block size, which it gets from the agent.
Thanks @clarecch - that was the missing piece of info I needed :)
virtcontainers/container.go
Outdated
	}
} else {
	// hot remove memory
	return fmt.Errorf("can't hot remove memory, we don't support this feature yet")
I'd simplify the message to something like:
return errors.New("memory hot remove not supported")
it's really hard (almost impossible) to hot remove memory 😄
Indeed - and if possible, probably rather expensive (migrating off anything live to another dimm etc...)
We do have memory hotplug on Z 😄 How about this for the error message: "Memory hotplug unsupported"? It feels much more concise.
Do you mean "Memory hot remove unsupported"? I think that would be a good choice.
No, I think hotplug would be the better choice here. I say this because almost all of the IBM Z docs use plug/unplug, and even oVirt calls it hotplug. VMware looks like the only one calling it hot add. But I am okay with hot add if the community agrees.
@ydjainopensource - I think @clarecch was referring more to the fact that the error should mention "remove" rather than "plug". If it has to include plug, it could be worded like:
Removal of hotplug memory not supported
virtcontainers/qemu.go
Outdated
	memDev.sizeMB, currentMemory, q.config.DefaultMemSz)
}

// memory slot 0 is occupied
I don't understand this comment - can you provide more detail?
When we use a QMP command to update memory, such as "object_add ... id=mem0 ...", there is a QMP error with the description "attempt to add duplicate property 'mem0' to object (type 'container')". I think "mem0" has already been used, so when we hotplug memory we should start with mem1.
Is mem0 maybe either the nvdimm or an initial fixed block that can never be plugged/unplugged? @devimc, I suspect you may remember?
Yep, mem0 is occupied by other components like the nvdimm or the VM factory shared memory.
Great - @clarecch , can you add that to the comment please then :-) thx!
@bergwolf I am using QemuState.Slots, which is saved in hypervisor.json, to store the slot id when it changes. I think if we use any memory slot we should update QemuState.Slots and then store it to hypervisor.json. I will take a look at the components that occupy the mem0 slot.
@devimc Yes, I also think we need an option in the configuration file to make the number of slots configurable. But maybe we should create a new issue for that, as well as check whether the maximum number of memory slots impacts the container memory footprint?
@devimc @bergwolf How about using a QMP command to track the slot number for now, and then initializing the QemuState.Slots field before hotplugging memory?
Just like this:
(QEMU) query-memory-devices
{
"return": [
{
"data": {
"slot": 0,
"node": 0,
"addr": 4294967296,
"memdev": "/objects/mem0",
"id": "nv0",
"hotpluggable": true,
"hotplugged": false,
"size": 536870912
},
"type": "dimm"
}
]
}
@clarecch sounds good; that way we can see which slots are available.
@clarecch Yes, indeed. We can even remove the slot assignment from Kata. We only need to make sure we use a different memory device id each time we hotplug. QEMU always plugs a new dimm into the first available slot. We can either query QEMU for available slots, or rely on QMP hotplug failure to detect unavailable dimm slots.
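Parsing the query-memory-devices response shown above to find the occupied dimm slots could look like this hypothetical sketch (the response struct mirrors the JSON in this thread, and usedSlots is an illustrative helper, not the PR's code; sending the command over the QMP socket is omitted):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// memoryDevicesResponse is a minimal mirror of the "query-memory-devices"
// QMP response shown earlier in this conversation.
type memoryDevicesResponse struct {
	Return []struct {
		Type string `json:"type"`
		Data struct {
			Slot   int    `json:"slot"`
			Memdev string `json:"memdev"`
			ID     string `json:"id"`
		} `json:"data"`
	} `json:"return"`
}

// usedSlots extracts the occupied dimm slot numbers from a raw
// query-memory-devices response, so the runtime can pick a free slot
// and a fresh memory device id for the next hotplug.
func usedSlots(raw []byte) ([]int, error) {
	var resp memoryDevicesResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	var slots []int
	for _, dev := range resp.Return {
		slots = append(slots, dev.Data.Slot)
	}
	return slots, nil
}

func main() {
	// Abbreviated version of the response quoted above: slot 0 holds "mem0".
	raw := []byte(`{"return":[{"data":{"slot":0,"memdev":"/objects/mem0","id":"nv0"},"type":"dimm"}]}`)
	slots, err := usedSlots(raw)
	fmt.Println(slots, err) // [0] <nil>
}
```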
PSS Measurement: Memory inside container:
virtcontainers/container.go
Outdated
newResources := ContainerResources{
	VCPUs: uint32(utils.ConstraintsToVCPUs(*resources.CPU.Quota, *resources.CPU.Period)),
	Mem:   uint32(*resources.Memory.Limit >> 20),
Please tell me what I need to explain. Why we need the Mem field, or why we have the 20-bit right shift? ^^
It is a little unclear as resources.Memory.Limit is an int64 byte value. It would help if ContainerResources.Mem was called ContainerResources.MemMb imho.
Yes, you are right. I will add the MB suffix to ContainerResources.Mem, as well as to oldMem and newMem, whose names are not graceful.
@devimc I was a little confused to find that ContainerResources.Mem is a uint32, so I use it in units of MB; that is the reason for the 20-bit right shift. I will add the MB suffix to ContainerResources.Mem as well as to oldMem and newMem, which will make it easier to understand.
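The 20-bit right shift discussed here is simply a bytes-to-mebibytes conversion (2^20 = 1048576 bytes per MB). A minimal illustration (memLimitToMB is a hypothetical name, not the PR's code):

```go
package main

import "fmt"

// memLimitToMB converts a cgroup memory limit in bytes (an int64, as in
// resources.Memory.Limit) to whole mebibytes, matching the >> 20 above.
// Any remainder below 1MB is truncated.
func memLimitToMB(limitBytes int64) uint32 {
	return uint32(limitBytes >> 20)
}

func main() {
	fmt.Println(memLimitToMB(512 * 1024 * 1024)) // 512
	fmt.Println(memLimitToMB(536870912))         // 512 (same value in bytes)
}
```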
/cc @miaoyq regarding the int64 / int32 Mem size issue.
virtcontainers/container.go
Outdated
addMemMB := newMem - oldMem
// hot add memory must be aligned to 128MB
if addMemMB%128 != 0 {
	return fmt.Errorf("can not hot add memory which isn't aligned to 128MB")
Can you change the error message to something like "Memory must be aligned to 128 MB for hotplug"? Or "hot add"?
Annoyingly, the failed CI runs on this PR have somehow "expired" / been deleted, but of course we'll get new results on a re-push once comments have been addressed ;)
@clarecch Could you proceed with this without the alignment query from the guest? It can be fixed in a follow-up PR that would depend on querying memory alignment from the guest. That will speed up the development of this feature.
Do you mean the memory section alignment? I have opened an issue for that, which is about implementing querying the memory block size from the agent: kata-containers/agent#354. Is that exactly what you mean?
@clarecch Yes, that's the issue I wanted to mention. And I was asking not to make this PR depend on that one. So you can add most of the memory online functionality and fix up the alignment requirement in a follow-up PR.
The jenkins-ci-centos-7-4 job failed. I know the memory constraints test failed because of memory hot remove, but I can't figure out why the first and the third test cases failed. Please tell me!
@clarecch
This failed because now we have udev and
Update: I have a confusion too. Why do we need the code below in the agent's
And this updateContainerCpuset fails in some cases because it tries to update
The CI is green and the conflict is resolved. PTAL
bergwolf
left a comment
Generally looks good. Just one comment inline.
virtcontainers/container.go
Outdated
}

func (c *Container) memHotplugValid(mem uint32) (uint32, error) {
	state, err := c.sandbox.storage.fetchSandboxState(c.sandbox.id)
sandbox state is already saved in the sandbox structure. Please use that one if available so we can possibly avoid disk IO here.
Oh yes, you are right. I will fix it.
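The reviewer's point, preferring the in-memory sandbox state over re-reading it from disk, can be sketched like this (a hypothetical illustration; the types, field names, and the 128MB block size are stand-ins, not the actual virtcontainers code):

```go
package main

import "fmt"

// SandboxState is an illustrative stand-in for the persisted sandbox state.
type SandboxState struct {
	GuestMemoryBlockSizeMB uint32
}

// Sandbox caches its state in memory after the first load.
type Sandbox struct {
	state *SandboxState
	loads int // counts simulated disk reads, to show the caching at work
}

// fetchStateFromDisk simulates storage.fetchSandboxState() disk IO.
func (s *Sandbox) fetchStateFromDisk() *SandboxState {
	s.loads++
	return &SandboxState{GuestMemoryBlockSizeMB: 128}
}

// State returns the cached state when available, avoiding disk IO on
// every memHotplugValid-style check.
func (s *Sandbox) State() *SandboxState {
	if s.state == nil {
		s.state = s.fetchStateFromDisk()
	}
	return s.state
}

func main() {
	s := &Sandbox{}
	s.State()
	s.State()
	fmt.Println(s.loads) // 1: the second call hits the in-memory cache
}
```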
Fixes #671

agent shortlog:
7e8e20b agent: add GetGuestDetails gRPC function
5936600 grpc: grpc.Code is deprecated
2d3b9ac release: Kata Containers 1.3.0-rc0
a6e27d6 client: fix dialer after vendor update
cd03e0c vendor: update grpc-go dependency
1d559a7 channel: add serial yamux channel close timeout
fcf6fa7 agent: update resources list with the right device major-minor number

Signed-off-by: Zichang Lin <linzichang@huawei.com>
Add support for using the update command to hotplug memory to the VM. Connect the kata-runtime update interface with the hypervisor memory hotplug feature.

Fixes #625

Signed-off-by: Clare Chen <clare.chenhui@huawei.com>
Get and store guest details after the sandbox is completely created. Get the memory block size from the sandbox state file when checking whether hotplug memory is valid.

Signed-off-by: Clare Chen <clare.chenhui@huawei.com>
Signed-off-by: Zichang Lin <linzichang@huawei.com>
The CI is green and the conflict is resolved. PTAL @jodh-intel @devimc @bergwolf
bergwolf
left a comment
LGTM, thanks @clarecch! Once this is merged, could you please add a test in CI to make sure we validate it and avoid breaking the functionality in future?
@bergwolf of course
Codecov is stuck. Merging this PR because it updates some tests. If coverage drops, @clarecch please add more tests, thank you!
As discussed before in kata-containers/runtime#624 (comment), the size of a memory section is arch-dependent. For arm64, it should be 1GB, not 128MB.

Fixes: kata-containers#224

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
…rnel-proto-routes network: While updating routes, do not delete routes with proto "kernel"
Add support for using the update command to hotplug memory to the VM.
Connect the kata-runtime update interface with the hypervisor memory
hotplug feature.

Fixes #625

Signed-off-by: Clare Chen <clare.chenhui@huawei.com>