virtcontainers: hotplug memory with kata-runtime update command #624
caoruidong merged 3 commits into kata-containers:master from cedriccchen:update_memory
Conversation
Build failed (third-party-check pipeline) integration testing with
virtcontainers/container.go
Outdated
oldMem := oldResources.Mem
newMem := newResources.Mem

if newMem == 0 || oldMem == newMem {
It wouldn't be an error. The newMem value comes from resources.Memory.Limit. When we use the update command to update vCPUs without updating memory, for example "docker update --cpus 5 [container-id]", newMem will be zero, which just means the user doesn't want to update memory.
Ah - right. I find combining the handling of two different resources rather confusing tbh ;)
virtcontainers/container.go
Outdated
if oldMem < newMem {
	// hot add memory
	addMemMB := newMem - oldMem
	// hot add memory must be aligned to 128MB
Can you point to a reference on why this is? Also, there are three occurrences of 128 here - could you define a variable and use that to create the error in fmt.Errorf() below?
Also, it might be a good idea to create a function like the following to encapsulate the memory checking. That would mean you could also create a unit test for it too:
func memChangeValid(oldMem, newMem uint32) bool {
// Handle the 128MB checks, etc.
}
This is a limitation in the kernel: https://elixir.bootlin.com/linux/v4.18.4/source/include/linux/mmzone.h#L1082 (PAGE_SECTION_MASK). We can deal with this limitation, but we need support for memory alignment in QEMU; I think those patches haven't been merged yet (not sure).
Yes, I didn't find any memory-section alignment limitation in QEMU, so I do the check here. I will define a variable for 128 and create a function to encapsulate the memory checking, thanks.
@devimc @jodh-intel The reason hot-added memory must be aligned to 128MB is that the guest OS checks whether the hotplugged memory range is aligned to a memory section when we hot add memory to the guest; see https://elixir.bootlin.com/linux/v4.18.4/source/mm/memory_hotplug.c#L1075.
On x86_64 the section size is 128MB, but on other platforms the section size may differ. If we want to stay compatible with other platforms, we need to read the guest's section size rather than hardcoding 128MB.
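The arch-dependent section-size check discussed above could be encapsulated in a small helper along these lines (a hypothetical sketch, not the PR's code; the memSectionSizeMB map and memHotAddValid name are illustrative, with the 128MB/256MB values taken from this discussion):

```go
package main

import "fmt"

// Illustrative per-architecture memory section sizes, in MB.
// x86_64 uses 128MB sections; ppc64le uses 256MB (per the discussion above).
var memSectionSizeMB = map[string]uint32{
	"amd64":   128,
	"ppc64le": 256,
}

// memHotAddValid reports whether hot adding addMemMB is aligned to the
// guest's memory section size for the given architecture.
func memHotAddValid(arch string, addMemMB uint32) error {
	sectionMB, ok := memSectionSizeMB[arch]
	if !ok {
		return fmt.Errorf("unknown architecture %q", arch)
	}
	if addMemMB%sectionMB != 0 {
		return fmt.Errorf("memory %dMB must be aligned to %dMB for hot add", addMemMB, sectionMB)
	}
	return nil
}

func main() {
	fmt.Println(memHotAddValid("amd64", 256))   // aligned: no error
	fmt.Println(memHotAddValid("ppc64le", 384)) // 384 % 256 != 0: error
}
```

Reading the real section size from the guest (as proposed later in this thread via the agent) would replace the hardcoded map.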
@clarecch @jodh-intel : You are right. For ppc64le, the hot plugged memory must be aligned to 256MB.
@nitkon @devimc @jodh-intel There is some discussion about hot-add memory alignment in #580
@clarecch - rather than kata-containers/agent#354, is there a reason that we can't just update the agent to align to the correct boundary when it's given a memory adjustment by the runtime? That way we don't need a new gRPC call and all the logic is encapsulated in the agent.
@jodh-intel The QMP command that hotplugs memory needs a memory size aligned to the correct boundary, so the runtime has to know the memory block size, which it gets from the agent.
Thanks @clarecch - that was the missing piece of info I needed :)
virtcontainers/container.go
Outdated
	}
} else {
	// hot remove memory
	return fmt.Errorf("can't hot remove memory, we don't support this feature yet")
I'd simplify the message to something like:
return errors.New("memory hot remove not supported")
it's really hard (almost impossible) to hot remove memory 😄
Indeed - and if possible, probably rather expensive (migrating off anything live to another dimm etc...)
We do have memory hotplug on Z 😄 How about this for the error message: "Memory hotplug unsupported"? It feels much more concise.
Do you mean "Memory hot remove unsupported"? I think that would be a good choice.
No, I think hotplug would be the better choice here. I say this because almost all of the IBM Z docs use plug/unplug, and even oVirt calls it hotplug. VMware looks like the only one calling it hot add. But I am okay with hot add if the community agrees.
@ydjainopensource - I think @clarecch was referring more to the fact that the error should mention "remove" rather than "plug". If it has to include plug, it could be worded like:
Removal of hotplug memory not supported
virtcontainers/qemu.go
Outdated
	memDev.sizeMB, currentMemory, q.config.DefaultMemSz)
}

// memory slot 0 is occupied
I don't understand this comment - can you provide more detail?
When we use a QMP command to update memory, such as "object_add ... id=mem0 ...", there is a QMP error with the description "attempt to add duplicate property 'mem0' to object (type 'container')". I think "mem0" has already been used, so when we hotplug memory we should start with mem1.
Is mem0 maybe either the nvdimm or an initial fixed block that can never be plugged/unplugged? @devimc, I suspect you may remember?
Yep, mem0 is occupied by other components like the nvdimm or the VM factory shared memory.
Great - @clarecch , can you add that to the comment please then :-) thx!
@bergwolf I am using QemuState.Slots, which is saved in hypervisor.json, to store the slot id when it changes. I think if we use any memory slot we should update QemuState.Slots and then store it to hypervisor.json. I will take a look at the components that occupy the mem0 slot.
@devimc Yes, I also think we need an option in the configuration file to make the number of slots configurable. But maybe we should create a new issue for that, as well as check whether the maximum number of memory slots impacts the container memory footprint?
@devimc @bergwolf How about using a QMP command to track the slot number for now, and then initializing the QemuState.Slots field before hotplugging memory?
Just like this:
(QEMU) query-memory-devices
{
"return": [
{
"data": {
"slot": 0,
"node": 0,
"addr": 4294967296,
"memdev": "/objects/mem0",
"id": "nv0",
"hotpluggable": true,
"hotplugged": false,
"size": 536870912
},
"type": "dimm"
}
]
}
@clarecch sounds good; that way we can see which slots are available.
@clarecch Yes, indeed. We can even remove the slot assignment from Kata. We only need to make sure we use a different memory device id each time we hotplug. QEMU always plugs a new dimm into the first available slot. We can either query QEMU for available slots, or rely on QMP hotplug failure to detect unavailable dimm slots.
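Parsing the query-memory-devices response shown above to find the occupied dimm slots could look like this hypothetical sketch (the response struct mirrors the JSON in this thread, and usedSlots is an illustrative helper, not the PR's code; sending the command over the QMP socket is omitted):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// memoryDevicesResponse is a minimal mirror of the "query-memory-devices"
// QMP response shown earlier in this conversation.
type memoryDevicesResponse struct {
	Return []struct {
		Type string `json:"type"`
		Data struct {
			Slot   int    `json:"slot"`
			Memdev string `json:"memdev"`
			ID     string `json:"id"`
		} `json:"data"`
	} `json:"return"`
}

// usedSlots extracts the occupied dimm slot numbers from a raw
// query-memory-devices response, so the runtime can pick a free slot
// and a fresh memory device id for the next hotplug.
func usedSlots(raw []byte) ([]int, error) {
	var resp memoryDevicesResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	var slots []int
	for _, dev := range resp.Return {
		slots = append(slots, dev.Data.Slot)
	}
	return slots, nil
}

func main() {
	// Abbreviated version of the response quoted above: slot 0 holds "mem0".
	raw := []byte(`{"return":[{"data":{"slot":0,"memdev":"/objects/mem0","id":"nv0"},"type":"dimm"}]}`)
	slots, err := usedSlots(raw)
	fmt.Println(slots, err) // [0] <nil>
}
```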
PSS Measurement: Memory inside container:
virtcontainers/container.go
Outdated
newResources := ContainerResources{
	VCPUs: uint32(utils.ConstraintsToVCPUs(*resources.CPU.Quota, *resources.CPU.Period)),
	Mem:   uint32(*resources.Memory.Limit >> 20),
Please tell me what I need to explain. Why we need the Mem field, or why we have the 20-bit right shift? ^^
It is a little unclear as resources.Memory.Limit is an int64 byte value. It would help if ContainerResources.Mem was called ContainerResources.MemMb imho.
Yes, you are right. I will add the MB suffix to ContainerResources.Mem, as well as to oldMem and newMem, whose names are not graceful.
@devimc I was a little confused to find that ContainerResources.Mem is a uint32, so I use it in units of MB; that is the reason for the 20-bit right shift. I will add the MB suffix to ContainerResources.Mem as well as to oldMem and newMem, which will make it easier to understand.
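The 20-bit right shift discussed here is simply a bytes-to-mebibytes conversion (2^20 = 1048576 bytes per MB). A minimal illustration (memLimitToMB is a hypothetical name, not the PR's code):

```go
package main

import "fmt"

// memLimitToMB converts a cgroup memory limit in bytes (an int64, as in
// resources.Memory.Limit) to whole mebibytes, matching the >> 20 above.
// Any remainder below 1MB is truncated.
func memLimitToMB(limitBytes int64) uint32 {
	return uint32(limitBytes >> 20)
}

func main() {
	fmt.Println(memLimitToMB(512 * 1024 * 1024)) // 512
	fmt.Println(memLimitToMB(536870912))         // 512 (same value in bytes)
}
```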
/cc @miaoyq regarding the int64 / int32 Mem size issue.
virtcontainers/container.go
Outdated
addMemMB := newMem - oldMem
// hot add memory must be aligned to 128MB
if addMemMB%128 != 0 {
	return fmt.Errorf("can not hot add memory which isn't aligned to 128MB")
Can you change the error message to something like "Memory must be aligned to 128 MB for hotplug"? Or "hot add"?
Annoyingly, the failed CI runs on this PR have somehow "expired" / been deleted, but of course we'll get new results on a re-push once comments have been addressed ;)
@clarecch Could you proceed with this without the alignment query from the guest? It can be fixed in a follow-up PR that would depend on querying memory alignment from the guest. That will speed up the development of this feature.
Do you mean the memory section alignment? I have opened an issue for that, which is about implementing querying the memory block size from the agent: kata-containers/agent#354. Is that exactly what you mean?
@clarecch Yes, that's the issue I wanted to mention. And I was asking not to make this PR depend on that one. So you can add most of the memory online functionality and fix up the alignment requirement in a follow-up PR.
The jenkins-ci-centos-7-4 job failed. I know the memory constraints test failed because of memory hot remove, but I can't figure out why the first and the third test cases failed. Please tell me!
@clarecch
This failed because now we have udev and
Update: I have a confusion too. Why do we need the code below in the agent's
And this updateContainerCpuset fails in some cases because it tries to update
The CI is green and the conflict is resolved. PTAL
bergwolf
left a comment
Generally looks good. Just one comment inline.
virtcontainers/container.go
Outdated
}

func (c *Container) memHotplugValid(mem uint32) (uint32, error) {
	state, err := c.sandbox.storage.fetchSandboxState(c.sandbox.id)
sandbox state is already saved in the sandbox structure. Please use that one if available so we can possibly avoid disk IO here.
Oh yes, you are right. I will fix it.
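The reviewer's point, preferring the in-memory sandbox state over re-reading it from disk, can be sketched like this (a hypothetical illustration; the types, field names, and the 128MB block size are stand-ins, not the actual virtcontainers code):

```go
package main

import "fmt"

// SandboxState is an illustrative stand-in for the persisted sandbox state.
type SandboxState struct {
	GuestMemoryBlockSizeMB uint32
}

// Sandbox caches its state in memory after the first load.
type Sandbox struct {
	state *SandboxState
	loads int // counts simulated disk reads, to show the caching at work
}

// fetchStateFromDisk simulates storage.fetchSandboxState() disk IO.
func (s *Sandbox) fetchStateFromDisk() *SandboxState {
	s.loads++
	return &SandboxState{GuestMemoryBlockSizeMB: 128}
}

// State returns the cached state when available, avoiding disk IO on
// every memHotplugValid-style check.
func (s *Sandbox) State() *SandboxState {
	if s.state == nil {
		s.state = s.fetchStateFromDisk()
	}
	return s.state
}

func main() {
	s := &Sandbox{}
	s.State()
	s.State()
	fmt.Println(s.loads) // 1: the second call hits the in-memory cache
}
```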
Fixes #671

agent shortlog:
7e8e20b agent: add GetGuestDetails gRPC function
5936600 grpc: grpc.Code is deprecated
2d3b9ac release: Kata Containers 1.3.0-rc0
a6e27d6 client: fix dialer after vendor update
cd03e0c vendor: update grpc-go dependency
1d559a7 channel: add serial yamux channel close timeout
fcf6fa7 agent: update resources list with the right device major-minor number

Signed-off-by: Zichang Lin <linzichang@huawei.com>
Add support for using the update command to hotplug memory to the VM. Connect the kata-runtime update interface with the hypervisor memory hotplug feature.

Fixes #625

Signed-off-by: Clare Chen <clare.chenhui@huawei.com>
Get and store guest details after the sandbox is completely created. Get the memory block size from the sandbox state file when checking whether hotplug memory is valid.

Signed-off-by: Clare Chen <clare.chenhui@huawei.com>
Signed-off-by: Zichang Lin <linzichang@huawei.com>
The CI is green and the conflict is resolved. PTAL @jodh-intel @devimc @bergwolf
bergwolf
left a comment
LGTM, thanks @clarecch! Once this is merged, could you please add a test in CI to make sure we validate it and avoid breaking the functionality in future?
@bergwolf of course
Codecov is stuck. Merging this PR because it updates some tests. If coverage drops, @clarecch please add more tests, thank you!
As discussed before in kata-containers/runtime#624 (comment), the size of a memory section is arch-dependent. For arm64, it should be 1GB, not 128MB.

Fixes: kata-containers#224

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
…rnel-proto-routes network: While updating routes, do not delete routes with proto "kernel"
Add support for using the update command to hotplug memory to the VM.
Connect the kata-runtime update interface with the hypervisor memory
hotplug feature.

Fixes #625

Signed-off-by: Clare Chen <clare.chenhui@huawei.com>