Skip to content
This repository was archived by the owner on May 12, 2021. It is now read-only.

arm64: support NVDIMM#1323

Merged
jodh-intel merged 2 commits intokata-containers:masterfrom
Pennyzct:nvdimm
Mar 12, 2019
Merged

arm64: support NVDIMM#1323
jodh-intel merged 2 commits intokata-containers:masterfrom
Pennyzct:nvdimm

Conversation

@Pennyzct
Copy link
Copy Markdown
Contributor

@Pennyzct Pennyzct commented Mar 5, 2019

As Arm64 is using the block device as the rootfs in guest. If we run two or more kata-container instances
on one host, we will get following error:

root@entos-thunderx2-desktop:~# docker run -dt --runtime kata-runtime ubuntu
7baf2c0f26100b0e642ad4122ce9ea1cb556fb3ed6cfcb454cb0586cbbe6194d
root@entos-thunderx2-desktop:~# docker run -dt --runtime kata-runtime ubuntu
2065fcf560c136091b5e69b6a12c95de94308309bb7fd5309df852503752203a
docker: Error response from daemon: OCI runtime create failed: qemu-system-aarch64: -device virtio-blk,drive=image-9f100592ac95eec6,scsi=off,config-wce=off: Failed to get "write" lock
Is another process using the image?: unknown.

Thanks to the new expanded IPA_SIZE feature in kernel 4.20 and Eric Auger's patch set ARM virt: Initial RAM expansion and PCDIMM/NVDIMM support in qemu(which are still under upstream review), we could fully support nvdimm on arm64.

@Pennyzct Pennyzct changed the title arm64: support arm64: support NVDIMM Mar 5, 2019
@Pennyzct
Copy link
Copy Markdown
Contributor Author

Pennyzct commented Mar 5, 2019

I'm not triggering ARM CI here for now, since this pr needs to wait packaging/#377 landed firstly.

if err != nil {
return nil, err
}
defer func() { _ = imageFile.Close() }()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: You could make this simply:

defer imageFile.Close()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rignt, right, right. ;) update asap.

Copy link
Copy Markdown
Contributor

@grahamwhaley grahamwhaley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @Pennyzct Nice to see the test case added ;-)
lgtm

Copy link
Copy Markdown

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good!
Please address comment from @jodh-intel and I have one comment regarding the commit qemu-arm64: disable dax on arm64. You should remove it, since you could remove the dax part from the initial commit.

@Pennyzct
Copy link
Copy Markdown
Contributor Author

Pennyzct commented Mar 6, 2019

@jodh-intel @sboeuf already updated. ;) ptal

Copy link
Copy Markdown

@sboeuf sboeuf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @Pennyzct

@sboeuf
Copy link
Copy Markdown

sboeuf commented Mar 6, 2019

/test

Original guest image was reprensented as block device in qemu-aarch64,
and it will bring up write lock error when running multiple containers.
Thanks to the new expanded IPA_SIZE feature in kernel 4.20 and
Eric Auger's related patch set in qemu(which are still under upstream
review), we could fully support nvdimm on arm64.

Depends-on: github.com/kata-containers/packaging#377

Fixes: kata-containers#843

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
@Pennyzct
Copy link
Copy Markdown
Contributor Author

Pennyzct commented Mar 7, 2019

/test

Since we overrided the func appendImage for aarch64, we should also
provide related unit test.

Fixes: kata-containers#843

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
@Pennyzct
Copy link
Copy Markdown
Contributor Author

Pennyzct commented Mar 7, 2019

/test

@Pennyzct
Copy link
Copy Markdown
Contributor Author

Pennyzct commented Mar 8, 2019

Hi~ all. @grahamwhaley @jodh-intel @sboeuf a few sad news and one great news.
Great news is that ARM CI got green~~~
But I got dozens of build timeout in multiple x86_64 CIs, and it seems like a pattern to me. you may want to have a look. ;)

•Build timed out (after 5 minutes). Marking the build as aborted.
Build was aborted
Performing Post build task...
Match found for :.* : True
Logical operation result is TRUE
Running script  : #!/bin/bash

@grahamwhaley
Copy link
Copy Markdown
Contributor

Hi @Pennyzct \o/ for the ARM CI :-)
The timeouts - yes, I think we have had a bunch of those overnight - I think it may be related to an image file/clearlinux update that happened - I think @chavafg and @jcvenegas are looking into it. Thanks for reporting!

@grahamwhaley
Copy link
Copy Markdown
Contributor

/test
This looks to be very ARM specific, and the ARM CI is happy. We know the x86 ones had some wobbles overnight - I'm probably happy to merge this as is. @sboeuf wdyt?
I'll kick the CIs off again anyway, as the 'wobbles' should be fixed now - but we could force merge without waiting if we wanted...

@sboeuf
Copy link
Copy Markdown

sboeuf commented Mar 8, 2019

@grahamwhaley I agree this is ARM specific, but the code being modified is in virtcontainers. There could be some unexpected relation with the common code that could cause some failures. I would feel more confident to merge this if we had at least our main jenkins-ci-ubuntu-18-04 job passing.

@jodh-intel
Copy link
Copy Markdown

Fedora CI failed with:

not ok 32 ctr with non-root user has no effective capabilities

Fedora vsocks CI failed with:

docker: Error response from daemon: OCI runtime create failed: Failed to check if grpc server is working: context deadline exceeded: unknown.
And we have hit the max 20 containers
Checking 20 containers have all relevant components
Wrong number of shims running (20 != 19) - stopping
Wrong number of netmons running (20 != 19) - stopping
Wrong number of 'runtime list' containers running (20 != 19) - stopping
Wrong number of pods in /var/lib/vc/sbs (20 != 19) - stopping)
Showing system state:

Let's re-spin the wheels and see if the errors are consistent...

@amshinde
Copy link
Copy Markdown
Member

/test

Copy link
Copy Markdown
Member

@amshinde amshinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@Pennyzct
Copy link
Copy Markdown
Contributor Author

/test

@jodh-intel
Copy link
Copy Markdown

2 failing CI's are known issues, so merging...

@jodh-intel jodh-intel merged commit ffbae64 into kata-containers:master Mar 12, 2019
@Pennyzct
Copy link
Copy Markdown
Contributor Author

Hi~@jodh-intel this pr is tested on ARM CI with the dependency of packaging#377, for now, packaging#377 hasn't been merged yet, I'm afraid ARM ci could break down.
But I see that packaging#377 has got all green, maybe could we merge it?? ;)

@jodh-intel
Copy link
Copy Markdown

@Pennyzct - done ;)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants