This repository was archived by the owner on May 12, 2021. It is now read-only.

chrony: Configure chrony to start only when /dev/ptp0 exists. #265

Merged
jcvenegas merged 1 commit into kata-containers:master from amshinde:configure-chrony-systemd
Jun 21, 2019

@amshinde
Member

The hypercall that implements virtual PTP was introduced in kernel 4.10.
Have chrony run only if the device created by kvm-ptp exists.
Add this as a ConditionPathExists in the systemd service file.

This service is named chrony.service in deb-based distributions
rather than chronyd.service, although a systemd alias exists.
However, it is not possible to come up with a generic `PATH` systemd
unit relying on the alias.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>

chrony_systemd_service="${ROOTFS_DIR}/lib/systemd/system/chrony.service"
fi

sed -i 's/^Unit/&ConditionPathExists=/dev/ptp0/' ${chrony_systemd_service}

Is this going to work as you have slashes in the replacement text? Maybe:

sed -i 's!^Unit!&ConditionPathExists=/dev/ptp0!' ${chrony_systemd_service}
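
To see why the delimiter matters, here is a minimal sketch that feeds a throwaway input line (not a real unit file) to both expressions from this thread. The variable names are made up for illustration, and the sketch only checks how sed parses each expression.

```shell
#!/bin/sh
# Sketch: compare how sed parses the two expressions from this thread.
# The input is a throwaway line, not a real unit file.
sample_line='Unit'

# '/' as the delimiter: the slashes inside /dev/ptp0 terminate the
# replacement early, so sed rejects the expression.
if printf '%s\n' "${sample_line}" |
    sed 's/^Unit/&ConditionPathExists=/dev/ptp0/' >/dev/null 2>&1; then
    slash_status="accepted"
else
    slash_status="rejected"
fi

# '!' as the delimiter: the path needs no escaping.
bang_output="$(printf '%s\n' "${sample_line}" |
    sed 's!^Unit!&ConditionPathExists=/dev/ptp0!')"

echo "slash delimiter: ${slash_status}"
echo "bang delimiter:  ${bang_output}"
```

Note that `^Unit` only matches a line that begins with the literal text `Unit`; a section header written as `[Unit]` begins with `[` and would need the bracket in the pattern.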

Member Author

@jodh-intel Looks like I missed pushing my latest changes. I have addressed this. PTAL.

@amshinde force-pushed the configure-chrony-systemd branch from 0e1b0eb to 7727f3b on March 26, 2019 17:29
@amshinde changed the title from "wip : chrony: Configure chrony to start only when /dev/ptp0 exists." to "chrony: Configure chrony to start only when /dev/ptp0 exists." on Mar 26, 2019
@amshinde
Member Author

/test

Member

@jcvenegas jcvenegas left a comment

The handling of this service is getting complex. Is it possible to create our own service file to launch chrony? Our own version of the service file could be part of the agent's set of services.

chrony_systemd_service="${ROOTFS_DIR}/lib/systemd/system/chrony.service"
fi

sed -i '/^\[Unit\]/a ConditionPathExists=\/dev\/ptp0' ${chrony_systemd_service}
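
A minimal sketch of what this append does, assuming GNU sed and a scratch file with made-up contents standing in for the real unit. The `grep` guard and the `add_condition` helper are additions for illustration, not part of this PR; the guard keeps a repeated run from adding the line twice.

```shell
#!/bin/sh
# Scratch stand-in for chrony.service; the contents are hypothetical.
unit_file="$(mktemp)"
printf '[Unit]\nDescription=example\n\n[Service]\nExecStart=/bin/true\n' > "${unit_file}"

# Hypothetical helper wrapping the sed call from the diff.
add_condition() {
    # Guard (not in the PR): skip if the condition is already present.
    if ! grep -q '^ConditionPathExists=/dev/ptp0$' "$1"; then
        # Append the condition on a new line right after the [Unit] header.
        sed -i '/^\[Unit\]/a ConditionPathExists=/dev/ptp0' "$1"
    fi
}

add_condition "${unit_file}"
add_condition "${unit_file}"   # second call is a no-op
cat "${unit_file}"
```

Inside the text of the `a` command the slashes do not need escaping, because only the address part (`/^\[Unit\]/`) uses `/` as a delimiter.
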

Responding to @jcvenegas's thought about creating our own unit, I think the way systemd expects us to handle this is by using a snippet. Something like:

$ sudo mkdir -p /etc/systemd/system/chronyd.service.d/
$ cat <<EOT | sudo tee /etc/systemd/system/chronyd.service.d/condition-path-exists.conf
[Unit]
ConditionPathExists=/dev/ptp0
EOT
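
At image-build time the same snippet could be written into the rootfs instead of the running host. The sketch below is one possible way to do that, under stated assumptions: `ROOTFS_DIR` is the variable name from the diff above but points at a scratch directory here, the helper name `install_ptp_dropin` is hypothetical, and the unit search paths are guesses about where a distro installs its units. It picks whichever of chrony.service or chronyd.service is present.

```shell
#!/bin/sh
# Sketch only: ROOTFS_DIR here is a scratch directory, and the unit
# search paths below are assumptions about where a distro installs them.
ROOTFS_DIR="$(mktemp -d)"
mkdir -p "${ROOTFS_DIR}/lib/systemd/system"
: > "${ROOTFS_DIR}/lib/systemd/system/chrony.service"   # fake deb-style unit

# Hypothetical helper: write the drop-in for whichever unit name exists.
install_ptp_dropin() {
    rootfs="$1"
    for unit in chronyd.service chrony.service; do
        for dir in usr/lib/systemd/system lib/systemd/system; do
            [ -f "${rootfs}/${dir}/${unit}" ] || continue
            dropin_dir="${rootfs}/etc/systemd/system/${unit}.d"
            mkdir -p "${dropin_dir}"
            printf '[Unit]\nConditionPathExists=/dev/ptp0\n' \
                > "${dropin_dir}/condition-path-exists.conf"
            # Report where the drop-in was written.
            echo "${dropin_dir}/condition-path-exists.conf"
            return 0
        done
    done
    echo "no chrony unit found under ${rootfs}" >&2
    return 1
}

dropin_path="$(install_ptp_dropin "${ROOTFS_DIR}")"
cat "${dropin_path}"
```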

Member

Nice alternative. I just wonder whether this could be made more distro-agnostic! :)

@Pennyzct
Contributor

Pennyzct commented Jun 11, 2019

Hi all~ @jodh-intel @amshinde
The ARM CI failed on the dmesg log test from the docker suite.

09:22:27 • Failure [6.683 seconds]
09:22:27 check dmesg logs errors
09:22:27 /home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR/go/src/github.com/kata-containers/tests/integration/docker/run_test.go:277
09:22:27   Run to check dmesg log errors
09:22:27   /home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR/go/src/github.com/kata-containers/tests/integration/docker/run_test.go:294
09:22:27     should be empty [It]
09:22:27     /home/jenkins/workspace/kata-containers-runtime-ARM-18.04-PR/go/src/github.com/kata-containers/tests/integration/docker/run_test.go:295
09:22:27 
09:22:27     Expected
09:22:27         <string>: [    0.000000] systemd[61]: chronyd.service: Failed to set up mount namespacing: No such file or directory
09:22:27         [    0.000000] systemd[61]: chronyd.service: Failed at step NAMESPACE spawning /usr/sbin/chronyd: No such file or directory
09:22:27     to be empty

Since kvm-ptp isn't supported on AArch64 for now, the chrony service cannot work properly in the guest.
But I think this PR could fix this failure, and I have already tested it on AArch64. ;) Thanks, @amshinde!
@grahamwhaley Since we already have two ARM CI nodes, could we add ARM CI to the osbuilder repo? ;)
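
The condition this PR adds boils down to a path-existence check when the service is about to start. A rough sketch of the equivalent test from a shell inside the guest, with a hypothetical helper name and the device path as an overridable argument:

```shell
#!/bin/sh
# Hypothetical helper mirroring ConditionPathExists=/dev/ptp0:
# succeeds only if the given path (default /dev/ptp0) exists.
ptp_device_present() {
    [ -e "${1:-/dev/ptp0}" ]
}

if ptp_device_present; then
    echo "/dev/ptp0 found: chrony would be started"
else
    echo "/dev/ptp0 missing: chrony would be skipped"
fi
```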

@grahamwhaley
Contributor

@Pennyzct - you mean enable the ARM CI to track the PRs on the osbuilder repo - I don't see why not - we already have x86 and Power on there.
You are OK to create that new Jenkins job yourself @Pennyzct ?
/cc @chavafg @GabyCT

@Pennyzct
Contributor

Pennyzct commented Jun 11, 2019

@grahamwhaley I'm not sure I have the authority to do this. Is there a tutorial? 😢 Sorry, I'm a real rookie in this field.

@grahamwhaley
Contributor

Hi @Pennyzct - ah, indeed, you have to be in the github kata 'jenkins-admin' team to have jenkins master superpowers.... If you would like to join the admin team, let's discuss with @chavafg
The basic workflow in Jenkins is:

  • create a new freestyle job, cloning that from the 'nearest' existing job to what you want
  • go through all the dialogs, 'advanced' dialogs, and text/script boxes of the new job config and basically do `s/old job stuff/new job stuff/`. For instance, we would probably copy the proxy ARM job, and then change all occurrences of 'proxy' in the dialogs to 'osbuilder'.

There are a few special corner cases, such as all jobs probably mention the test repo at some point in their configuration, and when configuring a job for the test repo you need to stare harder to make sure they are all correct :-) But mostly it is a copy/paste/edit of an existing job.
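
If the job configuration happens to be available as a text file rather than only through the web dialogs, the same substitution could be done with sed. This is only a sketch: the file and its contents are invented for the example (including the old job name), and it assumes the old name appears literally in the text.

```shell
#!/bin/sh
# Invented stand-in for a copied job configuration.
job_config="$(mktemp)"
cat > "${job_config}" <<'EOF'
job name: kata-containers-proxy-ARM-18.04-PR
repo: https://github.com/kata-containers/proxy
EOF

old_name="proxy"
new_name="osbuilder"

# Replace every occurrence of the old job name, writing to a new file so
# the result can be diffed and reviewed by hand before it is used.
sed "s/${old_name}/${new_name}/g" "${job_config}" > "${job_config}.new"
diff "${job_config}" "${job_config}.new" || true
```

Writing to a separate file keeps the manual review step described above: any mention of the tests repo still needs a careful look.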

Let me go make an ARM osbuilder job for you...

@grahamwhaley
Contributor

OK @Pennyzct - job created: http://jenkins.katacontainers.io/job/kata-containers-osbuilder-ARM-18.04-PR/
Looks like I can't kick off a test build, as it needs a PR number - so we can either wait for the next PR, or we could create a test PR if we really needed.

@grahamwhaley
Contributor

and then one triggered anyway - let's see how this goes @Pennyzct http://jenkins.katacontainers.io/job/kata-containers-osbuilder-ARM-18.04-PR/2/console

@chavafg
Contributor

chavafg commented Jun 11, 2019

@amshinde can you add a Fixes: #... to the commit message so that the CI can properly run?

@amshinde force-pushed the configure-chrony-systemd branch from 7727f3b to 4fce094 on June 11, 2019 17:13
@amshinde
Member Author

/test

@Pennyzct
Contributor

@grahamwhaley Thanks for the detailed instructions and for creating the ARM osbuilder job. ;) Happy to have the superpower, hehe 🤩.

@egernst
Member

egernst commented Jun 12, 2019

CI appears to hate your PR, @amshinde

@amshinde
Member Author

@egernst Yes, all of them seem to be failing for different reasons.
Retesting before I go and look at what the issue is.

@amshinde force-pushed the configure-chrony-systemd branch from 4fce094 to 9c97548 on June 13, 2019 21:53
@amshinde
Member Author

/test

@Pennyzct
Contributor

Pennyzct commented Jun 14, 2019

Hi~ @amshinde
ARM keeps failing on the same spot. ;)

ERROR: Failed to run command /usr/bin/git rev-list --no-merges --reverse origin/master..9c97548cb3379ecafa6e62d08799e984a55dc632: exit status 128 (stdout: , stderr: fatal: Invalid revision range origin/master..9c97548cb3379ecafa6e62d08799e984a55dc632
)

Does it need to be re-based? @grahamwhaley @chavafg

@grahamwhaley
Contributor

@Pennyzct hmm, not sure. Only thing I can think of maybe is if the ARM jenkins slave is not clearing out its workspace, or is using a GOPATH/git repo that is outside of the jenkins WORKSPACE path, so maybe it is trying to pull on an old tree or something??

Any chance you can try to reproduce by hand @Pennyzct ?
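
One way to start reproducing this by hand is to check whether both ends of the range resolve in the workspace clone, since `git rev-list` reports an invalid revision range when either end cannot be resolved. A small sketch with a hypothetical helper name; run it from inside the clone that the CI job uses:

```shell
#!/bin/sh
# Hypothetical helper: print each revision that does not resolve to a
# commit in the current repository.
missing_revs() {
    for rev in "$@"; do
        git rev-parse --verify --quiet "${rev}^{commit}" >/dev/null 2>&1 ||
            echo "${rev}"
    done
}

# The two ends of the range from the CI error above.
missing_revs origin/master 9c97548cb3379ecafa6e62d08799e984a55dc632
```

If `origin/master` is printed, the clone has no such remote-tracking branch, which would fit a stale or differently configured workspace; if the commit hash is printed, the PR commit was never fetched.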

@amshinde
Member Author

@jcvenegas @chavafg I need your help here. I am seeing this error in the CI:

INFO: Created summary file '/var/lib/osbuilder/osbuilder.yaml' inside rootfs
+ exit 0
touch /tmp/osbuilder-test.xVdDrfd/rootfs-osbuilder/.debian_rootfs.done
+ die 'Background build job failed'
+ msg='Background build job failed'
+ echo 'ERROR: Background build job failed'
ERROR: Background build job failed
+ exit 1
+ exit_handler
+ '[' 1 -eq 0 ']'
+ info 'ERROR: test failed'
+ s='ERROR: test failed'
+ echo -e 'INFO: ERROR: test failed\n'
INFO: ERROR: test failed

When I run the ci locally as .ci/run.sh, I am consistently seeing this behaviour:

[OK] Agent installed                                                                                                                                                                                        
INFO: Check init is installed                                                                                                                                                                               
[OK] init is installed                                                                                                                                                                                      
INFO: Creating summary file                                                                                                                                                                                 
INFO: Created summary file '/var/lib/osbuilder/osbuilder.yaml' inside rootfs                                                                                                                                
Sending build context to Docker daemon  41.47kB                                                                                                                                                             
Step 1/3 : From fedora:latest                                                                                                                                                                               
latest: Pulling from library/fedora                                                                                                                                                                         
8f6ac7ed4a91: Pull complete                                                                                                                                                                                 
Digest: sha256:9c78c69f748953ba8fdb6eb9982e1abefe281d9b931a13f251eb8aec988353de                                                                                                                             
Status: Downloaded newer image for fedora:latest                                                                                                                                                            
 ---> 289289d1a15b                                                                                                                                                                                          
Step 2/3 : RUN [ -n "$http_proxy" ] && sed -i '$ a proxy='$http_proxy /etc/dnf/dnf.conf ; true                                                                                                              
 ---> Running in db4c239044c6                                                                                                                                                                               
Removing intermediate container db4c239044c6                                                                                                                                                                
 ---> 75ba7bfe3e18                                                                                                                                                                                          
Step 3/3 : RUN dnf install -y qemu-img parted gdisk e2fsprogs gcc xfsprogs findutils                                                                                                                        
 ---> Running in ac7c695a193e                                                                                                                                                                               
Fedora Modular 30 - x86_64                      1.5 MB/s | 2.7 MB     00:01                                                                                                                                 
Fedora Modular 30 - x86_64 - Updates            2.7 MB/s | 2.0 MB     00:00                                                                                                                                 
Fedora 30 - x86_64 - Updates                    7.3 MB/s |  14 MB     00:01                                                                                                                                 
Fedora 30 - x86_64                               16 kB/s | 3.4 kB     00:00                                                                                                                                 
Failed to synchronize cache for repo 'fedora'                                                                                                                                                               
Error: Failed to synchronize cache for repo 'fedora'                                                                                                                                                        
The command '/bin/sh -c dnf install -y qemu-img parted gdisk e2fsprogs gcc xfsprogs findutils' returned a non-zero code: 1                                                                                  
Failed at 121: sudo -E AGENT_INIT="${AGENT_INIT}" USE_DOCKER=true ./image-builder/image_builder.sh "$ROOTFS_DIR"                                                                                            
Failed at 18: "${cidir}/install_kata_image.sh"                                                                                                                                                              
+ exit_handler                                                                                                                                                                                              
+ '[' 1 -eq 0 ']'                                                                                                                                                                                           
+ info 'ERROR: test failed'                                                                                                                                                                                 
+ s='ERROR: test failed'                                                                                                                                                                                    
+ echo -e 'INFO: ERROR: test failed\n'                                                                                                                                                                      
INFO: ERROR: test failed                                                                                                                                                                                    
                                                                                                                                                                                                            
+ '[' -d /tmp/osbuilder-test.6OINQxE/rootfs-osbuilder ']'                                                                                                                                                   
+ info 'no rootfs created'                                                                                                                                                                                  
+ s='no rootfs created'                                                                                                                                                                                     
+ echo -e 'INFO: no rootfs created\n'                                                                                                                                                                       
INFO: no rootfs created 

The failure being: Failed to synchronize cache for repo 'fedora'
Can you reproduce this? The CI does not give any clue of what failed.

@chavafg
Contributor

chavafg commented Jun 14, 2019

@amshinde trying to reproduce... will get back when I have something.

The CI failure means that the rootfses were not built correctly, but I cannot see exactly where they failed.

@amshinde force-pushed the configure-chrony-systemd branch from 9c97548 to a42c651 on June 14, 2019 19:13
@amshinde
Member Author

/test

@amshinde
Member Author

/test

@jodh-intel

I think the Travis job is failing here (but don't know why):

/home/travis/gopath/src/github.com/kata-containers/agent /
+ make install DESTDIR=/rootfs INIT=no SECCOMP=
rm -f kata-agent kata-agent.service
+ make INIT=no
install -D kata-agent /rootfs/usr/bin/kata-agent
install: cannot stat 'kata-agent': No such file or directory

@jodh-intel

Buried in the proxy log artifact in Jenkins, I found this:

[    1.069551] systemd-sysv-generator[69]: Ignoring S01chrony symlink in rc2.d, not generating chrony.service.
[    1.070131] systemd-sysv-generator[69]: Ignoring S01chrony symlink in rc3.d, not generating chrony.service.
[    1.070516] systemd-sysv-generator[69]: Ignoring S01chrony symlink in rc4.d, not generating chrony.service.
[    1.070889] systemd-sysv-generator[69]: Ignoring S01chrony symlink in rc5.d, not generating chrony.service.
[\x1b[0;1;31m!!!!!!\x1b[0m] Failed to isolate default target, freezing.

/cc @amshinde

@jodh-intel

@amshinde - any progress on debugging this issue?

@amshinde
Member Author

@jodh-intel Thanks for the pointer. I didn't get the chance to look into this much yesterday. I'll try to debug it today.
What's concerning is that I see a similar failure on a test PR that I raised a couple of days back:
#315

So it looks like this is failing on master as well.

@marcov
Contributor

marcov commented Jun 20, 2019

> Buried in the proxy log artifact in Jenkins, I found this:
>
> [    1.069551] systemd-sysv-generator[69]: Ignoring S01chrony symlink in rc2.d, not generating chrony.service.
> [    1.070131] systemd-sysv-generator[69]: Ignoring S01chrony symlink in rc3.d, not generating chrony.service.
> [    1.070516] systemd-sysv-generator[69]: Ignoring S01chrony symlink in rc4.d, not generating chrony.service.
> [    1.070889] systemd-sysv-generator[69]: Ignoring S01chrony symlink in rc5.d, not generating chrony.service.
> [\x1b[0;1;31m!!!!!!\x1b[0m] Failed to isolate default target, freezing.
>
> /cc @amshinde

@amshinde in case you missed it, #319 should be the fix for this.

@amshinde
Member Author

> @amshinde in case you missed it, #319 should be the fix for this.

Thanks @marcov! That's one thing off my to-do list today :)
Thanks for fixing it. I'll rebase once your PR is merged.

@amshinde force-pushed the configure-chrony-systemd branch from e31e4f9 to 7d615f1 on June 20, 2019 20:33
@amshinde
Member Author

/test

The hypercall that implements virtual PTP was introduced in kernel 4.10.
Have chrony run only if the device created by kvm-ptp exists.
Add this as a ConditionPathExists in the systemd service file.

This service is named chrony.service in deb-based distributions
rather than chronyd.service, although a systemd alias exists.
However, it is not possible to come up with a generic `PATH` systemd
unit relying on the alias.

Fixes kata-containers#308

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
@amshinde force-pushed the configure-chrony-systemd branch from 7d615f1 to d2e80f5 on June 20, 2019 22:30
@amshinde
Member Author

/test

@jcvenegas jcvenegas merged commit 0e0e74b into kata-containers:master Jun 21, 2019
Pennyzct added a commit to Pennyzct/osbuilder that referenced this pull request Aug 1, 2019
Commit 39370c2 (https://github.com/kata-containers/osbuilder/commit/39370c2)
accidentally deleted the content of PR#265 (kata-containers#265).
This re-applies PR#265 on top of the latest master code.

Fixes: kata-containers#338

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Penny Zheng <penny.zheng@arm.com>