Time synchronization inside LCOW UVM#1119
Conversation
|
Do not merge this before we make the required changes to use the updated LCOW drop. |
| ptpDevPath := filepath.Join("/dev", filepath.Base(ptpDirPath)) | ||
| chronydConfigString := fmt.Sprintf("refclock PHC %s poll 3 dpoll -2 offset 0\n", ptpDevPath) | ||
|
|
||
| chronydConfPath := "/tmp/chronyd.conf" |
There was a problem hiding this comment.
Might as well have this as a global const
There was a problem hiding this comment.
That can hurt readability. If you're reading through code and see a global, you suddenly have to do extra work to figure out where it's being used. If it's only needed in this function then I would just keep it scoped to this function, though it could be moved to a const block in this function.
There was a problem hiding this comment.
const in this function is fine with me. Global const in the communutils package might be a bit odd for this anyways, but truthfully maybe this entire function doesn't belong in communtils looking again
There was a problem hiding this comment.
Oh you had the same feedback here haha #1119 (comment)
| return errors.Wrap(err, "start chronyd command failed") | ||
| } | ||
| go func() { | ||
| waitErr := chronydCmd.Wait() |
There was a problem hiding this comment.
Do we need to handle cases like restarting this if it crashes or is OOM killed?
In the future I think we can use a draft PR for cases like this. |
|
I wonder if there's any way we could run chronyd as a privileged infra container, and avoid needing to do this extra work in the gcs? |
c69305b to
78f1615
Compare
|
@kevpar can you PTAL? (LCOW drop isn't ready yet but it would be good to get this ready to merge by the time we have it) |
Changes to the opengcs to start the chronyd service after UVM boots. Signed-off-by: Amit Barve <ambarve@microsoft.com>
Signed-off-by: Amit Barve <ambarve@microsoft.com>
Add test to verify both chronyd running & disabled cases. Minor fixes in chronyd startup code. Signed-off-by: Amit Barve <ambarve@microsoft.com>
Signed-off-by: Amit Barve <ambarve@microsoft.com>
Signed-off-by: Amit Barve <ambarve@microsoft.com>
Signed-off-by: Amit Barve <ambarve@microsoft.com>
78f1615 to
08b0447
Compare
Signed-off-by: Amit Barve <ambarve@microsoft.com>
Signed-off-by: Amit Barve <ambarve@microsoft.com>
cmd/gcs/main.go
Outdated
| } | ||
| // Expected clock name is `hyperv` so read first 6 chars and verify the | ||
| // name | ||
| buf := make([]byte, len(expectedClockName)) |
There was a problem hiding this comment.
Is hyperv the entire contents of the clock_name file? If so may be easier to just use ioutil.ReadFile here.
There was a problem hiding this comment.
Yes. The file we are looking for has only "hyperv" in it. But we loop over a list of directories and check the contents of a file named clock_name in each of those. We don't know what are the contents of other files. Doing ioutil.ReadFile on such files could take too much time and/or memory for no reason. That's why I followed this approach.
There was a problem hiding this comment.
This seems okay, though if there is ever a new clock device added with clock_name of hyperv_newdevicedontuse then we could break. Maybe we should check the file size is what we expect too?
My suggestion to use ioutil.ReadFile was based on an assumption that clock names are probably going to be short. If we think that's safe to rely on then we could still take this approach.
There was a problem hiding this comment.
I have changed this to use ioutil.ReadFile now. There is a small catch though, we now must also look for a new line at the end of the clock name. I have updated the expectedClockName accordingly.
Minor other fixes. Signed-off-by: Amit Barve <ambarve@microsoft.com>
Signed-off-by: Amit Barve <ambarve@microsoft.com>
Signed-off-by: Amit Barve <ambarve@microsoft.com>
Signed-off-by: Amit Barve <ambarve@microsoft.com>
Signed-off-by: Amit Barve <ambarve@microsoft.com>
241d4b7 to
2380361
Compare
Related work items: microsoft#1067, microsoft#1097, microsoft#1119, microsoft#1170, microsoft#1176, microsoft#1180, microsoft#1181, microsoft#1182, microsoft#1183, microsoft#1184, microsoft#1185, microsoft#1186, microsoft#1187, microsoft#1188, microsoft#1189, microsoft#1191, microsoft#1193, microsoft#1194, microsoft#1195, microsoft#1196, microsoft#1197, microsoft#1200, microsoft#1201, microsoft#1202, microsoft#1203, microsoft#1204, microsoft#1205, microsoft#1206, microsoft#1207, microsoft#1209, microsoft#1210, microsoft#1211, microsoft#1218, microsoft#1219, microsoft#1220, microsoft#1223
Start time synchronization service in opengcs Changes to the opengcs to start the chronyd service after UVM boots. Signed-off-by: Amit Barve <ambarve@microsoft.com> * Signed-off-by: Amit Barve <ambarve@microsoft.com> * TimeSync service inside LCOW UVM. Add test to verify both chronyd running & disabled cases. Minor fixes in chronyd startup code. Signed-off-by: Amit Barve <ambarve@microsoft.com> * Run Chronyd with restart monitor Signed-off-by: Amit Barve <ambarve@microsoft.com> * Force chronyd to step update time if difference is big Signed-off-by: Amit Barve <ambarve@microsoft.com> * Fixes after rebase Signed-off-by: Amit Barve <ambarve@microsoft.com> * go mod vendor & tidy Signed-off-by: Amit Barve <ambarve@microsoft.com> * Use backoff package instead of manually calculating backoffs Signed-off-by: Amit Barve <ambarve@microsoft.com> * Rename gcs cmdline params, use io.ReadFull instead of io.Read Minor other fixes. Signed-off-by: Amit Barve <ambarve@microsoft.com> * go mod vendor Signed-off-by: Amit Barve <ambarve@microsoft.com> * Ignore err if file doesn't exist Signed-off-by: Amit Barve <ambarve@microsoft.com> * Use ioutil.ReadFile to read clock_name file Signed-off-by: Amit Barve <ambarve@microsoft.com> * minor fix Signed-off-by: Amit Barve <ambarve@microsoft.com> * Remove incorrect usage of backoff.MaxElapsedTime Signed-off-by: Amit Barve <ambarve@microsoft.com>
Currently LCOW UVM doesn't run any time synchronization service which causes a small amount of time lag after running the UVM for a long time (~10 days).
This change updates OpenGCS to start
chronyddaemon as soon as gcs is started. It also adds a new annotation which can be used to disable this service inside the UVM.