vc: Use BlockIndexMap instead of BlockIndex #2008
devimc merged 1 commit into kata-containers:master from
Conversation
Force-pushed from c3a400e to f1c9f09
Hi @WeiZhang555 , should I do something for persistence as I've changed the
@darfux You already did this at https://github.com/kata-containers/runtime/pull/2008/files#diff-f433efe7313cabd358c21fa927ca0c2cR33 Theoretically you can't modify/delete an existing field; you can only deprecate the old field and add a new one, and you have to take good care of compatibility. But you're lucky that the persistent API hasn't been settled down before kata's future 2.0 version, so the change is acceptable so far. And I'm +1 on your change from slices to map 😄
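As a side note on the compatibility rule described above, a minimal sketch of what "deprecate the old field, add a new one" could look like. The struct name, JSON tags, and the map value type here are illustrative assumptions, not the actual virtcontainers persist definitions:

```go
package main

import "fmt"

// SandboxState is a hypothetical sketch of the compatibility rule:
// the old field is kept (marked deprecated) so previously persisted
// state still unmarshals, and the new field is added alongside it.
type SandboxState struct {
	// Deprecated: kept only for backward compatibility with state
	// persisted before the map was introduced.
	BlockIndex int `json:"blockIndex,omitempty"`

	// BlockIndexMap records which block indexes are in use, so a
	// detached index can be handed out again.
	BlockIndexMap map[int]struct{} `json:"blockIndexMap,omitempty"`
}

func main() {
	s := SandboxState{BlockIndexMap: map[int]struct{}{0: {}}}
	_, used := s.BlockIndexMap[0]
	fmt.Println(used) // prints true: index 0 is marked as in use
}
```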
WeiZhang555
left a comment
One small comment. Otherwise LGTM
Force-pushed from f1c9f09 to f67765c
Force-pushed from f67765c to 78505df
I've changed the type to

/test
Codecov Report
@@           Coverage Diff            @@
##           master    #2008   +/-   ##
=========================================
  Coverage        ?   50.62%
=========================================
  Files           ?      116
  Lines           ?    16611
  Branches        ?        0
=========================================
  Hits            ?     8409
  Misses          ?     7174
  Partials        ?     1028
Thanks @amshinde . I'm not familiar with the tests repo; should I add a new test script under a directory such as https://github.com/kata-containers/tests/tree/master/integration/stability, or just add code to an existing file that is already aimed at this kind of test?
Yes, you can create a new file there. We do not have any tests for verifying container restarts, IIRC.
@amshinde when you say a restart, do you mean https://docs.docker.com/engine/reference/commandline/restart/ ? If it is docker, we already have a test.
@jcvenegas I think docker restart doesn't cover the case of restarting a single container inside a pod. The kubelet would restart an exited container according to the restart policy, and we can test something similar when integrating with containerd. I'm not sure it can be done at the standalone level.
This PR needs more 👀 on it 😄 |
@jodh-intel I'm stuck on how to test container restarting in a pod and have been looking through the tests repo off and on >_<
Force-pushed from d1da2b7 to c9d909d
/test
Thanks @WeiZhang555 . This PR is waiting for kata-containers/tests#2140 to be merged first. And could you please restart the failed CI cases? I'm not sure whether they are related to this PR. Such as
/test
Force-pushed from 77deeed to 9cfe120
@jodh-intel Thanks, updated :)
@darfux - great! Ping @kata-containers/runtime - this needs another review, folks.
/test
restarting containerd CI
same error in containerd CI, @darfux please take a look
Thanks @devimc . I can pass the same case in my environment (skipping the block-volume, k8s-scale-nginx, uts+ipc-ns, and nginx cases due to connection issues). So did I miss something when doing the test, or did the high load of the CI cluster cause the case to time out in 30s? 😂 Here is the log:
@darfux ok, let me restart it again
Still failed on sysctl...; I'll try to reproduce it first.
Hi @devimc , I have found two issues that are probably related to the case failure.
These issues cause the second container of
@darfux thanks for debugging this, let's merge kata-containers/agent#730 and try again
@darfux jfyi
I've unblocked kata-containers/agent#730 - let's keep an eye on that to check when we can progress this...
Thanks @devimc for merging the PR. I think we can retry the |
/test |
This allows reusing a detached block index and ensures that the index will not reach the device limit (such as `maxSCSIDevices`) after restarting containers many times in one pod.

Fixes: kata-containers#2007

Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>
Force-pushed from 9cfe120 to e9a4658
Persist tests failed to compile due to
/test
Thanks @devimc . The containerd CI finally passes now 🎉 And the remaining failed cases failed on curl or rootfs creation; could you please restart them again 😉
@darfux jobs restarted
Both podman and initrd cases now failed on /cc @devimc. Is it a test configuration issue?
@darfux my bad, fixing right away
restarting podman and initrd CIs |
    currentIndex := -1
    for i := 0; i < maxBlockIndex; i++ {
        if _, ok := s.state.BlockIndexMap[i]; !ok {
            currentIndex = i
so, if BlockIndexMap is empty, currentIndex will be set to 0, but in the next lines you are comparing currentIndex == -1? Is that correct?
Thanks @devimc for the question. This loop aims to find the first unused index in the index map. If BlockIndexMap is empty, then currentIndex will be 0, the first location of the bitmap with value 0. If map[0] to map[maxBlockIndex-1] are all marked as used, it means no more indexes in the bitmap can be reused; currentIndex is never assigned and stays -1, and we throw an error in that condition.
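The logic described above can be sketched as a standalone function. The names `firstFreeIndex`, the bool map, and the `maxBlockIndex` value here are illustrative assumptions, not the actual PR code:

```go
package main

import (
	"errors"
	"fmt"
)

// maxBlockIndex is an illustrative limit (e.g. maxSCSIDevices); the
// real constant lives in the virtcontainers device code.
const maxBlockIndex = 65535

// firstFreeIndex scans the index map for the first slot not marked
// as used, and fails when every slot up to maxBlockIndex is taken.
func firstFreeIndex(used map[int]bool) (int, error) {
	currentIndex := -1
	for i := 0; i < maxBlockIndex; i++ {
		if _, ok := used[i]; !ok {
			currentIndex = i
			break
		}
	}
	if currentIndex == -1 {
		return -1, errors.New("no available block index")
	}
	return currentIndex, nil
}

func main() {
	// Indexes 0 and 2 are in use; index 1 was detached earlier.
	used := map[int]bool{0: true, 2: true}
	idx, _ := firstFreeIndex(used)
	fmt.Println(idx) // prints 1: the detached index is reused
}
```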
    currentIndex := s.state.BlockIndex
    var err error
    currentIndex := -1
    for i := 0; i < maxBlockIndex; i++ {
how about changing this to:

    bmLen := len(s.state.BlockIndexMap)
    if bmLen > 0 {
        currentIndex = bmLen + 1
    }
    if currentIndex == -1 {
        return -1, errors.New("no available block index")
    }
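For comparison, here is a quick sketch of where a length-based shortcut diverges from the scan once an index has been released out of order. Both helpers are hypothetical illustrations, not code from the PR:

```go
package main

import "fmt"

// lenBased proposes the next index from the current count; this only
// works while indexes are never freed in the middle of the range.
func lenBased(used map[int]bool) int {
	return len(used)
}

// scanBased walks the map for the first free slot, so a detached
// index can be reused.
func scanBased(used map[int]bool, max int) int {
	for i := 0; i < max; i++ {
		if !used[i] {
			return i
		}
	}
	return -1
}

func main() {
	// Indexes 0 and 2 are in use; index 1 was detached (a restarted
	// container in the pod released it).
	used := map[int]bool{0: true, 2: true}
	fmt.Println(lenBased(used))     // prints 2, which is still in use: a collision
	fmt.Println(scanBased(used, 8)) // prints 1: the freed slot is reused
}
```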
This allows reusing a detached block index and ensures that the index will not reach the device limit (such as `maxSCSIDevices`) after restarting containers many times in one pod. I'm not sure whether it is ok to reuse the index for all the block drivers, and whether this change would break the persist feature.

Fixes: #2007

Signed-off-by: Li Yuxuan <liyuxuan04@baidu.com>