CSI Node info registration in kubelet#67684
CSI Node info registration in kubelet#67684k8s-github-robot merged 2 commits intokubernetes:masterfrom
Conversation
|
/sig storage |
f87962d to
c71f9cb
Compare
There was a problem hiding this comment.
Will rename all the receivers to nim.
There was a problem hiding this comment.
Will check set equality instead in the new version that uses CSINodeInfo.
c71f9cb to
b2ff926
Compare
|
Feel free to hold off on reviewing this PR for now - I'm almost done with wiring up CSINodeInfo and a good amount of logic will become obsolete |
b2ff926 to
ab5b04e
Compare
|
Added integration with Missing functionality is updated in the top post, but adding those won't cause any major structural changes. The PR is ready for review now. @liggitt @saad-ali @vladimirvivien @sbezverk @msau42 @gnufied |
ab5b04e to
0128473
Compare
|
@saad-ali and I discussed offline, and we are going to leave the CSI label prefix as is in Kubernetes instead of reversing it, for the time being. |
There was a problem hiding this comment.
This comment needs to be updated. I don't see any channels being used
There was a problem hiding this comment.
If a plugin doesn't support topology will this be nil?
There was a problem hiding this comment.
Yeah, thanks for the catch!
There was a problem hiding this comment.
Can you add a comment here describing exactly all the objects/fields that are being modified/added?
There was a problem hiding this comment.
are there plans to use a shared informer? If so, we should clone the node object first before modifying it so it doesn't impact other uses that may be sharing the same pointer.
There was a problem hiding this comment.
No I wasn't planning to. Driver registration is fairly infrequent so watching the object doesn't seem worth it
There was a problem hiding this comment.
This works only if the multiple updateFuncs are not conflicting with each other.
There was a problem hiding this comment.
I'll make it more explicit in the comments for this function that because updateFuncs are called in order, each updateFunc should consider the effects of previous updateFuncs
There was a problem hiding this comment.
each updateFunc should consider the effects of previous updateFuncs
what does that mean?
There was a problem hiding this comment.
I added a comment here - does that help or should it be more clear?
https://github.com/kubernetes/kubernetes/blob/9c855fb9e62de46f511d9dc0446425c1b5a76499/pkg/volume/csi/nodeinfomanager/nodeinfomanager.go#L119
There was a problem hiding this comment.
You mean, if two update functions both update same field then last one will win, but caller of this function should try and ensure that - it does not happen in first place?
There was a problem hiding this comment.
Well said! Will update the comment
There was a problem hiding this comment.
Should we be adding all drivers in this call to just update a single driver? Do all the CSI drivers register again when we upgrade the node?
Should we be looking at the annotations at all? IIUC, this is supposed to be the replacement for the annotations
There was a problem hiding this comment.
Yeah CSI drivers should register again upon node upgrade, and this is here as a safeguard to ensure annotations and CSINodeInfo are consistent while we support both. There is always a chance that kubelet is upgraded while the previous Node object stays, and annotations and CSINodeInfo could be momentarily out of sync.
There was a problem hiding this comment.
Is it necessary to keep the annotations and CSIDriverInfo in sync on the first driver's registration, if the other driver will register again later? In an upgrade scenario, then this means we're on the first boot after upgrade and we're processing the first driver registration. We're adding an entry for all the other drivers too even though they have not gone through registration yet. Could that cause other issues?
There was a problem hiding this comment.
I think this method should get whole list of registered CSI drivers so it creates/updates complete CSINodeInfo with all drivers. The result should be the same in the end, however it feels more robust. I don't line CSINodeInfo with empty topology for some drivers just because we translated it from an annotation.
There was a problem hiding this comment.
This method gets called for a single driver registration though, and only on the first one since it's creating the object for the first time.
There was a problem hiding this comment.
Discussed with @msau42 offline. The current implementation doesn't guarantee the node info is always up-to-date. In particular, with the current kubelet plugin registration design, there is no way to remove a non-existent driver unless it detects that a driver socket is removed (after unregistration is implemented). If a driver is removed while kubelet is down, the node information will remain. So whatever issue comes up when CSINodeInfo is kept in sync with annotation is unavoidable today.
But I'll remove the logic to keep the two sources in sync, because it complicates the implementation without adding too much benefit. One minor advantage of keeping them in sync is to avoid debugging confusion, but I think it'll be OK.
pkg/volume/csi/csi_plugin.go
Outdated
There was a problem hiding this comment.
Previously the lock was being held across the entire function instead of just the map update. Could there be any issues if this map has an entry but we haven't finished registering the plugin yet? Should we update the map at the very end?
There was a problem hiding this comment.
In both the previous and the current version, the lock is only held on write, all within this RegistrationCallback() function. The map has to be updated before the csi.NodeGetInfo() call unfortunately because it relies on the driver info to call out to gRPC.
However, when plugin unregistration is implemented in the future, depending on how it's implemented, there could be a race between calls of RegistrationCallback(). Will update to lock the entire function once.
There was a problem hiding this comment.
Discussed with @msau42 offline. For now we'll keep the map locked as short as possible. When unregistration is implemented we'll reconsider again.
Will add comments explaining why it's OK to keep locking short.
|
Missing node info manager tests for when the flag gate is disabled. I'm on it. |
50db873 to
db19252
Compare
…ing to CSINodeInfo; added unit tests for node updates; updated RBAC, NodeAuthorizer, NodeRestriction.
db19252 to
e966eb1
Compare
|
/lgtm |
There was a problem hiding this comment.
Looks like the unit test is failing because this feature flag is enabled by default now. Is this still the right flag that determines if the node needs these permissions, or is it fenced by the CSINodeInfo flag as well now?
If this is correct as is, the fixture for the unit test just needs regenerating (instructions are in the unit test failure message)
There was a problem hiding this comment.
That's one spot I missed. Thanks for the catch!
|
one comment on the gate being checked by the RBAC policy change and the unit test failure, authz and admission changes lgtm otherwise it's worth noting that the removal of driver info from the node depends on observing the removal of the driver (rather than reconciling to current observed state), and doesn't handle errors in removing driver info by retrying. That doesn't necessarily block this PR, but seems important to resolve before this graduates. |
|
To clarify, in this PR driver is only unregistered when there's an error adding the driver. Driver unregistration will be added in the next release. The rest of the CSI system should always assume node info is somewhat out of date, since kubelet can't provide the guarantee that updates are reflected immediately anyway, in the best case. Does the same logic apply to driver addition as well? |
e966eb1 to
a4f339a
Compare
|
Updated RBAC test |
|
Keeping a laundry list of considerations for CSI kubelet driver registration moving forward: |
|
/approve RBAC, node authz and admission changes lgtm |
|
/retest |
|
The GCE e2e failure is likely caused by this PR. Taking a look |
a4f339a to
94d649b
Compare
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: liggitt, msau42, saad-ali, verult The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@verult: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/retest Review the full test history for this PR. Silence the bot with an |
|
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. |
|
Per #68519 (comment): installing CRD outside k8s core brings back issue #2:
I side with @msau42 that if the |
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)format, will close the issue(s) when PR gets merged):Fixes #67683
Special notes for your reviewer:
Feature issue: kubernetes/enhancements#557
Design doc: kubernetes/community#2034
Missing pieces:
An RBAC rule is also added to support external-provisioner topology updates.
Release note: