identity: Fix nil pointer panic in LookupIdentityByID#13514
Conversation
|
test-me-please |
Fair point. My thinking was that I would restore the behavior present in 1.7/1.8. But assuming there are no issues with deadlocks, I guess we could rely on Edit: Overall, I don't think waiting on |
@gandro the race detector will complain that we are accessing the variable concurrently 😄 |
3cb7f88 to
4566e8f
Compare
|
@aanm PTAL again. After looking a bit more closely at initialization, I decided to do a non-blocking check on the channel. This ensures that we have a proper synchronization point to avoid data races. In the process, I found some additional potential data races and nil pointer panics that I fixed. I decided to fail the lookup instead of waiting for the caches to be initialized. If we wait for full initialization including synchronization, then both the DNS Proxy (call site) and Hubble (call site) would not be operational until we have synchronized the identity cache. I don't think we want that. If callers need to perform a lookup on a synchronized cache, they should call |
|
test-me-please |
|
GKE hit a flake This does not seem related |
|
test-gke |
|
Hit #13554 twice |
Because the identity allocator is initialized asychronously via `InitIdentityAllocator`, the local identitiy allocator might not have been initialized yet when the lookup functions are called. This can cause nil pointer panics, as observed in #13479. Before b194612, this nil pointer panic could not occur in `LookupIdentityByID` as the function checked for `m.IdentityAllocator != nil` which also implies `m.localIdentities != nil`. This commit adds an explict check for `m.localIdentities` and fixes a potential data race by checking the initialization channels before accessing `m.localIdentities` or `m.IdentityAllocator`. Fixes: #13479 Fixes: b194612 ("identity: Avoid kvstore lookup for local identities") Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
The local and global allocator are initialized asynchronously in
`InitIdentityAllocator`. Therefore we must check for the initialization
channels a potential synchronization point before we can check for
`m.{localIdentities,IdentityAllocator} != nil`.
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
4566e8f to
87104cc
Compare
|
Rebased to pick up the fix the the GKE flake this observed. Rerunning CI |
|
test-me-please |
|
retest-net-next |
|
Potentially new net-next flake? https://jenkins.cilium.io/job/Cilium-PR-K8s-1.12-net-next/872/testReport/junit/Suite-k8s-1/12/K8sServicesTest_Checks_service_across_nodes_Tests_NodePort_BPF_Tests_with_direct_routing_Tests_NodePort_with_Maglev_Tests_NodePort/ Edit: Created issue after discussing on Slack #13588 |
|
retest-net-next |
Because the identity allocator is initialized asynchronously via
InitIdentityAllocator, its internal local identitiy allocator mightnot have been initialized yet when the lookup functions are called.
This can cause nil pointer panics, as observed in #13479.
Before b194612, this nil pointer panic could not occur in
LookupIdentityByIDas the function checked form.IdentityAllocator != nilwhich also implies
m.localIdentities != nil.This commit adds an explict check for
m.localIdentities.Fixes: #13479
Fixes: b194612 ("identity: Avoid kvstore lookup for local identities")
Signed-off-by: Sebastian Wicki sebastian@isovalent.com
Note: While the backported version of b194612 in v1.7 and v1.8 does not suffer from the nil pointer panic due to an early check fo
m.IdentityAllocator != nil, I think we should still back-port this to fix the potential nil-pointer panic inLookupIdentitywhich is also fixed by this commit.