Tests: Ginkgo test framework #1733
Conversation
should omit type map[string]string from declaration of var DefaultSettings; it will be inferred from the right-hand side
comment on exported method CmdRes.KVOutput should be of the form "KVOutput ..."
exported method CmdRes.FindResults should have comment or be unexported
comment on exported method CmdRes.Correct should be of the form "Correct ..."
exported type CmdRes should have comment or be unexported
comment on exported function GetScope should be of the form "GetScope ..."
comment on exported function FIt should be of the form "FIt ..."
comment on exported function It should be of the form "It ..."
exported type AfterAll should have comment or be unexported
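For reference, a minimal sketch of the shape golint is asking for in the messages above; the comment texts, map contents, and method body are illustrative guesses, not the PR's actual implementation:

```go
package helpers

import "strings"

// DefaultSettings holds default settings used by the tests. The map type is
// omitted on the left-hand side; it is inferred from the right-hand side.
var DefaultSettings = map[string]string{
	"PolicyEnforcement": "default", // hypothetical entry
}

// CmdRes contains the result of an executed command.
type CmdRes struct {
	stdout string
}

// KVOutput returns the command's stdout parsed as key=value pairs.
// golint only requires the comment to start with "KVOutput".
func (res *CmdRes) KVOutput() map[string]string {
	result := map[string]string{}
	for _, line := range strings.Split(res.stdout, "\n") {
		if kv := strings.SplitN(line, "=", 2); len(kv) == 2 {
			result[kv[0]] = kv[1]
		}
	}
	return result
}
```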
Force-pushed from e809277 to fdbe0c2
I cannot tell you how excited I am for this! My review is WIP; given the size of this, it will take a while.

Hello! A bunch of tests are duplicated; I made a list on issue #1589 where I map all of them. If that isn't clear to you, I can map them 1:1 with the bash tests. Regards
Force-pushed from 73f7778 to 994e204
Can this package be given a more descriptive name, like k8sTests?
Would be nice to have this named "RuntimeTests"
Nitpick: "It Default values" doesn't really give much information. What about It("should disable policy enforcement with default values")? It can be done later, as I understand this is a direct conversion from the bash scripts.
joestringer left a comment
I noted a few minor issues here from an initial scan through.
ciliumPod, err := kubectl.GetCiliumPodOnNode("kube-system", "k8s1")
Expect(err).Should(BeNil())

//Check that cilium detects a
Incomplete comment? Also below.
if [[ "$(hostname)" == "k8s1" ]]; then
    make docker-image-dev
    docker tag cilium 192.168.36.11:5000/cilium/cilium-dev
    docker push 192.168.36.11:5000/cilium/cilium-dev
Can we refactor these hard-coded IPs into a single place?
There are a few more of these; I won't highlight every one.
The problem is that these are used in k8s manifests as docker registry addresses. Could we put this IP into the VM's /etc/hosts and use the hostname in the k8s manifests?
Expect(res.Correct()).Should(BeTrue())
}, 300)

It("Test containers connectivity WITH policy", func() {
Perhaps we could share some of the common test code between this and "Test containers connectivity without policy"?
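One possible shape for that sharing, sketched inside the existing Describe block; ContainerExec, the import step, and the manifest name are assumptions, not the PR's actual API:

```go
// runConnectivityTest factors out the steps shared by the WITH/without
// policy variants; only the optional policy import differs.
runConnectivityTest := func(policyPath string) {
	if policyPath != "" {
		res := cilium.Exec("policy import " + policyPath) // hypothetical import step
		Expect(res.Correct()).Should(BeTrue())
	}
	res := docker.ContainerExec("client", "ping -c 5 server") // hypothetical helper
	Expect(res.Correct()).Should(BeTrue())
}

It("Test containers connectivity without policy", func() {
	runConnectivityTest("")
}, 300)

It("Test containers connectivity WITH policy", func() {
	runConnectivityTest(cilium.GetFullPath("Policies-l3-basic.json")) // hypothetical manifest
}, 300)
```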
@@ -1,3 +1,3 @@
PATH=/usr/lib/llvm-3.8/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
-CILIUM_OPTS=--kvstore consul --kvstore-opt consul.address=127.0.0.1:8500
+CILIUM_OPTS=--kvstore consul --kvstore-opt consul.address=127.0.0.1:8500 --debug
Will this affect all deployments, or just in the testing? Not sure if this is unrelated change that should be kept out.
My fault! I'll remove it ;-)
nebril left a comment
Overall looks like a great job! Left some questions and suggestions inline.
| log "github.com/sirupsen/logrus" | ||
| ) | ||
|
|
||
| var _ = Describe("RunConnectivyTest", func() { |
sh './tests/k8s/start.sh'
}
"Runtime":{
    sh 'cd ${TESTDIR}; ginkgo --focus="Run*" -v -noColor'
Nit: can we change runtime test name convention from Run<test name> to Runtime<test name>?
Would it be possible to get coverage results of the Cilium commands using ginkgo -cover? Or is that not possible, since we are running cilium commands via its CLI and are not actually testing Cilium code directly with Ginkgo?
)

var _ = Describe("K8s", func() {
	// Describe("Categorizing book length", func() {
Do we need to keep this example? There are plenty of tests to check for examples, I would like this file gone. If we really want it to stay, I think it should be uncommented.
@@ -0,0 +1,116 @@
# Cilium Test Suite
Shouldn't this be a part of Documentation/contributing.rst?
killCmd := "sudo kill -9 $(pgrep cilium-agent)"
go docker.Node.Exec(cmd)
timeout := time.After(300 * time.Second)
for {
Both select cases return from the func, so the for loop is not needed here.
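That is, since every case returns, the select can stand on its own; a small self-contained sketch (the function and channel names are illustrative):

```go
package helpers

import (
	"fmt"
	"time"
)

// waitForRestart shows the simplification: both select cases return, so
// wrapping the select in "for { ... }" could never loop a second time.
func waitForRestart(done <-chan string, timeout <-chan time.Time) error {
	select {
	case <-done:
		return nil
	case <-timeout:
		return fmt.Errorf("timed out waiting for cilium-agent")
	}
}
```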
}

done := make(chan string, 1)
agent := func(option string) {
Please make agent accept the done channel as an argument so that each test is separated. Otherwise the ending of one test may cause it to kill the node from another test.
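A sketch of the suggested signature change, so each test owns its channel instead of sharing one through the closure (names are illustrative):

```go
// The channel becomes an explicit argument rather than shared state, so a
// late signal from one test cannot be consumed by another test's wait.
agent := func(done chan<- string, option string) {
	// ... restart cilium-agent with the given option ...
	done <- option // signal completion on this test's own channel
}

done := make(chan string, 1)
go agent(done, "--debug")
```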
DisableTimestamp: true,
})

// var filename string = "test.log"
Please remove commented out code.
var vagrant helpers.Vagrant

func init() {
	// log.SetOutput(os.Stdout)
Please remove commented out code.
@@ -0,0 +1,7 @@
package ciliumTest
What is the purpose of this package?
ianvernon left a comment
These are some initial comments from what I have reviewed thus far. Given the magnitude of this patch, I didn't want to wait until I was completely done with my review to provide initial comments, so that work on my suggestions can proceed in parallel with my review of what I haven't looked at yet.
Add a unit tests stage as we discussed. We will need to install Docker on the Jenkins hosts to run the unit tests in a Docker container.
We should still be gathering logs from our tests as was done before on lines 34-37, as we need as many datapoints to debug failures as possible.
We need more differentiation in the build number per VM. For example, if we have two builds running with the same value for BUILD_NUMBER on the Jenkins slave, there will be interactions that we are unsure of. Is there a limitation in the existing VM launched with cilium/Vagrantfile that made you need to use this VM instead? It's ideal to keep the number of VMs we have to manage to a minimum.
cc @aanm regarding this new Vagrantfile. What are your thoughts?
Hey,
Thanks for the review, a lot of work to do, thanks! :-)
Not sure if I understand this; the network part is different for each K8S_VERSION:
server.vm.network "private_network",
ip: "192.168.36.1#{i}",
virtualbox__intnet: "cilium-k8s#{$build_number}-#{$K8S_VERSION}"
see cilium/Vagrantfile:L72-82:
# Create unique ID for use in vboxnet name so Jenkins pipeline can have concurrent builds.
$job_name = ENV['JOB_BASE_NAME'] || "local"
$build_number = ENV['BUILD_NUMBER'] || "0"
$build_id = "#{$job_name}-#{$build_number}"
# Only create the build_id_name for Jenkins environment so that
# we can run VMs locally without having the `build_id` in the name.
if ENV['BUILD_NUMBER'] then
$build_id_name = "-build-#{$build_id}"
end
If there are two jobs running concurrently on the same Jenkins slave for PR 1 and PR 2, each with BUILD_NUMBER=1, then both will be part of the same virtualbox__intnet, which we don't want. We need the JOB_BASE_NAME to differentiate between Jenkins jobs that might be using the same BUILD_NUMBER.
Indentation is inconsistent here.
| return fmt.Sprintf("%s/runtime/manifests/", basePath) | ||
| } | ||
|
|
||
| //GetFullPath return the valid path for a file |
| return fmt.Sprintf("%s%s", c.ManifestsPath(), name) | ||
| } | ||
|
|
||
| //PolicyEndpointsSummary Return a map of the status of the policies |
returns the count of whether policy enforcement is enabled, disabled, and the total number of endpoints, and an error if the Cilium endpoint metadata cannot be retrieved via the API.
return result, nil
}

//PolicyEnforcementSet set policyEnforcement in endpoint
sets the PolicyEnforcement configuration value for the Cilium agent to the provided status.
return res
}

//PolicyDel delete a given policy
deletes a policy with the given ID
return rev.IntOutput()
}

//PolicyImport Import a new policy in cilium and wait until all endpoints
imports a new policy into Cilium and waits until the policy revision number increments.
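Folding that wording in, a hedged sketch of the import-and-wait pattern; PolicyGetRevision and the surrounding method set are assumptions, not the PR's exact API:

```go
// PolicyImport imports a new policy into Cilium and waits until the policy
// revision number increments, i.e. endpoints have picked up the new policy.
func (c *Cilium) PolicyImport(path string, timeout time.Duration) error {
	before, err := c.PolicyGetRevision() // hypothetical helper
	if err != nil {
		return err
	}
	if res := c.Exec("policy import " + path); !res.Correct() {
		return fmt.Errorf("cannot import policy %s", path)
	}
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if rev, err := c.PolicyGetRevision(); err == nil && rev > before {
			return nil
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("policy revision did not increment within %s", timeout)
}
```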
func addTestType(text string, body func(), testType string, control *AfterAll, timeout ...float64) bool {
	var ginkgoFunc func(text string, body interface{}, timeout ...float64) bool

	//FIXME: XIT functions
What does this mean? If this is something that can be fixed after this initial patch is merged as an enhancement, please create a follow-up GitHub issue, refer to this PR / its corresponding issue, and add GH- in this FIXME comment so we can be sure that we have an issue tracking this FIXME.
var data []models.Endpoint
err := c.Exec(fmt.Sprintf("endpoint get %s", id)).UnMarshal(&data)
if err != nil {
	c.logCxt.Infof("EndpointsGet fail %d: %s", id, err)
Make sure to update this log message when you rename the function (EndpointsGet --> EndpointGet).
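Incidentally, the format verbs disagree in the quoted code: id is formatted with %s in the Exec call but %d in the log line. Assuming id is a string, the renamed log line might look like:

```go
// Renamed per the review; %s matches the string id used in the Exec call.
c.logCxt.Infof("EndpointGet fail %s: %s", id, err)
```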
| "github.com/onsi/ginkgo" | ||
| ) | ||
|
|
||
| //AfterAll struct to run after all process finish |
I would change this to "AfterAll is a structure whose Body is called after all processes finish".
What is the scope of these processes? I.e., are they called after the suite is finished, after a test is finished, etc.? Please clarify in the description / comment.
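Concretely, the suggested doc comment might read as below; the struct body is a guess at the minimal shape:

```go
// AfterAll is a structure whose Body is called after all tests in the
// enclosing scope finish; the exact scope is what the comment above asks
// to have spelled out.
type AfterAll struct {
	Body func()
}
```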
@@ -0,0 +1,69 @@
package k8sT

@@ -0,0 +1,119 @@
package k8sT
@@ -0,0 +1,17 @@
package k8sT
Add a copyright header if this file is going to be kept. I think this can be deleted.
docker tag cilium 192.168.36.11:5000/cilium/cilium-dev
docker push 192.168.36.11:5000/cilium/cilium-dev
else
    echo "No master, no need to compile"
This log is not clear; change it to "not on master K8s node; no need to compile Cilium".
ianvernon left a comment
More comments. I still need to review the tests' content to ensure that they cover what currently exists in the bash scripts.
key: node-role.kubernetes.io/master
- effect: NoSchedule
  key: node.cloudprovider.kubernetes.io/uninitialized
  value: "true"
Having this many DaemonSet files to maintain will be very difficult for us, as we have to update examples/minikube, examples/kubernetes, etc. Is there a way we can just use the files located in examples/ directory instead of duplicating them here?
Expect(endPoints["disabled"]).To(Equal(1))
})

By("Apply a new policy")
specify the path of the policy
@@ -0,0 +1,652 @@
package RunT

var _ = Describe("RunPolicies", func() {

	var initilized bool
var docker *helpers.Docker
var cilium *helpers.Cilium

initilize := func() {
}, 500)

It("Service L4 tests", func() {
	// createInterface(docker.Node)
Remove this if it is not being used.
}
path := "/vagrant/create_veth_interface"
node.Exec("sudo ip addr add fd02:1:1:1:1:1:1:1 dev cilium_host")
//FIXME: Here we need to check if executes correctly
Error needs to be handled before this patch is merged.
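A hedged sketch of resolving that FIXME, assuming Exec returns a CmdRes exposing Correct() and an Output()-style accessor:

```go
res := node.Exec("sudo ip addr add fd02:1:1:1:1:1:1:1 dev cilium_host")
if !res.Correct() {
	// Fail fast instead of continuing with a half-configured interface.
	return fmt.Errorf("cannot add IPv6 address to cilium_host: %s", res.Output())
}
```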
@@ -0,0 +1,43 @@
[
Can you rename these policy files to correspond to the tests in which they are used?
e.g., Policies-l7-simple.json
@@ -0,0 +1,7 @@
package ciliumTest

@@ -0,0 +1,90 @@
package ciliumTest
ianvernon left a comment
Added a few more comments. We need to ensure that the coverage is equivalent to what we have in cilium/tests/k8s currently (minus the stress tests, as I think we can convert those in a later PR / in the nightly builds we proposed last week).
I think we still need to port tests/k8s/tests/01-guestbook.sh and tests/k8s/tests/02-cnp-specs.sh as well. @aanm please confirm.
It("PolicyEnforcement Changes", func() {
	//This is a small test that check that everything is working in k8s. Full monkey testing
	// is on runtime/Policies
res, err := kubectl.GetPodsNames("default", fmt.Sprintf("id=%s", v))
Expect(err).Should(BeNil())
appPods[v] = res[0]
logger.Infof("PolicyRulesTest: pod='%s' assigned to '%s'", res[0], v)
}, 300)

//FIXME: Check service with IPV6
//FIXME: Check the service with cross-node
I think this is a blocker for merging and that it is critical that we have cross-service tests as is done in tests/k8s/tests/02-cnp-specs.sh. cc: @aanm do you agree?
var kubectl *helpers.Kubectl
var demoDSPath string
var logger *log.Entry
var initilized bool

var logger *log.Entry
var initilized bool

initilize := func() {
"github.com/cilium/cilium/test/helpers"
. "github.com/onsi/ginkgo"
. "github.com/onsi/gomega"

"fmt"

"github.com/cilium/cilium/test/helpers"
. "github.com/onsi/ginkgo"

"github.com/cilium/cilium/test/helpers"
. "github.com/onsi/ginkgo"
. "github.com/onsi/gomega"

"context"

"github.com/cilium/cilium/test/helpers"
. "github.com/onsi/ginkgo"

import (
	"github.com/cilium/cilium/test/helpers"
	. "github.com/onsi/ginkgo"
Force-pushed from de60e7e to 684fb27
@eloycoto can you please rebase against master and trigger the build again? Not sure what the cause of the failure was here. It looks unrelated to your changes.
exported var EndpointWaitUntilReadyRetry should have comment or be unexported
should drop = 0 from declaration of var EndpointWaitUntilReadyRetry; it is the zero value
exported const MaxRetries should have comment (or a comment on this block) or be unexported
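As with the earlier lint block, a minimal sketch of what golint wants here; the comment texts and the constant's value are illustrative:

```go
// MaxRetries is the maximum number of times a readiness check is retried.
const MaxRetries = 5

// EndpointWaitUntilReadyRetry counts readiness retries; the "= 0"
// initializer is dropped because that is already the zero value.
var EndpointWaitUntilReadyRetry int
```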
Added the migration of the current test platform to the new one using Ginkgo. These changes add the test/ folder where all test suites are defined and described. On the other hand, created a new basebox that has all cilium dependencies in place, so provisioning doesn't need to install all the deps each time we want to run the tests. Updated dependencies and added Ginkgo, gomega and ssh_config. Signed-off-by: Eloy Coto <eloy.coto@gmail.com>
Hello!
This PR is still WIP; at the moment all the basic tests should be in there, but I'd like to add the following tests before getting it merged:
On the other hand, I didn't see any test related to the option -lb, and maybe a test for that needs to be added. Related other work to do:
Please, can you have a look and leave your notes/suggestions in this PR?
Regards