Fixing a couple of network plugin life-cycle mgmt issues#29556
Fixing a couple of network plugin life-cycle mgmt issues#29556vieux merged 3 commits intomoby:masterfrom
Conversation
|
Please include 151d1cf in this PR, since it is refcount related. If you are not inclined to, I can make a separate pr for that. |
|
@anusha-ragunathan sure. picked up your change as well. With this we will also have |
daemon/network.go
Outdated
There was a problem hiding this comment.
I'd suggest making this logrus.WithError(err).WithFields(logrus.Fields{"mode": mode, "driver", driver}).Error("Error handling plugin refcount operation")
That way the error will not be hidden, and the mode and driver information will be captured in a more structured way.
daemon/network.go
Outdated
There was a problem hiding this comment.
Outside the scope of this PR: all caps declarations are unusual in Go and clash with the typical style. I opened #29564 to change this. This PR can be merged first and then I'll rebase afterwards.
|
@anusha-ragunathan Inclusion of your patch causes |
|
@mavenugo : I will open a new PR for moving refcount from remove to disable. |
|
DockerSwarmSuite.TestAPIServiceUpdatePort panic |
|
@anusha-ragunathan the test is timing out waiting for |
|
Daemon fails to start. |
|
The panic is the same as #27287 (comment). |
|
Confirmed that GetAll is blocking. |
|
@mavenugo : Rootcaused the issue. I've fixed it by cleaning up the Scan directories before calling the test. This is a pointed fix. The long term fix would be to add this in the TearDownTest and TearDownSuite sections of both DockerSuite and DockerSwarmSuite. Fix available at |
|
@anusha-ragunathan I agree. We need a better fix in the test infra to cleanup stale entries. I picked up your commit and started the build again. Lets see if this fixes the test issue. |
|
@anusha-ragunathan same failures now seen on other plugin tests. I guess we have to fix this in a more generic fashion rather than a point fix at this point :( |
|
@anusha-ragunathan I added a temporary patch : 4b710fe to skip plugin-v1 from being scanned. There is only 1 place the GetAllByCap is used today and hence this is the least impactful change for now. Once the tests are done, we can decide on what can be done... |
|
Just as a data-point, skipping plugin-v1 scan using 4b710fe solves the problem. It may not be an ideal fix. But atleast we know that the test-case is failing because of scanning (and failing due to the absence of) v1-plugin |
|
@anusha-ragunathan another attempt and this time cleaning up a v1-plugin immediately after use in one of the swarm network tests (1024517). If this works, then we are good. |
|
The approach of getting v2 plugins only in this case, sounds okay to me. Design LGTM. |
|
@anusha-ragunathan updated with the new vendoring changes. fingers crossed on IT :) |
|
The CI failures in |
|
@anusha-ragunathan @tonistiigi @tiborvass could you PTAL ? the CI issues are unrelated. |
|
SGTM @mavenugo Needs rebase. Use |
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Signed-off-by: Madhu Venugopal <madhu@docker.com>
|
@tonistiigi done. Its green now ! |
|
LGTM @mavenugo would you mind opening this PR against 1.13 branch as well ? Not sure about the vendoring part. |
|
@tiborvass yes. I will once this is merged in the master. |
|
LGTM |
|
@mavenugo done |
|
@vieux @tiborvass cherry-picked : #29858 |
Fixes #29447 : Reference count is managed for network plugins using ACQUIRE and RELEASE mechanism in createNetwork and deleteNetwork.
Fixes #29471 : libnetwork remote driver init is modified to lookup active plugins and registered with drvRegistry
Fixes #29473 : Since plugin rm is not fully propagated to drvRegistry, we allowed overriding the drvRegistry map to cleanup stale plugin socket