Skip to content

Add Swarm Cluster Volume support#41982

Merged
thaJeztah merged 1 commit intomoby:masterfrom
dperny:feature-volumes
May 13, 2022
Merged

Add Swarm Cluster Volume support#41982
thaJeztah merged 1 commit intomoby:masterfrom
dperny:feature-volumes

Conversation

@dperny
Copy link
Copy Markdown
Contributor

@dperny dperny commented Feb 4, 2021

This is a preliminary, draft pull request for Swarm Cluster Volume Support. Though it is a long way off from being ready for merging, because of the size and scope of it, it is best to start review now.

This is more-or-less an implementation of the proposal in #39624.

Please give special attention to the API changes. I welcome any alternative suggestions for how to structure the API.

Closes #31923
Closes #39624

@thaJeztah
Copy link
Copy Markdown
Member

Related swarm kit branch (not fully up to date); https://github.com/docker/swarmkit/compare/feature-volumes

@olljanat
Copy link
Copy Markdown
Contributor

@dperny can you share instruction how to test this with some CSI plugin?

@chrisbecke
Copy link
Copy Markdown

no updates in 7mo. Ouch.

@dperny dperny force-pushed the feature-volumes branch 5 times, most recently from 0c67329 to 383cb01 Compare September 27, 2021 14:46
@dperny dperny changed the title WIP: Add Swarm Cluster Volume support Add Swarm Cluster Volume support Sep 28, 2021
@dperny
Copy link
Copy Markdown
Contributor Author

dperny commented Sep 28, 2021

I'm unclear why these integration tests are failing.

@dperny
Copy link
Copy Markdown
Contributor Author

dperny commented Sep 28, 2021

This PR is ready for review. There is one major outstanding issue:

Currently, there is no checking to ensure there is no duplication between local and cluster volume names. That is, if a local volume already exists with a given name, a cluster volume can be created with the same name, and visa versa.

This problem is not an easy fix. If we simply check to see if a volume already exists, we run the risk of a race. We cannot mitigate this with locking, because the nature of swarm and cluster volumes means we would have to acquire a lock on the cluster to avoid other nodes making API calls to create cluster volumes.

Otherwise, this should be the first final form of this PR, finally.

@mhemrg
Copy link
Copy Markdown

mhemrg commented Sep 29, 2021

To mount a cluster volume, do we need to specify the scope? Something like: name=the-vol, scope=cluster

@chrisbecke
Copy link
Copy Markdown

chrisbecke commented Oct 1, 2021

Now i'm going to have to navigate the little issue of, all the CSI drivers of interest to me come with helm charts, and or instructions for installation on K8s.
Can't say i'm looking forward to figuring that hurdle out.

@dperny
Copy link
Copy Markdown
Contributor Author

dperny commented Oct 14, 2021

@chrisbecke This is actually, funny enough, also my problem. Basically, I've had to extract binaries from the images used by Kube or build those binaries from scratch and then build a Swarm CSI compatible Docker Plugin configuration. It's completely possible and doesn't require any changes in the CSI drivers, but I do have to start over from just the binaries.

It doesn't help that many of these binaries lack documentation of what the actual binary does and what arguments it takes, as opposed to the args taken by the helm charts.

On the bright side, I already work scoped out for documenting this process for plugin developers and super users.

"gotest.tools/v3/assert"
)

// TODO(dperny): write tests
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!

@thaJeztah thaJeztah added this to the 21.xx milestone Mar 17, 2022
@dperny dperny force-pushed the feature-volumes branch from d05f84e to d4afb8d Compare May 11, 2022 13:56
Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

almost there (I think!) found some issues / suggestions; let me push one more PR against your branch

Comment on lines +27 to +34
type ClusterBackend interface {
GetVolume(nameOrID string) (volume.Volume, error)
GetVolumes(options volume.ListOptions) ([]*volume.Volume, error)
CreateVolume(volume volume.CreateOptions) (*volume.Volume, error)
RemoveVolume(nameOrID string, force bool) error
UpdateVolume(nameOrID string, version uint64, volume volume.UpdateOptions) error
IsManager() bool
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a follow-up, perhaps we should look if we can unify the interface with the (non-cluster) Backend above; there appears to be an overlap (but slightly different signatures); if we unify both, possibly ClusterBackend could be Backend plus <cluster-specific methods>, like;

type ClusterBackend interface {
    Backend
    Update(nameOrID string, version uint64, volume volume.UpdateOptions) error
    IsManager() bool    
}

Not a show-stopper though; the ClusterBackend is considered an internal interface (only to decouple internal code from the API), so we should be able to make (breaking) changes in future.

const (
// clusterVolumesVersion defines the API version that swarm cluster volume
// functionality was introduced. avoids the use of magic numbers.
clusterVolumesVersion = "1.42"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was eyeing this const for a bit (was wondering if just putting 1.42 inline everywhere would be equally descriptive without having to look up the value of the const, but it's used in quite some places, so I guess it makes sense to have a const for this.

Comment on lines +403 to +405
// CapacityBytes is the capacity of the volume in bytes. A value of 0
// indicates that the capacity is unknown.
CapacityBytes int `json:",omitempty"`
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at this, and initially wondering

  • "should this be an uint / uint64 if we don't use negative values
  • would be consistent with CapacityRange (which uses uint64)
  • ^^ could this be an issue? (are CapacityRange and CapacityBytes compared with each-other?)

it looks like the SwarmKit API type does use an int64, so changing this type to the same would prevent having to do some type-juggling;

// CapacityBytes is the capacity of this volume in bytes. A value of 0
// indicates that the capcity is unknown.
CapacityBytes int64 `protobuf:"varint,1,opt,name=capacity_bytes,json=capacityBytes,proto3" json:"capacity_bytes,omitempty"`

Looking slightly further, it looks like the SwarmKit API does do some conversion to an uint64;

func (m *VolumeInfo) Size() (n int) {
if m == nil {
return 0
}
var l int
_ = l
if m.CapacityBytes != 0 {
n += 1 + sovTypes(uint64(m.CapacityBytes))
}

And looking even further, it seems that for CapacityRange we do the exact oposite; the Swarm API uses int64 but the moby type here uses uint64;

type CapacityRange struct {
// RequiredBytes specifies that a volume must be at least this big. The value
// of 0 indicates an unspecified minimum. Must not be negative.
RequiredBytes int64 `protobuf:"varint,1,opt,name=required_bytes,json=requiredBytes,proto3" json:"required_bytes,omitempty"`
// LimitBytes specifies that a volume must not be bigger than this. The value
// of 0 indicates an unspecified maximum. Must not be negative.
LimitBytes int64 `protobuf:"varint,2,opt,name=limit_bytes,json=limitBytes,proto3" json:"limit_bytes,omitempty"`
}

Perhaps we should align them, so that we don't have to do the type-juggling on conversion 🤔

@thaJeztah
Copy link
Copy Markdown
Member

Opened dperny#2

@dperny dperny force-pushed the feature-volumes branch from eb24b80 to a6e8e9e Compare May 12, 2022 14:06
@thaJeztah
Copy link
Copy Markdown
Member

As to #41982 (comment) above (uint64 vs int64), I had a chat with @dperny about that, and it came down to;

  • the CSI spec uses an int64 but does not allow negative values
  • the SwarmKit API followed the int64 to match that
  • but the Docker API used an uint64 so that negative values would produce an error

The error that would be returned wouldn't be very "friendly" though;

json: cannot unmarshal number -1 into Go struct field CapacityRange.RequiredBytes of type uint64

I see @dperny merged my PR dperny#2, so now the Docker API uses int64. Some things should be looked into (can be a follow-up);

As we no longer can depend on json.Unmarshal() to produce an error, we should validate ourself (check if negative values were provided). Alternatively, we could change both the SwarmKit API and the Docker API to use uint64

I'm curious though;

  • what values the CSI spec allows (as using a uint64 means the full range of CapacityRange.RequiredBytes and CapacityRange.LimitBytes cannot be used (only "half")
  • Does the CSI spec (or SwarmKit code) at any point deal with both Info.CapacityBytes and CapacityRange ? (and what type does CSI use for this? i.e., could there be an integer overflow somewhere?

@cpuguy83
Copy link
Copy Markdown
Member

Maybe someone can put together a plugin (even if it's just for demo purposes) and push to dockerhub so everyone can give it a try?

Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Adds code to support Cluster Volumes in Swarm using CSI drivers.

Signed-off-by: Drew Erny <derny@mirantis.com>
@thaJeztah
Copy link
Copy Markdown
Member

CI is green, but there's a merge conflict now in version-history.md, due to #39812 being merged (saw that one coming); let me do a quick rebase

@thaJeztah thaJeztah merged commit d35731f into moby:master May 13, 2022
@prologic
Copy link
Copy Markdown
Contributor

Very nice! 🥳

#### Force-Removing Volumes

There are cases where a Volume can get caught in a state where Swarm cannot
verify their removal. In these cases,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saw that this sentence / paragraph is incomplete.

@chrisbecke
Copy link
Copy Markdown

Maybe someone can put together a plugin (even if it's just for demo purposes) and push to dockerhub so everyone can give it a try?

At the risk of being completely off-topic - a functional plugin that worked would be amazing. There is little as perplexing to new adopters of clustered containerization as the lack of persistent volume support and having a default plugin that provided clustered persistent volumes working in an expected way would alleviate a lot of the confusion. And, perhaps ensure that in 3 years time, Mirantis extend their support for our favorite orchestrator yet further...

@foureight84
Copy link
Copy Markdown

Maybe someone can put together a plugin (even if it's just for demo purposes) and push to dockerhub so everyone can give it a try?

At the risk of being completely off-topic - a functional plugin that worked would be amazing. There is little as perplexing to new adopters of clustered containerization as the lack of persistent volume support and having a default plugin that provided clustered persistent volumes working in an expected way would alleviate a lot of the confusion. And, perhaps ensure that in 3 years time, Mirantis extend their support for our favorite orchestrator yet further...

I've been trying to repackage the seaweedfs csi plugin for k8s to try out but I've been running into issues getting the plugin to connect to the running seaweedfs stack. An example would be great.

@s4ke
Copy link
Copy Markdown
Contributor

s4ke commented Jun 16, 2022

Weighing in here as well. I am looking into repackaging https://github.com/hetznercloud/csi-driver for Docker, but right now I am lacking a place to start - Mostly a quick pointer as to how to load a plugin especially since 22.06 is in beta right now? I think that would already go a long way.

@prologic
Copy link
Copy Markdown
Contributor

Is anyone able to provide documentation for the community on repackaging CSI drivers for Docker / Docker Swarm? 🤔

@olljanat
Copy link
Copy Markdown
Contributor

@s4ke afaiu PR which you want follow is docker/cli#3662 Hopefully it gets included to next beta version.

@tianon
Copy link
Copy Markdown
Member

tianon commented Jun 29, 2022

See #43754 for more details about creating a plugin! 🎉
(Thanks @dperny ❤️)

Review/testing over there would be helpful 👀

@olljanat
Copy link
Copy Markdown
Contributor

olljanat commented Feb 5, 2023

FYI. I have build some example scripts and guidance to https://github.com/olljanat/csi-plugins-for-docker-swarm about how to make existing CSI plugins working with Swarm which those who following this discussion might find useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Draft Proposal] Swarm Cluster Volume Support with CSI Plugins Proposal: Container Storage Interface (CSI) proposed standard