Proposal: support Domain as a first-class field #7

@jdef

Volume locality is important for both COs and SPs:

  • COs may strategically place volumes to optimize workloads for throughput, availability, etc.
  • SPs may strategically place volumes to optimize (especially when guidance is not provided by a CO).
    • The fault domain(s) of the backend storage may not align with the fault domain(s) of the node(s) to which volumes are attached. Furthermore, placing 100% of the volume placement burden on the CO may be problematic: the domain topology of the backend storage may be complex enough to significantly complicate placement logic in the CO, which lowers the chances that all COs will properly implement such placement strategies.
    • There is very likely a need to separate the concept of origin domain (where the volume lives within backend storage) from target domain (where the volume is attached on the cluster).
    • As per #44 (CreateVolumeRequest - support for affinity/anti-affinity with other volumes), CSI should consider APIs that allow COs to express, at a high level, volume placement concerns with respect to the origin domain, perhaps via affinity/anti-affinity volume constraints. It's likely that such APIs will not be supported by all plugins; plugins should probably opt in to the API via capabilities.
  • A pre-created volume's location may determine its viability for specific workloads.
  • ... and probably other reasons.
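
To make the affinity/anti-affinity idea above more concrete, here is a minimal Python sketch of how an SP might choose an origin domain from high-level CO hints. Everything in it is hypothetical: `VolumeConstraint`, `pick_origin_domain`, and the `rack-*` domain names are illustrative only and are not part of CSI or of #44.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class VolumeConstraint:
    """Hypothetical CO-supplied hint; not part of any CSI message."""
    volume_id: str
    anti_affinity: bool  # True: avoid that volume's origin domain; False: share it


def pick_origin_domain(
    candidates: List[str],
    placements: Dict[str, str],
    constraints: List[VolumeConstraint],
) -> Optional[str]:
    """Return the first candidate origin domain that satisfies every constraint.

    `placements` maps existing volume IDs to their origin domains and is
    assumed to be knowledge held by the SP, not by the CO.
    """
    required = set()
    forbidden = set()
    for c in constraints:
        domain = placements.get(c.volume_id)
        if domain is None:
            continue  # unknown volume: nothing to honor in this sketch
        (forbidden if c.anti_affinity else required).add(domain)
    if len(required) > 1:
        return None  # affinity with volumes in different domains cannot be met
    for d in candidates:
        if d in forbidden:
            continue
        if required and d not in required:
            continue
        return d
    return None


placements = {"vol-a": "rack-1", "vol-b": "rack-2"}
racks = ["rack-1", "rack-2", "rack-3"]
spread = pick_origin_domain(
    racks, placements,
    [VolumeConstraint("vol-a", True), VolumeConstraint("vol-b", True)],
)
colocate = pick_origin_domain(racks, placements, [VolumeConstraint("vol-b", False)])
print(spread, colocate)  # rack-3 rack-2
```

The point of the sketch is that the CO only names other volumes and a preference, while the mapping from volumes to origin domains stays inside the SP.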

One way to represent volume locality is by "fault domain". Below are some options for representing this concept in CSI:

// Option 1:
message Domain {
  message Zone {
    string value = 1;
  }
  message Region {
    string value = 1;
  }
  // TODO: add NODE_ID
  Zone zone = 1; // OPTIONAL
  Region region = 2; // OPTIONAL
}

// Option 2: Enforced domain hierarchy
message Domain {
  message Region {
    string value = 1; // REQUIRED
  }
  message Zone {
    string value = 1; // REQUIRED
    Region region = 2;
  }
  message Node {
    NodeID node_id = 1; // REQUIRED
    oneof parent {
      Zone zone = 2;
      Region region = 3;
    }
  }
  oneof value {
    Region region = 1;
    Zone zone = 2;
    Node node = 3;
  }
}

// Option 3: Support variable topologies.
// - Can topologies be described in capabilities?
// - How do COs understand/reason about variable topologies?
// - How might a CO cope with structural topology changes over time?
message Domain {
  string unit = 1;   // REQUIRED, for example "zone"
  string value = 2;
  Domain parent = 3;
}
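
As a rough illustration of how a CO might reason about the variable topology in Option 3, the following Python sketch models the recursive `Domain` message and answers a containment query by walking the parent chain. The `chain` and `is_within` helpers and the example unit and value names are assumptions made for this sketch; nothing here is a proposed API.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(frozen=True)
class Domain:
    """Mirrors the Option 3 message: a unit, a value, and an optional parent."""
    unit: str
    value: str = ""
    parent: Optional["Domain"] = None

    def chain(self) -> Iterator["Domain"]:
        """Yield this domain, then each ancestor up to the root."""
        node = self
        while node is not None:
            yield node
            node = node.parent

    def is_within(self, other: "Domain") -> bool:
        """True if `other` is this domain or one of its ancestors."""
        return any(d == other for d in self.chain())


# Hypothetical topology: region > zone > node.
region = Domain("region", "us-east")
zone_a = Domain("zone", "us-east-1a", region)
zone_b = Domain("zone", "us-east-1b", region)
node = Domain("node", "node-42", zone_a)

units = [d.unit for d in node.chain()]
print(units)                   # ['node', 'zone', 'region']
print(node.is_within(region))  # True
print(node.is_within(zone_b))  # False
```

Because the units are plain strings, the CO learns the shape of the hierarchy only by walking it, which is exactly what the open questions in the Option 3 comments are about.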

Some additional questions to consider:

  • Should the Node service report the Domain of the node, if any?
  • What is the relationship between capacity and domain?
