Refactor e2e test infrastructure — introduce domain packages, shared clients, and new TestContext #513

@oleg-kushniriov

Description

Restructure the e2e test framework from a flat layout into domain-oriented packages
with shared Kubernetes clients and a simplified test setup API.

The domain-oriented structure groups code by what it operates on rather than
by when it was written. Kubernetes primitives (pods, nodes, resources) live
in k8s/, Grove-specific concepts (workloads, topology, pod groups) live
in grove/, and failure diagnostics live in diagnostics/. This makes it
obvious where new code belongs, keeps each package small and focused, and
allows tests to import only the domains they need. Each manager wraps the
shared *clients.Clients, so tests never handle raw client fields; they call
domain operations (ScalePCS, VerifyTopology, WaitForPods) instead of
assembling clientset calls inline.
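
As a rough illustration of the manager pattern described above, the sketch below uses stand-in types instead of real client-go clients so that it runs on its own. Only the names Clients, PodManager, CountPodsByPhase, and CountReady come from this issue. The Pod struct, the PodLister interface, the shape of Clients, the ReadyInNamespace method, and all signatures are assumptions made for illustration.

```go
package main

import "fmt"

// Pod is a stand-in for a Kubernetes pod; real code would likely use corev1.Pod.
type Pod struct {
	Name  string
	Phase string // e.g. "Pending", "Running"
	Ready bool
}

// PodLister is a hypothetical abstraction over the clientset call that lists pods.
type PodLister interface {
	ListPods(namespace string) ([]Pod, error)
}

// Clients is a stand-in for the shared client bundle (assumed shape).
type Clients struct {
	Pods PodLister
}

// PodManager holds the shared clients so tests never touch them directly.
type PodManager struct {
	clients *Clients
}

// NewPodManager is an assumed constructor name.
func NewPodManager(c *Clients) *PodManager { return &PodManager{clients: c} }

// CountPodsByPhase returns the number of pods in each phase.
func CountPodsByPhase(pods []Pod) map[string]int {
	counts := make(map[string]int)
	for _, p := range pods {
		counts[p.Phase]++
	}
	return counts
}

// CountReady returns the number of pods marked ready.
func CountReady(pods []Pod) int {
	n := 0
	for _, p := range pods {
		if p.Ready {
			n++
		}
	}
	return n
}

// ReadyInNamespace is a hypothetical domain operation: the test asks a
// question about pods and never sees the underlying client.
func (m *PodManager) ReadyInNamespace(ns string) (int, error) {
	pods, err := m.clients.Pods.ListPods(ns)
	if err != nil {
		return 0, fmt.Errorf("listing pods in %q: %w", ns, err)
	}
	return CountReady(pods), nil
}

// fakeLister returns a fixed pod list so the sketch runs without a cluster.
type fakeLister struct{ pods []Pod }

func (f fakeLister) ListPods(string) ([]Pod, error) { return f.pods, nil }

// newDemoManager wires a PodManager to the fake lister.
func newDemoManager() *PodManager {
	return NewPodManager(&Clients{Pods: fakeLister{pods: []Pod{
		{Name: "a", Phase: "Running", Ready: true},
		{Name: "b", Phase: "Running", Ready: true},
		{Name: "c", Phase: "Pending", Ready: false},
	}}})
}

func main() {
	ready, err := newDemoManager().ReadyInNamespace("default")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("ready pods:", ready)
}
```

Keeping the client behind a small interface also lets a manager be exercised with a fake, as the demo does, although whether the real packages do this is not stated in the issue.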

New package structure:

  • k8s/clients/: Clients struct, NewClients (single entry point for all K8s clients)
  • k8s/pods/: PodManager, CountPodsByPhase, CountReady
  • k8s/nodes/: NodeManager, IsReady
  • k8s/resources/: ResourceManager, AppliedResource
  • k8s/ (parent): PollForCondition, conversion helpers (shared utilities)
  • grove/workload/: WorkloadManager
  • grove/topology/: TopologyVerifier
  • grove/podgroup/: PodGroupVerifier, ExpectedSubGroup
  • grove/config/: OperatorConfig, GroveMetadata
  • diagnostics/: DiagCollector (extracted from old debug_utils.go)
  • tests/context.go: TestContext, PrepareTest(), convenience methods
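
The list names PollForCondition as a shared utility in k8s/ but does not show its signature. The sketch below is one plausible shape for such a helper, written against the standard library only. The parameter list, the error wrapping, and the demoPoll and demoTimeout helpers are assumptions for illustration and may differ from the real function.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// PollForCondition calls cond immediately and then once per interval until it
// returns true, returns an error, or the timeout elapses.
// The signature is an assumption; the issue only names the function.
func PollForCondition(ctx context.Context, timeout, interval time.Duration, cond func() (bool, error)) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		done, err := cond()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("condition not met: %w", ctx.Err())
		case <-ticker.C:
			// Next attempt.
		}
	}
}

// demoPoll polls a condition that succeeds on its third call.
func demoPoll() (int, error) {
	attempts := 0
	err := PollForCondition(context.Background(), time.Second, 5*time.Millisecond,
		func() (bool, error) {
			attempts++
			return attempts >= 3, nil
		})
	return attempts, err
}

// demoTimeout polls a condition that never succeeds and reports whether the
// returned error wraps context.DeadlineExceeded.
func demoTimeout() bool {
	err := PollForCondition(context.Background(), 20*time.Millisecond, 5*time.Millisecond,
		func() (bool, error) { return false, nil })
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	attempts, err := demoPoll()
	fmt.Println("attempts:", attempts, "err:", err)
	fmt.Println("timed out:", demoTimeout())
}
```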

Removed:

  • LegacyTestContext struct and all 24+ methods on it
  • PrepareTestCluster, clientCollection, clientCollectionFromClients
  • Legacy diagnostics methods from debug_utils.go
  • auto-mnnvl's local testContext / prepareTest (replaced by shared TestContext)

Migrated test suites:

  • gang scheduling, startup ordering, topology, scale, cert management,
    update (rolling/ondelete), auto-mnnvl (4 suites)

Before:

clients, cleanup := PrepareTestCluster(ctx, t, 10)
tc := TestContext{
    Clientset:     clients.Clientset,
    DynamicClient: clients.DynamicClient,
    RestConfig:    clients.RestConfig,
    Namespace:     "default",
    Timeout:       DefaultPollTimeout,
    Interval:      DefaultPollInterval,
    Workload:      &WorkloadConfig{...},
}
nodesToCordon := setupAndCordonNodes(tc, 1)
_, err := DeployAndVerifyWorkload(tc)

After:

tc, cleanup := PrepareTest(ctx, t, 10,
    WithWorkload(&WorkloadConfig{...}),
)
nodesToCordon := tc.SetupAndCordonNodes(1)
_, err := tc.DeployAndVerifyWorkload()
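
The WithWorkload(...) argument in the After snippet looks like Go's functional options pattern. The sketch below shows how such an option could be applied, under several assumptions: the Option type, the fields of TestContext and WorkloadConfig, the meaning of the numeric argument (treated here as a node count), the "default" namespace fallback, and the newTestContext helper (a stand-in for PrepareTest without any cluster setup) are all hypothetical.

```go
package main

import "fmt"

// WorkloadConfig is a stand-in; the real struct's fields are not shown in the issue.
type WorkloadConfig struct {
	Name string
}

// TestContext here carries only the fields needed for the sketch (assumed).
type TestContext struct {
	Namespace string
	NodeCount int
	Workload  *WorkloadConfig
}

// Option mutates a TestContext during construction (assumed type name).
type Option func(*TestContext)

// WithWorkload attaches a workload configuration to the context.
func WithWorkload(w *WorkloadConfig) Option {
	return func(tc *TestContext) { tc.Workload = w }
}

// newTestContext is a hypothetical stand-in for PrepareTest: it fills in
// defaults first, then applies each option in order so callers only spell out
// what differs from the defaults.
func newTestContext(nodeCount int, opts ...Option) *TestContext {
	tc := &TestContext{Namespace: "default", NodeCount: nodeCount}
	for _, opt := range opts {
		opt(tc)
	}
	return tc
}

func main() {
	tc := newTestContext(10, WithWorkload(&WorkloadConfig{Name: "demo"}))
	fmt.Printf("namespace=%s nodes=%d workload=%s\n", tc.Namespace, tc.NodeCount, tc.Workload.Name)
}
```

With this shape, adding a new optional setting means adding one With... function rather than changing every call site.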

Tests that need domain-specific verifiers create them locally:

topologyVerifier := topology.NewTopologyVerifier(tc.Clients, Logger)
podGroupVerifier := podgroup.NewPodGroupVerifier(tc.Clients, Logger)

Why is this needed?

The flat layout put 24+ methods on a single context struct and left each
test to wire raw client fields by hand. Organizing by domain, with
k8s/{clients,pods,nodes,resources} for Kubernetes primitives,
grove/{workload,topology,podgroup,config} for Grove concepts, and
diagnostics/ for failure collection, keeps each package small, focused,
and independently importable.

Metadata

Labels

enhancement: New feature or request
