Releases: kubernetes-sigs/kueue
v0.16.0-rc.0
Changes since v0.15.0:
Urgent Upgrade Notes
(No, really, you MUST read this before you upgrade)
- Removed the FlavorFungibilityImplicitPreferenceDefault feature gate. Configure flavor selection preference using the ClusterQueue field `spec.flavorFungibility.preference` instead (see the hedged example after this list). (#8134, @mbobrovskyi)
- The short name "wl" for workloads has been removed to avoid potential conflicts with the in-tree workload object coming into Kubernetes. (#8472, @kannon92)
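For reference, a minimal ClusterQueue sketch showing where the replacement field lives; the queue name and the `Preempt` value are illustrative assumptions, so check the ClusterQueue API reference for the accepted values:

```yaml
apiVersion: kueue.x-k8s.io/v1beta2
kind: ClusterQueue
metadata:
  name: team-a-cq              # illustrative name
spec:
  flavorFungibility:
    # Assumed value: prefer preempting within a flavor over borrowing when
    # no flavor can admit the workload without doing either.
    preference: Preempt
```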
Changes by Kind
API Change
- Add field `multiplyBy` for ResourceTransformation. (#7599, @calvin0327)
- V1beta2: Use v1beta2 as the storage version in v0.16.
  The v1beta1 API version will no longer be served in v0.17 (new resources cannot be created with v1beta1) and will be fully removed in v0.18.
  Migrate all existing Kueue resources from `kueue.x-k8s.io/v1beta1` to `kueue.x-k8s.io/v1beta2` after upgrading to v0.16 and before upgrading to v0.17. Kueue conversion webhooks handle structural changes automatically – the migration only updates the stored apiVersion.
  Migration instructions (including the official script): #8018. (#8020, @mbobrovskyi)
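To illustrate the migration, re-applying a resource under the v1beta2 group version updates its stored apiVersion; the LocalQueue below is a hypothetical example, and the official script referenced above automates this for all Kueue resources:

```yaml
# Hypothetical LocalQueue previously stored as kueue.x-k8s.io/v1beta1.
# Re-applying it with the v1beta2 apiVersion rewrites the stored version,
# while the conversion webhooks handle any structural differences.
apiVersion: kueue.x-k8s.io/v1beta2
kind: LocalQueue
metadata:
  name: user-queue             # illustrative name
  namespace: team-a            # illustrative namespace
spec:
  clusterQueue: team-a-cq
```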
Feature
- Adds support for PodsReady when JobSet dependsOn is used. (#7889, @MaysaMacedo)
- CLI: Support "kwl" and "kueueworkload" as shortnames for Kueue Workloads. (#8379, @kannon92)
- ClusterQueues with both MultiKueue and ProvisioningRequest admission checks are now marked as inactive with reason "MultiKueueWithProvisioningRequest", as this configuration is invalid on manager clusters. (#8451, @IrvingMg)
- Enable Pod-based integrations by default (see the configuration sketch after this list). (#8096, @sohankunkerkar)
- Logs now include a `replica-role` field to identify Kueue instance roles (leader/follower/standalone). (#8107, @IrvingMg)
- MultiKueue: trigger workload eviction on the management cluster when the corresponding workload is evicted on the remote worker cluster. In particular, this fixes the issue with workloads using ProvisioningRequests, which could get stuck on a worker cluster that does not have enough capacity to ever admit the workloads. (#8477, @mszadkow)
- Observability: Add more details (the preemptionMode) to the QuotaReserved condition message, and the related event, about the skipped flavors which were considered for preemption.
  Before: "Quota reserved in ClusterQueue preempt-attempts-cq, wait time since queued was 9223372037s; Flavors considered: main: on-demand(Preempt;insufficient unused quota for cpu in flavor on-demand, 1 more needed)"
  After: "Quota reserved in ClusterQueue preempt-attempts-cq, wait time since queued was 9223372037s; Flavors considered: main: on-demand(preemptionMode=Preempt;insufficient unused quota for cpu in flavor on-demand, 1 more needed)" (#8024, @mykysha)
- Ray: Support RayJob InTreeAutoscaling by using the ElasticJobsViaWorkloadSlices feature. (#8082, @hiboyang)
- TAS: extend the information in condition messages and events about nodes excluded from calculating the assignment due to recognized reasons such as taints, node affinity, and node resource constraints. (#8043, @sohankunkerkar)
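A hedged Configuration excerpt showing how integration frameworks are listed; with this release the Pod-based integrations are enabled by default, so the explicit list below is only an illustration of the field, not a required setting:

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta2   # adjust to the Config API version you deploy
kind: Configuration
integrations:
  frameworks:                  # Pod-based entries no longer need to be listed explicitly
  - "batch/job"
  - "pod"
  - "deployment"
  - "statefulset"
  - "leaderworkerset"
```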
Bug or Regression
- Add lws editor and viewer roles to kustomize and helm. (#8513, @kannon92)
- DRA: fix the race condition bug leading to undefined behavior due to concurrent operations on the Workload object, manifested by the "WARNING: DATA RACE" in test logs. (#8073, @mbobrovskyi)
- Fix ClusterQueue deletion getting stuck when pending workloads are deleted after being assumed by the scheduler. (#8543, @sohankunkerkar)
- Fix EnsureWorkloadSlices to finish the old slice when the new one is admitted as a replacement. (#8456, @sohankunkerkar)
- Fix `TrainJob` controller not correctly setting the `PodSet` count value based on `numNodes` for the expected number of training nodes. (#8135, @kaisoz)
- Fix a bug that WorkloadPriorityClass value changes do not trigger Workload priority updates. (#8442, @ASverdlov)
- Fix a performance bug where some "read-only" functions would take an unnecessary "write" lock. (#8181, @ErikJiang)
- Fix the race condition bug where the kueue_pending_workloads metric may not be updated to 0 after the last workload is admitted and there are no new workloads incoming. (#8037, @Singularity23x0)
- Fixed a bug that Kueue's scheduler would re-evaluate and update already finished workloads, significantly impacting overall scheduling throughput. This re-evaluation of a finished workload would be triggered when:
  - Kueue is restarted
  - There is any event related to LimitRange or RuntimeClass instances referenced by the workload (#8186, @mbobrovskyi)
- Fixed the following bugs for the StatefulSet integration by ensuring the Workload object has the ownerReference to the StatefulSet:
  - Kueue doesn't keep the StatefulSet as deactivated
  - Kueue marks the Workload as Finished if all StatefulSet's Pods are deleted
  - changing the "queue-name" label could occasionally result in the StatefulSet getting stuck (#4799, @mbobrovskyi)
- HC: Avoid redundant requeuing of inadmissible workloads when multiple ClusterQueues in the same cohort hierarchy are processed. (#8441, @sohankunkerkar)
- Integrations based on Pods: skip using finalizers on the Pods created and managed by integrations. In particular, we skip setting finalizers for Pods managed by the built-in serving workloads: Deployments, StatefulSets, and LeaderWorkerSets. This improves the performance of suspending the workloads, and fixes occasional race conditions where a StatefulSet could get stuck when deactivating and re-activating within a short interval. (#8530, @mbobrovskyi)
- JobFramework: Fixed a bug that allowed a deactivated workload to be activated. (#8424, @chengjoey)
- Kubeflow TrainJob v2: fix the bug to prevent duplicate pod template overrides when starting the Job is retried. (#8269, @j-skiba)
- MultiKueue now waits for WorkloadAdmitted (instead of QuotaReserved) before deleting workloads from non-selected worker clusters. To revert to the previous behavior, disable the MultiKueueWaitForWorkloadAdmitted feature gate. (#8592, @IrvingMg)
- MultiKueue via ClusterProfile: Fix the panic if the configuration for ClusterProfiles wasn't provided in the configMap. (#8071, @mszadkow)
- MultiKueue: Fix a bug that a priority change made by mutating the `kueue.x-k8s.io/priority-class` label on the management cluster is not propagated to the worker clusters. (#8464, @mbobrovskyi)
- MultiKueue: Fixed status sync for CRD-based jobs (JobSet, Kubeflow, Ray, etc.) that was blocked while the local job was suspended. (#8308, @IrvingMg)
- MultiKueue: fix the bug that for the Pod integration the AdmissionCheck status would be kept Pending indefinitely, even when the Pods are already running. The analogous fix is also done for the batch/Job when the MultiKueueBatchJobWithManagedBy feature gate is disabled. (#8189, @IrvingMg)
- MultiKueue: fix the eviction when initiated by the manager cluster (due to e.g. Preemption or WaitForPodsReady timeout). (#8151, @mbobrovskyi)
- ProvisioningRequest: Fixed a bug that prevented events from being updated when the AdmissionCheck state changed. (#8394, @mbobrovskyi)
- Revert the changes in PR #8599 for transitioning the QuotaReserved and Admitted conditions to `False` for Finished workloads. This introduced a regression, because users lost the useful information about the timestamp of the last transition of these conditions to True, without an API replacement to serve the information. (#8599, @mbobrovskyi)
- Scheduling: fix a bug that evictions submitted by the scheduler (preemptions and evictions due to TAS NodeHotSwap failing) could result in a conflict in case of concurrent workload modification by another controller. This could lead to indefinitely failing requests sent by the scheduler in some scenarios when the eviction is initiated by TAS NodeHotSwap. (#7933, @mbobrovskyi)
- Scheduling: fix the bug that setting (none -> some) a workload priority class label (`kueue.x-k8s.io/priority-class`; see the label example after this list) was ignored. (#8480, @andrewseif)
- TAS NodeHotSwap: fixed the bug that allowed a workload to be requeued by the scheduler even if it was already deleted on TAS NodeHotSwap eviction. (#8278, @mbobrovskyi)
- TAS: Fix handling of admission for workloads using the LeastFreeCapacity algorithm when the "unconstrained" mode is used. In that case scheduling would fail if there is at least one node in the cluster which does not have enough capacity to accommodate at least one Pod. (#8168, @PBundyra)
- TAS: fix the TAS resource flavor controller to extract only scheduling-relevant node updates to prevent unnecessary reconciliation. (#8452, @Ladicle)
- TAS: fix a performance bug where continuous reconciles of a TAS ResourceFlavor (and related ClusterQueues) were triggered by updates to Nodes' heartbeat times. (#8342, @PBundyra)
- TAS: fix the bug that when TopologyAwareScheduling is disabled, but there is a ResourceFlavor configured with topologyName, preemptions fail with "workload requires Topology, but there is no TAS cache information". (#8167, @zhifei92)
- TAS: fixed a performance issue due to unnecessary (empty) requests by the TopologyUngater. (#8279, @mbobrovskyi)
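Both labels touched by the fixes above are plain labels on the job object; a hedged sketch with placeholder names (the WorkloadPriorityClass `high-priority` and the LocalQueue `user-queue` are assumed to exist):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: sample-job                                 # illustrative name
  labels:
    kueue.x-k8s.io/queue-name: user-queue          # LocalQueue to submit to
    kueue.x-k8s.io/priority-class: high-priority   # WorkloadPriorityClass, assumed to exist
spec:
  suspend: true                # Kueue admits the Job by unsuspending it
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: busybox:1.36
        command: ["sleep", "10"]
```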
Other (Cleanup or Flake)
- Fix: Removed outdated comments incorrectly stating that deployment, statefulset, and leaderworkerset integrations require pod integration to be enabled. (#8053, @IrvingMg)
- Improve error messages for validation errors regarding WorkloadPriorityClass changes in workloads. (#8334, @olekzabl)
- MultiKueue: improve the MultiKueueCluster reconciler to skip reconcile attempts, and the errors they would throw, when the corresponding Secret or ClusterProfile objects don't exist. Reconciliation will be triggered on creation of the objects. (#8144, @mszadkow)
- Removes ConfigurableResourceTransformations feature gate. ...
v0.15.2
Changes since v0.15.1:
Changes by Kind
Feature
- Ray: Support RayJob InTreeAutoscaling by using the ElasticJobsViaWorkloadSlices feature. (#8284, @hiboyang)
Bug or Regression
- Kubeflow TrainJob v2: fix the bug to prevent duplicate pod template overrides when starting the Job is retried. (#8271, @j-skiba)
- MultiKueue: Fixed status sync for CRD-based jobs (JobSet, Kubeflow, Ray, etc.) that was blocked while the local job was suspended. (#8344, @IrvingMg)
- MultiKueue: fix the bug that for the Pod integration the AdmissionCheck status would be kept Pending indefinitely, even when the Pods are already running. The analogous fix is also done for the batch/Job when the MultiKueueBatchJobWithManagedBy feature gate is disabled. (#8288, @IrvingMg)
- Scheduling: fix a bug that evictions submitted by the scheduler (preemptions and evictions due to TAS NodeHotSwap failing) could result in a conflict in case of concurrent workload modification by another controller. This could lead to indefinitely failing requests sent by the scheduler in some scenarios when the eviction is initiated by TAS NodeHotSwap. (#8313, @mbobrovskyi)
- TAS NodeHotSwap: fixed the bug that allowed a workload to be requeued by the scheduler even if it was already deleted on TAS NodeHotSwap eviction. (#8310, @mbobrovskyi)
- TAS: fix a performance bug where continuous reconciles of a TAS ResourceFlavor (and related ClusterQueues) were triggered by updates to Nodes' heartbeat times. (#8355, @PBundyra)
- TAS: fixed a performance issue due to unnecessary (empty) requests by the TopologyUngater. (#8333, @mbobrovskyi)
Other (Cleanup or Flake)
- Improve error messages for validation errors regarding WorkloadPriorityClass changes in workloads. (#8352, @olekzabl)
- MultiKueue: improve the MultiKueueCluster reconciler to skip reconcile attempts, and the errors they would throw, when the corresponding Secret or ClusterProfile objects don't exist. Reconciliation will be triggered on creation of the objects. (#8290, @mszadkow)
v0.14.7
Changes since v0.14.6:
Changes by Kind
Feature
- Ray: Support RayJob InTreeAutoscaling by using the ElasticJobsViaWorkloadSlices feature. (#8282, @hiboyang)
Bug or Regression
- MultiKueue: Fixed status sync for CRD-based jobs (JobSet, Kubeflow, Ray, etc.) that was blocked while the local job was suspended. (#8346, @IrvingMg)
- MultiKueue: fix the bug that for the Pod integration the AdmissionCheck status would be kept Pending indefinitely, even when the Pods are already running. The analogous fix is also done for the batch/Job when the MultiKueueBatchJobWithManagedBy feature gate is disabled. (#8293, @IrvingMg)
- Scheduling: fix a bug that evictions submitted by the scheduler (preemptions and evictions due to TAS NodeHotSwap failing) could result in a conflict in case of concurrent workload modification by another controller. This could lead to indefinitely failing requests sent by the scheduler in some scenarios when the eviction is initiated by TAS NodeHotSwap. (#8314, @mbobrovskyi)
- TAS NodeHotSwap: fixed the bug that allowed a workload to be requeued by the scheduler even if it was already deleted on TAS NodeHotSwap eviction. (#8306, @mbobrovskyi)
- TAS: fix a performance bug where continuous reconciles of a TAS ResourceFlavor (and related ClusterQueues) were triggered by updates to Nodes' heartbeat times. (#8356, @PBundyra)
- TAS: fixed a performance issue due to unnecessary (empty) requests by the TopologyUngater. (#8337, @mbobrovskyi)
v0.15.1
Changes since v0.15.0:
Changes by Kind
Feature
- TAS: extend the information in condition messages and events about nodes excluded from calculating the assignment due to recognized reasons such as taints, node affinity, and node resource constraints. (#8132, @sohankunkerkar)
Bug or Regression
- Fix `TrainJob` controller not correctly setting the `PodSet` count value based on `numNodes` for the expected number of training nodes. (#8145, @kaisoz)
- Fix a performance bug where some "read-only" functions would take an unnecessary "write" lock. (#8183, @ErikJiang)
- Fix the race condition bug where the kueue_pending_workloads metric may not be updated to 0 after the last workload is admitted and there are no new workloads incoming. (#8049, @Singularity23x0)
- Fixed a bug that Kueue's scheduler would re-evaluate and update already finished workloads, significantly impacting overall scheduling throughput. This re-evaluation of a finished workload would be triggered when:
  - Kueue is restarted
  - There is any event related to LimitRange or RuntimeClass instances referenced by the workload
- Fixed the following bugs for the StatefulSet integration by ensuring the Workload object has the ownerReference to the StatefulSet:
  - Kueue doesn't keep the StatefulSet as deactivated
  - Kueue marks the Workload as Finished if all StatefulSet's Pods are deleted
  - changing the "queue-name" label could occasionally result in the StatefulSet getting stuck (#8105, @mbobrovskyi)
- MultiKueue via ClusterProfile: Fix the panic if the configuration for ClusterProfiles wasn't provided in the configMap. (#8097, @mszadkow)
- TAS: Fix handling of admission for workloads using the LeastFreeCapacity algorithm when the "unconstrained" mode is used. In that case scheduling would fail if there is at least one node in the cluster which does not have enough capacity to accommodate at least one Pod. (#8172, @PBundyra)
- TAS: fix the bug that when TopologyAwareScheduling is disabled, but there is a ResourceFlavor configured with topologyName, preemptions fail with "workload requires Topology, but there is no TAS cache information". (#8195, @zhifei92)
Other (Cleanup or Flake)
v0.14.6
Changes since v0.14.5:
Changes by Kind
Feature
- TAS: extend the information in condition messages and events about nodes excluded from calculating the assignment due to recognized reasons such as taints, node affinity, and node resource constraints. (#8169, @sohankunkerkar)
Bug or Regression
- Fix `TrainJob` controller not correctly setting the `PodSet` count value based on `numNodes` for the expected number of training nodes. (#8146, @kaisoz)
- Fix a performance bug where some "read-only" functions would take an unnecessary "write" lock. (#8182, @ErikJiang)
- Fix the race condition bug where the kueue_pending_workloads metric may not be updated to 0 after the last workload is admitted and there are no new workloads incoming. (#8048, @Singularity23x0)
- Fixed the following bugs for the StatefulSet integration by ensuring the Workload object has the ownerReference to the StatefulSet:
  - Kueue doesn't keep the StatefulSet as deactivated
  - Kueue marks the Workload as Finished if all StatefulSet's Pods are deleted
  - changing the "queue-name" label could occasionally result in the StatefulSet getting stuck (#8104, @mbobrovskyi)
- TAS: Fix handling of admission for workloads using the LeastFreeCapacity algorithm when the "unconstrained" mode is used. In that case scheduling would fail if there is at least one node in the cluster which does not have enough capacity to accommodate at least one Pod. (#8171, @PBundyra)
- TAS: fix the bug that when TopologyAwareScheduling is disabled, but there is a ResourceFlavor configured with topologyName, preemptions fail with "workload requires Topology, but there is no TAS cache information". (#8196, @zhifei92)
Other (Cleanup or Flake)
v0.15.0
Changes since v0.14.0:
Urgent Upgrade Notes
(No, really, you MUST read this before you upgrade)
- MultiKueue: validate remote client kubeconfigs and reject insecure kubeconfigs by default; add the MultiKueueAllowInsecureKubeconfigs feature gate to temporarily allow insecure kubeconfigs until v0.17.0.
  If you are using MultiKueue kubeconfigs which do not pass the new validation, please enable the MultiKueueAllowInsecureKubeconfigs feature gate (see the sketch after this list) and let us know so that we can re-consider the deprecation plans for the feature gate. (#7439, @mszadkow)
- The .status.flavors field in LocalQueue is deprecated and will be removed in a future release. Consider migrating from this field to VisibilityOnDemand. (#7337, @iomarsayed)
- Update the DRA API used from `v1beta2` to `v1`. To use the DRA integration by enabling the DynamicResourceAllocation feature gate in Kueue, you need to use Kubernetes 1.34+. (#7212, @harche)
- V1beta2: Expose the v1beta2 API for CRD serving.
  V1beta1 remains supported in this release and used as storage, but please plan for migration.
  We would highly recommend preparing the Kueue CustomResources API version upgrade (v1beta1 -> v1beta2), since we plan to use v1beta2 for storage in 0.16 and discontinue the support for v1beta1 in 0.17. (#7304, @mimowo)
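A hedged sketch of enabling the gate by editing the kueue-controller-manager Deployment; the flag name comes from the release note, while the Deployment/container names and the args layout are assumptions that may differ per installation (Helm and kustomize expose equivalent settings):

```yaml
# Excerpt of the controller Deployment (names assumed from the default install);
# append the gate to the manager's --feature-gates flag.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kueue-controller-manager
  namespace: kueue-system
spec:
  template:
    spec:
      containers:
      - name: manager
        args:
        - --feature-gates=MultiKueueAllowInsecureKubeconfigs=true
```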
Changes by Kind
API Change
- Removed the deprecated workload annotation key "kueue.x-k8s.io/queue-name". Please ensure you are using the workload label "kueue.x-k8s.io/queue-name" instead. (#7271, @ganczak-commits)
- V1beta2: Delete the .enable field from the FairSharing API in the config. (#7583, @mbobrovskyi)
- V1beta2: Delete the .enable field from the WaitForPodsReady API in the config. (#7628, @mbobrovskyi)
- V1beta2: FlavorFungibility: introduce `MayStopSearch` in place of `Borrow`/`Preempt`, which are now deprecated in v1beta1. (#7117, @ganczak-commits)
- V1beta2: Graduate the Config API to v1beta2. v1beta1 remains supported for this release, but please plan for migration. (#7375, @mbobrovskyi)
- V1beta2: Make .waitForPodsReady.timeout a required field in the Config API. (#7952, @tenzen-y)
- V1beta2: Make fairSharing.preemptionStrategies a required field in the Config API. (#7948, @tenzen-y)
- V1beta2: Remove deprecated PodIntegrationOptions (the podOptions field) from the v1beta2 Configuration. If you are using podOptions in the configMap, you need to migrate to managedJobsNamespaceSelector (https://kueue.sigs.k8s.io/docs/tasks/run/plain_pods/; see the selector sketch after this list) before the upgrade. (#7406, @nerdeveloper)
- V1beta2: Remove deprecated QueueVisibility from the configMap (it was already non-functional). (#7319, @bobsongplus)
- V1beta2: Remove the deprecated retryDelayMinutes field from the v1beta2 AdmissionCheckSpec (it was already non-functional). (#7407, @nerdeveloper)
- V1beta2: Remove the never-used .status.fairSharing.admissionFairSharing field from ClusterQueue and Cohort. (#7793, @tenzen-y)
- V1beta2: Removed the deprecated Preempt/Borrow values from the FlavorFungibility API. (#7527, @mbobrovskyi)
- V1beta2: The internal representation of TopologyAssignment (in WorkloadStatus) has been reorganized to allow using TAS for larger workloads. (More specifically, under the assumptions described in issue #7220, it allows increasing the maximal workload size from approx. 20k to approx. 60k nodes.) (#7544, @olekzabl)
- V1beta2: change the default for waitForPodsReady.blockAdmission to false. (#7687, @mbobrovskyi)
- V1beta2: drop the deprecated Flavors field from LocalQueueStatus. (#7449, @mbobrovskyi)
- V1beta2: graduate the visibility API. (#7411, @mbobrovskyi)
- V1beta2: introduce PriorityClassRef instead of PriorityClassSource and PriorityClassName. (#7540, @mbobrovskyi)
- V1beta2: remove the deprecated .spec.admissionChecks field from the ClusterQueue API in favor of .spec.admissionChecksStrategy. (#7490, @nerdeveloper)
- The `ReclaimablePods` feature gate is introduced to let users switch the reclaimable Pods feature on and off. (#7525, @PBundyra)
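A hedged Configuration excerpt showing the managedJobsNamespaceSelector that replaces podOptions; the label key/value in the selector and the apiVersion are illustrative, so adapt them to your installation:

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta2   # graduated Config API version in this release
kind: Configuration
# Namespaces matching this selector are subject to Kueue management for jobs
# without a queue name (used together with manageJobsWithoutQueueName).
managedJobsNamespaceSelector:
  matchLabels:
    kueue-managed: "true"                   # illustrative label key/value
```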
Feature
- AdmissionChecks: introduce new optional fields in the workload status for admission checks to control the delay by external and internal admission check controllers:
  - requeueAfterSeconds: specifies the minimum wait time before retry
  - retryCount: tracks retry attempts per admission check (#7620, @sohankunkerkar)
- AdmissionFairSharing: promote the feature to beta (enabled by default). (#7463, @kannon92)
- FailureRecovery: Introduce a mechanism to terminate Pods "stuck" in a terminating state due to node failures. The feature is activated by enabling the alpha FailureRecoveryPolicy feature gate (disabled by default). Only Pods with the kueue.x-k8s.io/safe-to-forcefully-terminate annotation are handled by the mechanism. (#7312, @kshalot)
- FlavorFungibility: introduce the ClusterQueue API field `.spec.flavorFungibility.preference` to indicate the user's preference for borrowing or preemption when there is no flavor which avoids both. This new field is a replacement for the alpha feature gate FlavorFungibilityImplicitPreferenceDefault, which is considered deprecated in 0.15 and will be removed in 0.16. (#7316, @vladikkuzn)
- Integrations: the Pod integration is no longer required to be enabled explicitly in the configMap when you are using the LeaderWorkerSet, StatefulSet, or Deployment frameworks. (#6736, @IrvingMg)
- JobFramework: Introduce an optional interface for custom Jobs, called JobWithCustomWorkloadActivation, which can be used to deactivate or activate a custom CRD workload. (#7199, @tg123)
- KueuePopulator: release of the new experimental sub-project called "kueue-populator". It allows creating the default ClusterQueue, ResourceFlavor and Topology. It also creates default LocalQueues in all namespaces managed by Kueue. (#7940, @mbobrovskyi)
- MultiKueue: Graduate the support for running external jobs to Beta. (#7669, @khrm)
- MultiKueue: Support Topology Aware Scheduling (TAS) and the ProvisioningRequest integration. (#5361, @IrvingMg)
- MultiKueue: Promote MultiKueueBatchJobWithManagedBy to beta, which allows synchronizing the Job status periodically during Job execution between the worker and the management cluster for k8s batch Jobs. (#7341, @kannon92)
- MultiKueue: Support for authentication to worker clusters using the ClusterProfile API. (#7570, @hdp617)
- Observability: Adjust the `cluster_queue_weighted_share` and `cohort_weighted_share` metrics to report the precise value for the weighted share, rather than the value rounded to an integer. Also, expand the `cluster_queue_weighted_share` metric with the "cohort" label. (#7338, @j-skiba)
- Observability: Improve the messages presented to the user in scheduling events, by clarifying the reason for "insufficient quota" in case of workloads with multiple PodSets.
  Before: "insufficient quota for resource-type in flavor example-flavor, request > maximum capacity (24 > 16)"
  After: "insufficient quota for resource-type in flavor example-flavor, previously considered podsets requests (16) + current podset request (8) > maximum capacity (16)" (#7232, @iomarsayed)
- Observability: Summarize the list of flavors considered for admission in the scheduling cycle, but eventually not used for a workload which reserved the quota. The summary is present in the message for the QuotaReserved condition, and in the event.
  Before: "Quota reserved in ClusterQueue tas-main, wait time since queued was 9223372037s"
  After: "Quota reserved in ClusterQueue tas-main, wait time since queued was 9223372037s; Flavors considered: one: default(NoFit;Flavor "default" does not support TopologyAwareScheduling)" (#7646, @mykysha)
- Observability: improve the message for the Preempted condition: include the preemptor and preemptee object paths to make it easier to locate the objects involved in a preemption.
  Before: "Preempted to accommodate a workload (UID: wl-in, JobUID: job-in) due to reclamation within the cohort"
  After: "Preempted to accommodate a workload (UID: wl-in, JobUID: job-in) due to reclamation within the cohort; preemptor path: /r/c/q; preemptee path: /r/q_borrowing" (#7522, @mszadkow)
- Promote the ManagedJobsNamespaceSelectorAlwaysRespected feature to Beta. (#7493, @PannagaRao)
- Scheduling: support mutating the "kueue.x-k8s.io/workloadpriorityclass" label for Jobs with reserved quota. (#7289, @mbobrovskyi)
- TAS: Balanced placement is introduced behind the TASBalancedPlacement feature gate. (#6851, @pajakd)
- TAS: change the algorithm used in case of "unconstrained" mode (enabled by the kueue.x-k8s.io/podset-unconstrained-topology annotation, shown in the sketch after this list, or when the "implicit" mode is used) from "BestFit" to "LeastFreeCapacity". This allows optimizing fragmentation for workloads which don't require bin-packing. (#7416, @iomarsayed)
- Transition QuotaReserved to false whenever setting Finished conditions. (#7724, @mbobrovskyi)
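A hedged sketch of opting a PodSet into the "unconstrained" mode via the annotation named above, placed on the Job's pod template; the names and the rest of the spec are illustrative:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: unconstrained-job                   # illustrative name
  labels:
    kueue.x-k8s.io/queue-name: user-queue   # LocalQueue assumed to exist
spec:
  suspend: true
  template:
    metadata:
      annotations:
        # Opts this PodSet into the "unconstrained" TAS mode, which now uses
        # the LeastFreeCapacity algorithm instead of BestFit.
        kueue.x-k8s.io/podset-unconstrained-topology: "true"
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: busybox:1.36
        command: ["sleep", "10"]
```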
Documentation
Bug or Regression
- AdmissionFairSharing: Fix the bug that occasionally a workload may get admitted from a busy LocalQueue, bypassing the entry penalties. (#7780, @IrvingMg)
- Fix a bug that an error during workload preemption could leave the scheduler stuck without retrying. (#7665, @olekzabl)
- Fix a bug that the cohort client-go lib is for a Namespaced resource, even though the cohort is a Cluster-scoped resource. (#7799, @tenzen-y)
- Fix a bug where a workload would not get requeued after eviction due to failed hotswap. (#7376, @pajakd)
- Fix eviction of jobs with memory requests in decimal format (#7430, @brejman)
- Fix existing workloads not being re-evaluated when new clusters are added to MultiKueueConfig. Previously, only newly created workloads would see updated cluster lists. (#6732, @ravisantoshgudimetla)
- Fix handling of Ray...
v0.14.5
Changes since v0.14.4:
Urgent Upgrade Notes
(No, really, you MUST read this before you upgrade)
- TAS: Support the Kubeflow TrainJob. You should update Kubeflow Trainer to at least v2.1.0 when using Trainer v2. (#7755, @IrvingMg)
Changes by Kind
Bug or Regression
- AdmissionFairSharing: Fix the bug that occasionally a workload may get admitted from a busy LocalQueue, bypassing the entry penalties. (#7914, @IrvingMg)
- Fix a bug that an error during workload preemption could leave the scheduler stuck without retrying. (#7818, @olekzabl)
- Fix a bug that the cohort client-go lib is for a Namespaced resource, even though the cohort is a Cluster-scoped resource. (#7802, @tenzen-y)
- Fix the integration of `manageJobsWithoutQueueName` and `managedJobsNamespaceSelector` with JobSet by ensuring that JobSets without a queue are not managed by Kueue if they are not selected by the `managedJobsNamespaceSelector`. (#7762, @MaysaMacedo)
- Fix issue #6711 where an inactive workload could transiently get admitted into a queue. (#7939, @olekzabl)
- Fix the bug that a workload which was deactivated by setting `spec.active=false` would not have the `wl.Status.RequeueState` cleared. (#7768, @sohankunkerkar)
- Fix the bug that the kubernetes.io/job-name label was not propagated from the k8s Job to the PodTemplate in the Workload object, and later to the pod template in the ProvisioningRequest. As a consequence, the ClusterAutoscaler could not properly resolve pod affinities referring to that label via podAffinity.requiredDuringSchedulingIgnoredDuringExecution.labelSelector. For example, such pod affinities (see the sketch after this list) can be used to request that the ClusterAutoscaler provision a single node which is large enough to accommodate all the Pods. We also introduce the PropagateBatchJobLabelsToWorkload feature gate to disable the new behavior in case of complications. (#7613, @yaroslava-serdiuk)
- Fix the race condition which could result in the Kueue scheduler occasionally not recording the reason for admission failure of a workload if the workload was modified in the meantime by another controller. (#7884, @mbobrovskyi)
- TAS: Fix the `requiredDuringSchedulingIgnoredDuringExecution` node affinity setting being ignored in topology-aware scheduling. (#7937, @kshalot)
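A hedged sketch of the kind of pod affinity that relies on the propagated label; the job name and topology key are placeholders, and the snippet is a pod template excerpt rather than a full manifest:

```yaml
# Requires all Pods carrying the propagated kubernetes.io/job-name label of
# this Job to be scheduled onto the same node.
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          kubernetes.io/job-name: sample-job   # placeholder job name
      topologyKey: kubernetes.io/hostname
```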
v0.13.10
Changes since v0.13.9:
Changes by Kind
Bug or Regression
- AdmissionFairSharing: Fix the bug that occasionally a workload may get admitted from a busy LocalQueue, bypassing the entry penalties. (#7916, @IrvingMg)
- Fix a bug that an error during workload preemption could leave the scheduler stuck without retrying. (#7817, @olekzabl)
- Fix a bug that the cohort client-go lib is for a Namespaced resource, even though the cohort is a Cluster-scoped resource. (#7801, @tenzen-y)
- Fix the integration of `manageJobsWithoutQueueName` and `managedJobsNamespaceSelector` with JobSet by ensuring that JobSets without a queue are not managed by Kueue if they are not selected by the `managedJobsNamespaceSelector`. (#7761, @MaysaMacedo)
- Fix issue #6711 where an inactive workload could transiently get admitted into a queue. (#7944, @olekzabl)
- Fix the bug that the kubernetes.io/job-name label was not propagated from the k8s Job to the PodTemplate in the Workload object, and later to the pod template in the ProvisioningRequest. As a consequence, the ClusterAutoscaler could not properly resolve pod affinities referring to that label via podAffinity.requiredDuringSchedulingIgnoredDuringExecution.labelSelector. For example, such pod affinities can be used to request that the ClusterAutoscaler provision a single node which is large enough to accommodate all the Pods. We also introduce the PropagateBatchJobLabelsToWorkload feature gate to disable the new behavior in case of complications. (#7613, @yaroslava-serdiuk)
- TAS: Fix the `requiredDuringSchedulingIgnoredDuringExecution` node affinity setting being ignored in topology-aware scheduling. (#7936, @kshalot)
v0.14.4
Changes since v0.14.3:
Changes by Kind
Feature
- The `ReclaimablePods` feature gate is introduced to let users switch the reclaimable Pods feature on and off. (#7537, @PBundyra)
Bug or Regression
- Fix eviction of jobs with memory requests in decimal format (#7556, @brejman)
- Fix the bug for the StatefulSet integration that the scale up could get stuck if triggered immediately after scale down to zero. (#7500, @IrvingMg)
- MultiKueue: Remove remoteClient from clusterReconciler when kubeconfig is detected as invalid or insecure, preventing workloads from being admitted to misconfigured clusters. (#7517, @mszadkow)
v0.13.9
Changes since v0.13.8:
Changes by Kind
Feature
- The `ReclaimablePods` feature gate is introduced to let users switch the reclaimable Pods feature on and off. (#7536, @PBundyra)
Bug or Regression
- Fix eviction of jobs with memory requests in decimal format (#7557, @brejman)
- Fix the bug for the StatefulSet integration that the scale up could get stuck if triggered immediately after scale down to zero. (#7499, @IrvingMg)
- MultiKueue: Remove remoteClient from clusterReconciler when kubeconfig is detected as invalid or insecure, preventing workloads from being admitted to misconfigured clusters. (#7516, @mszadkow)