Status: Closed
Labels: bug
Description
What happened?
In Flux we have a simple test case for running a release test that fails. The test is a Pod like this:
```yaml
{{- if .Values.faults.testFail }}
apiVersion: v1
kind: Pod
metadata:
  name: {{ template "podinfo.fullname" . }}-fault-test-{{ randAlphaNum 5 | lower }}
  namespace: {{ include "podinfo.namespace" . }}
  labels:
    {{- include "podinfo.labels" . | nindent 4 }}
  annotations:
    "helm.sh/hook": test-success
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
    sidecar.istio.io/inject: "false"
    linkerd.io/inject: disabled
    appmesh.k8s.aws/sidecarInjectorWebhook: disabled
spec:
  containers:
    - name: fault
      image: alpine:3.11
      command: ['/bin/sh']
      args: ['-c', 'exit 1']
  restartPolicy: Never
{{- end }}
```

The legacy waiter from Helm 3 would see the Pod with `.status.phase` set to `Failed` and then bail out from waiting:
```go
// waitForPodSuccess is a helper that waits for a pod to complete.
//
// This operates on an event returned from a watcher.
func (hw *legacyWaiter) waitForPodSuccess(obj runtime.Object, name string) (bool, error) {
	o, ok := obj.(*corev1.Pod)
	if !ok {
		return true, fmt.Errorf("expected %s to be a *v1.Pod, got %T", name, obj)
	}
	switch o.Status.Phase {
	case corev1.PodSucceeded:
		slog.Debug("pod succeeded", "pod", o.Name)
		return true, nil
	case corev1.PodFailed:
		slog.Error("pod failed", "pod", o.Name)
		return true, fmt.Errorf("pod %s failed", o.Name)
```

The new (hard-coded!) wait strategy for release testing in Helm 4 is the status watcher (kstatus). This waiter does not bail out when it sees a Pod with `.status.phase` set to `Failed`:
```go
eventCh := sw.Watch(cancelCtx, resources, watcher.Options{
	RESTScopeStrategy: watcher.RESTScopeNamespace,
})
statusCollector := collector.NewResourceStatusCollector(resources)
done := statusCollector.ListenWithObserver(eventCh, statusObserver(cancel, status.CurrentStatus))
<-done
```

The result is that Helm waits for the full timeout instead of bailing out early.
What did you expect to happen?
I expected the Helm 3 behavior to be preserved.
How can we reproduce it (as minimally and precisely as possible)?
Repro steps:
Install a chart containing the test Pod above into a kind cluster using Helm 4, run the release test, and observe that Helm waits for the full timeout instead of bailing out early.
Helm version
Helm 4
Kubernetes version
1.34