fix(helm): Improve chart test script (fixes #1782):#1784
Conversation
- Fix directory name from `results-cache` to `results_cache` to match PV hostPath - Add `wait_for_pods()` function to wait for jobs and pods to be ready - Replace `helm uninstall || true` with `helm uninstall --ignore-not-found`
WalkthroughThe test script was updated to standardise directory naming from Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
tools/deployment/package-helm/test.sh(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: package-image
- GitHub Check: build (macos-15)
- GitHub Check: build (ubuntu-24.04)
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: lint-check (macos-15)
🔇 Additional comments (3)
tools/deployment/package-helm/test.sh (3)
56-56: Correct directory name fix.This change correctly addresses issue #1782 by changing
results-cachetoresults_cache, aligning the directory name with the PV hostPath rename from PR #1771.
75-75: Improved Helm cleanup command.The change from
helm uninstall || truetohelm uninstall test --ignore-not-foundis a good improvement. The--ignore-not-foundflag is more explicit and only suppresses "not found" errors rather than all errors, providing better error handling.
79-79: Appropriate wait parameters.The function call with a 300-second overall timeout, 5-second poll interval, and 5-second wait timeout is well-suited for a test environment. This ensures pods are ready before the script proceeds.
| # Waits for all jobs to complete and all non-job pods to be ready. | ||
| # | ||
| # @param {int} timeout_seconds Overall timeout in seconds | ||
| # @param {int} poll_interval_seconds Interval between status checks | ||
| # @param {int} wait_timeout_seconds Timeout for each kubectl wait call | ||
| # @return {int} 0 on success, 1 on timeout | ||
| wait_for_pods() { | ||
| local timeout_seconds=$1 | ||
| local poll_interval_seconds=$2 | ||
| local wait_timeout_seconds=$3 | ||
|
|
||
| echo "Waiting for all pods to be ready" \ | ||
| "(timeout=${timeout_seconds}s, poll=${poll_interval_seconds}s," \ | ||
| "wait=${wait_timeout_seconds}s)..." | ||
|
|
||
| # Reset bash built-in SECONDS counter | ||
| SECONDS=0 | ||
|
|
||
| while true; do | ||
| sleep "${poll_interval_seconds}" | ||
| kubectl get pods | ||
|
|
||
| if kubectl wait job \ | ||
| --all \ | ||
| --for=condition=Complete \ | ||
| --timeout="${wait_timeout_seconds}s" 2>/dev/null \ | ||
| && kubectl wait pods \ | ||
| --all \ | ||
| --selector='!job-name' \ | ||
| --for=condition=Ready \ | ||
| --timeout="${wait_timeout_seconds}s" 2>/dev/null | ||
| then | ||
| echo "All jobs completed and services are ready." | ||
| return 0 | ||
| fi | ||
|
|
||
| if [[ ${SECONDS} -ge ${timeout_seconds} ]]; then | ||
| echo "ERROR: Timed out waiting for pods to be ready" | ||
| return 1 | ||
| fi | ||
|
|
||
| echo "---" | ||
| done | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check if the file exists and read the relevant lines
if [ -f "tools/deployment/package-helm/test.sh" ]; then
echo "File found. Reading lines 9-52:"
sed -n '9,52p' "tools/deployment/package-helm/test.sh"
else
echo "File not found at tools/deployment/package-helm/test.sh"
echo "Searching for test.sh files:"
find . -name "test.sh" -type f | head -20
fiRepository: y-scope/clp
Length of output: 1482
🏁 Script executed:
# Verify kubectl wait behavior with no jobs using kubectl documentation and basic testing logic
# We'll check the kubectl API behavior for wait command
# Check kubectl wait help documentation
kubectl wait --help 2>/dev/null | grep -A 20 "for=condition=" || echo "kubectl documentation check"
# Test the behavior logically:
# According to Kubernetes documentation, kubectl wait with --all
# should return 0 when there are no resources matching the selector
echo "Checking kubectl wait behavior assumptions..."
</shellRepository: y-scope/clp
Length of output: 199
🌐 Web query:
kubectl wait command behavior when no resources match selector empty namespace
💡 Result:
Short answer
- kubectl wait will not block if your selector matches no resources in the given namespace — it returns immediately with "No resources found" / "error: no matching resources found" (it only waits for resources that already exist). [1][2][3]
Why
- wait operates on the set of matching resources and only watches existing objects for the requested condition. If the selector/namespace yields an empty set, there's nothing to observe, so the command exits instead of waiting. [1][2]
Workarounds
- If you want to wait for a resource to be created, use --for=create. Example: kubectl wait --for=create -l app=foo pod -n myns --timeout=60s.
- Alternatively, wait on a higher-level object that already exists (e.g., Deployment with condition=available) or implement a loop that retries kubectl wait until resources appear. [1][2][3]
Sources
- kubectl wait reference. [1]
- Community discussion / issue explaining immediate return when selector matches nothing. [2]
- Stack Overflow examples of the "no matching resources found" behavior and mitigation. [3]
Critical issue: Function fails when namespace has no jobs.
The wait_for_pods function has a logic flaw that causes it to timeout instead of succeeding in deployments without Kubernetes Jobs. When kubectl wait job --all executes with no matching jobs, it immediately returns an error (not success) with "error: no matching resources found". Because the function uses && logic between the job wait and pod wait commands (lines 27-32), the entire condition fails when no jobs exist, forcing the loop to retry until the overall timeout is exceeded.
This breaks the function for common scenarios such as Helm charts that deploy only Pods/Services without Jobs. The function needs to handle empty resource sets gracefully, either by:
- Checking if jobs exist before attempting
kubectl wait job, or - Using conditional logic to treat "no resources found" as a success condition
The timeout handling logic is sound, but this condition check must be fixed before merging.
🤖 Prompt for AI Agents
In tools/deployment/package-helm/test.sh around lines 9-52, the current compound
condition uses `kubectl wait job --all` which fails when there are no jobs and
causes the function to timeout; update the logic to treat "no matching resources
found" as success by first checking for jobs (e.g., `kubectl get jobs
--no-headers -o name` and only run `kubectl wait job` if any exist) or by
running the wait and treating an exit indicating no resources as non-fatal
(capture exit code/stderr and proceed to the pods wait). Ensure the overall
condition succeeds when there are zero jobs so only the pod readiness check
blocks progress.
kirkrodrigues
left a comment
There was a problem hiding this comment.
Updated the PR title & body directly.
results-cachetoresults_cacheto match PV'shostPath.wait_for_pods()function to wait for jobs and pods to be ready.helm uninstall || truewithhelm uninstall --ignore-not-found.Description
Note
This PR is part of the ongoing work for #1309. More PRs will be submitted until the Helm chart is complete and fully functional.
Fix the directory name mismatch introduced in #1771, where the Kubernetes PV hostPath was changed from
results-cachetoresults_cachebut the test script wasn't updated to match.Also improve the Helm test script by adding a
wait_for_podsfunction that waits for all jobs to complete and all non-job pods to be ready before considering the test successful.Changes include:
results-cachetoresults_cacheto match the PV hostPath (fixes Update test.sh to use results_cache directory naming #1782)wait_for_pods()function with configurable timeout, poll interval, and wait timeout parameterskubectl waitto wait for job completion and pod readinesshelm uninstall test || truewithhelm uninstall test --ignore-not-foundfor cleaner error handlingChecklist
breaking change.
Validation performed
wait_for_podsfunction correctly waits for all pods to be readySummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.