feat(helm): Add readiness and liveness probes to query-scheduler (resolves #1813). #1816
sitaowang1998
left a comment
LGTM.
However, I do feel that in the long term, as our system hits scalability limits, the liveness check will start failing even though the system is still functioning, just slowly. A restart in that case would further degrade the system, and could propagate the issue until the whole system stops functioning. This is well documented in the k8s docs.
A naïve check on a serving port is fine for now, but we need to come up with a better solution that works under heavy load.
Description
Note
This PR is part of the ongoing work for #1309. More PRs will be submitted until the Helm chart is complete and fully functional.
Add readiness and liveness probes to the query-scheduler deployment to improve reliability and enable Kubernetes to properly manage the pod lifecycle. This aligns the Helm deployment with the healthcheck settings already defined in the docker-compose configuration.
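For illustration, the probes described in this PR might look roughly like the fragment below in the query-scheduler container spec. This is a minimal sketch, not the actual diff: the tcpSocket check, port 7000, and the helper names clp.readinessProbeTimings and clp.livenessProbeTimings come from this description, while the indentation, the context passed to each helper (`.`), and the assumption that each helper renders only timing fields are guesses.

```yaml
# Sketch only: a fragment of a container spec, shown at top-level indentation.
# Assumes each helper emits unindented timing fields for its probe.
readinessProbe:
  {{- include "clp.readinessProbeTimings" . | nindent 2 }}
  tcpSocket:
    port: 7000  # query-scheduler internal port
livenessProbe:
  {{- include "clp.livenessProbeTimings" . | nindent 2 }}
  tcpSocket:
    port: 7000
```

Keeping the timings in shared helpers would let every component in the chart tune its probe behaviour in one place, while each deployment declares only its own check.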
The probes use:
- tcpSocket check on port 7000 (the query-scheduler internal port)
- clp.readinessProbeTimings and clp.livenessProbeTimings helpers
Checklist
breaking change.
Validation performed