Enhancement: refactor and unify checkFuseHealth implementation for all runtimes #4498
3c4150a to 1a76b7a
pkg/ctrl/fuse.go
Outdated
statusToUpdate.FuseNumberAvailable = int32(ds.Status.NumberAvailable)
if !reflect.DeepEqual(*statusToUpdate, currentStatus) {
    if err := retry.RetryOnConflict(retry.DefaultBackoff, func() (err error) {
        return e.client.Status().Update(context.TODO(), runtime)
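I'm concerned that RetryOnConflict is used here without re-fetching the latest version of the object inside the retry closure. If the update conflicts once, every retry submits the same stale object, so it may repeatedly fail due to persistent conflicts.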
fb16bec to d6eb19f
…ple dataset failed status with fuse status Signed-off-by: jiuyu <guotongyu.gty@alibaba-inc.com>
d6eb19f to a943089
cheyang left a comment
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cheyang
* Enhancement: refactor and unify checkFuseHealth implementation, decouple dataset failed status with fuse status (#4498)
  Signed-off-by: jiuyu <guotongyu.gty@alibaba-inc.com>
  Co-authored-by: jiuyu <guotongyu.gty@alibaba-inc.com>
  Signed-off-by: JiGuoDing <485204300@qq.com>
* Add Notation to UpdateDatasetStatus in pkg/ddc/jindo/dataset.go.
  Signed-off-by: JiGuoDing <485204300@qq.com>
* Delete Notation of UpdateDatasetStatus in pkg/ddc/jindo/dataset.go.
  Signed-off-by: JiGuoDing <485204300@qq.com>
* Add Notation to UpdateDatasetStatus in pkg/ddc/jindo/dataset.go.
  Signed-off-by: JiGuoDing <485204300@qq.com>
* Modify annotation of UpdateDatasetStatus in pkg/ddc/jindo/dataset.go.
  Signed-off-by: JiGuoDing <485204300@qq.com>
* Modify the comment format to double slashes.
  Signed-off-by: JiGuoDing <485204300@qq.com>

---------

Signed-off-by: jiuyu <guotongyu.gty@alibaba-inc.com>
Signed-off-by: JiGuoDing <485204300@qq.com>
Co-authored-by: Syspretor <32930733+Syspretor@users.noreply.github.com>
Co-authored-by: jiuyu <guotongyu.gty@alibaba-inc.com>



In the current code implementation, the runtime interface requires each runtime to implement the checkRuntimeHealth interface, which is used to periodically verify whether the runtime is in a healthy state. Each runtime's implementation of this interface includes the checkFuseHealth method, which checks the health status of the fuse daemonset. However, there are currently two issues with the checkFuseHealth implementation, including that each runtime has its own checkFuseHealth method, but Fluid actually provides a common implementation for checkFuseHealth.

This PR first refactors the common checkFuseHealth implementation. All runtimes now call this common function, and the main logic is consolidated into it. If the fuse daemonset status check fails, the function only emits a runtime warning event and does not cause the dataset to change to the failed status.
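The behaviour described above might look roughly like the following sketch. It is not the code from this PR. All types here (`fuseStatus`, `dataset`, `warningRecorder`), the phase string, the runtime names used in `main`, and the health criterion (any unavailable fuse pod counts as unhealthy) are assumptions made for illustration only.

```go
package main

import "fmt"

// fuseStatus is a hypothetical mirror of the fuse daemonset status fields.
type fuseStatus struct {
	NumberAvailable   int32
	NumberUnavailable int32
}

// dataset is a hypothetical dataset with only a phase field.
type dataset struct{ Phase string }

// warningRecorder is a hypothetical in-memory event recorder.
type warningRecorder struct{ warnings []string }

// Warn stores a warning event message.
func (r *warningRecorder) Warn(msg string) { r.warnings = append(r.warnings, msg) }

// checkFuseHealth is a single shared check that every runtime calls.
// Assumption: any unavailable fuse pod makes the check fail. On failure it
// only records a warning event; it never touches the dataset phase.
func checkFuseHealth(rec *warningRecorder, runtimeName string, st fuseStatus) bool {
	if st.NumberUnavailable > 0 {
		rec.Warn(fmt.Sprintf("runtime %s: %d fuse pod(s) unavailable",
			runtimeName, st.NumberUnavailable))
		return false
	}
	return true
}

func main() {
	rec := &warningRecorder{}
	ds := dataset{Phase: "Bound"} // illustrative phase value

	// Two illustrative runtimes share the same check instead of each
	// carrying its own copy.
	for _, name := range []string{"alluxio", "jindo"} {
		healthy := checkFuseHealth(rec, name,
			fuseStatus{NumberAvailable: 2, NumberUnavailable: 1})
		fmt.Println(name, "healthy:", healthy)
	}

	// The failed checks produced warnings, but the dataset phase is unchanged.
	fmt.Println("warnings:", len(rec.warnings), "dataset phase:", ds.Phase)
}
```

Keeping the check in one function means a change to the health criterion or to the event wording is made once, and the dataset phase stays under the control of whichever component owns dataset status.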