@Syspretor
Collaborator

In the current implementation, the runtime Interface requires each runtime to implement checkRuntimeHealth, which periodically verifies that the runtime is healthy. Each runtime's implementation includes a checkFuseHealth method that checks the health of the fuse daemonset. The current checkFuseHealth implementations have two problems:

  1. Each runtime implements its own copy of checkFuseHealth, even though Fluid already provides a common implementation.
  2. The fuse status affects the bound status of the Dataset. For example, if some pods of the fuse daemonset are not Ready, the entire dataset reverts from bound to failed, which is unreasonable.

This PR refactors the common checkFuseHealth implementation and makes every runtime call it, so the main logic lives in one place. In the common function, a failed fuse daemonset status check now only produces a warning event on the runtime and no longer moves the dataset to the failed status.
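The intended behavior can be sketched as follows. This is a minimal, standard-library-only illustration and not the code in this PR: `FuseStatus`, `Event`, `DatasetPhase`, `CheckFuseHealth`, and `CheckRuntimeHealth` are hypothetical stand-ins, and the readiness condition is an assumption chosen for the example.

```go
package main

import "fmt"

// FuseStatus holds the few daemonset status fields this sketch looks at.
// It is a hypothetical stand-in for the real Kubernetes DaemonSet status.
type FuseStatus struct {
	DesiredNumberScheduled int32
	NumberReady            int32
	NumberUnavailable      int32
}

// Event is a hypothetical stand-in for an event recorded on the runtime.
type Event struct {
	Type    string
	Reason  string
	Message string
}

// DatasetPhase is a hypothetical stand-in for the Dataset phase.
type DatasetPhase string

const (
	PhaseBound  DatasetPhase = "Bound"
	PhaseFailed DatasetPhase = "Failed"
)

// CheckFuseHealth is the single shared check that every runtime would call.
// It reports whether the fuse daemonset is fully ready and, if it is not,
// returns a warning event. It never touches the dataset phase.
// The readiness condition below is an assumption made for this sketch.
func CheckFuseHealth(name string, st FuseStatus) (bool, *Event) {
	if st.NumberUnavailable > 0 || st.NumberReady < st.DesiredNumberScheduled {
		return false, &Event{
			Type:   "Warning",
			Reason: "FuseUnhealthy",
			Message: fmt.Sprintf("fuse daemonset %s: %d of %d pods ready",
				name, st.NumberReady, st.DesiredNumberScheduled),
		}
	}
	return true, nil
}

// CheckRuntimeHealth shows how a runtime would use the shared check:
// an unhealthy fuse daemonset adds a warning event, and the dataset
// phase passed in is returned unchanged.
func CheckRuntimeHealth(phase DatasetPhase, name string, st FuseStatus) (DatasetPhase, []Event) {
	var events []Event
	if ready, ev := CheckFuseHealth(name, st); !ready {
		events = append(events, *ev)
	}
	return phase, events
}

func main() {
	phase, events := CheckRuntimeHealth(PhaseBound, "demo-fuse",
		FuseStatus{DesiredNumberScheduled: 3, NumberReady: 2, NumberUnavailable: 1})
	fmt.Println("dataset phase:", phase)
	for _, ev := range events {
		fmt.Printf("%s %s: %s\n", ev.Type, ev.Reason, ev.Message)
	}
}
```

The point of the sketch is the separation of concerns: the fuse check produces an event, and only other logic (not shown) may decide the dataset phase.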

@Syspretor force-pushed the enhancement/refactor-and-unify-check-fuse-health branch 5 times, most recently from 3c4150a to 1a76b7a (February 14, 2025 01:38)
@cheyang
Collaborator

cheyang commented Feb 14, 2025

pkg/ctrl/fuse.go (outdated)
    statusToUpdate.FuseNumberAvailable = int32(ds.Status.NumberAvailable)
    if !reflect.DeepEqual(*statusToUpdate, currentStatus) {
        if err := retry.RetryOnConflict(retry.DefaultBackoff, func() (err error) {
            return e.client.Status().Update(context.TODO(), runtime)

I'm concerned that if RetryOnConflict is used without re-fetching the latest version of the object inside the retry function, it may fail repeatedly due to persistent conflicts.

@Syspretor force-pushed the enhancement/refactor-and-unify-check-fuse-health branch 3 times, most recently from fb16bec to d6eb19f (February 14, 2025 05:50)
…ple dataset failed status with fuse status

Signed-off-by: jiuyu <guotongyu.gty@alibaba-inc.com>
@Syspretor force-pushed the enhancement/refactor-and-unify-check-fuse-health branch from d6eb19f to a943089 (February 18, 2025 03:43)

@cheyang cheyang left a comment

/lgtm
/approve

@fluid-e2e-bot
fluid-e2e-bot bot commented Feb 18, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheyang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fluid-e2e-bot fluid-e2e-bot bot merged commit f6191ea into fluid-cloudnative:master Feb 18, 2025
14 checks passed
fluid-e2e-bot bot pushed a commit that referenced this pull request Feb 20, 2025
* Enhancement: refactor and unify checkFuseHealth implementation, decouple dataset failed status with fuse status (#4498)

Signed-off-by: jiuyu <guotongyu.gty@alibaba-inc.com>
Co-authored-by: jiuyu <guotongyu.gty@alibaba-inc.com>
Signed-off-by: JiGuoDing <485204300@qq.com>

* Add Notation to UpdateDatasetStatus in pkg/ddc/jindo/dataset.go.

Signed-off-by: JiGuoDing <485204300@qq.com>

* Delete Notation of UpdateDatasetStatus in pkg/ddc/jindo/dataset.go.

Signed-off-by: JiGuoDing <485204300@qq.com>

* Add Notation to UpdateDatasetStatus in pkg/ddc/jindo/dataset.go.

Signed-off-by: JiGuoDing <485204300@qq.com>

* Modify annotation of UpdateDatasetStatus in pkg/ddc/jindo/dataset.go.

Signed-off-by: JiGuoDing <485204300@qq.com>

* Modify the comment format to double slashes.

Signed-off-by: JiGuoDing <485204300@qq.com>

---------

Signed-off-by: jiuyu <guotongyu.gty@alibaba-inc.com>
Signed-off-by: JiGuoDing <485204300@qq.com>
Co-authored-by: Syspretor <32930733+Syspretor@users.noreply.github.com>
Co-authored-by: jiuyu <guotongyu.gty@alibaba-inc.com>