@Syspretor Syspretor commented Feb 20, 2025

As Fluid is increasingly adopted in production environments, the number of datasets within a cluster has reached the order of thousands. This places higher demands on the overall performance of the RuntimeController. The most critical metric is the time it takes for a dataset to become bound after creation, as this directly affects how quickly business workloads become ready. Extensive testing has shown that the time for a dataset to become bound increases exponentially with the number of datasets: in a cluster with thousands of datasets, the average time for a single dataset to become bound has reached 40 seconds. This PR therefore aims to optimize the performance of the RuntimeController in large-scale dataset clusters.

Testing has revealed that the main factor slowing the binding of a runtime and its dataset is the large amount of repeated enqueuing of runtimes/datasets. This keeps the reconcile workers continuously busy, leading to queue backlog and delays. There are two major sources of repeated enqueuing:

  1. All runtimes are enqueued by default at 90-second intervals. This action is intended to periodically sync the status of runtimes/datasets.
  2. Dataset updates lead to the corresponding runtime being enqueued as well.

This PR focuses on optimizing the controller's performance in handling runtimes by addressing these two issues. It introduces two configurable environment variables for the controller: FLUID_RUNTIME_RECONCILE_DURATION and FLUID_RUNTIME_RECONCILE_DURATION_OFFSET.

For runtimes that do not require immediate dataset cache status updates, FLUID_RUNTIME_RECONCILE_DURATION (in seconds, default is 90) can be used to control the default reconcile interval. Increasing this interval appropriately can help reduce the pressure on the reconcile queue.

env: 
- name: FLUID_RUNTIME_RECONCILE_DURATION
  value: "180"

When FLUID_RUNTIME_RECONCILE_DURATION is set to -1, runtimes such as ThinRuntime, which do not support reporting cache stats, are no longer reconciled periodically. Updates to the dataset, runtime, or runtime workload still trigger the runtime to be enqueued for reconciliation.
These changes are intended to significantly reduce queue pressure and improve performance.

env: 
- name: FLUID_RUNTIME_RECONCILE_DURATION
  value: "-1"
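
For illustration, here is a minimal Go sketch of how the controller can interpret this variable; the helper name getRequeueDuration and the exact fallback behavior are assumptions for this sketch, not necessarily the PR's actual code.

package utils

import (
	"os"
	"strconv"
	"time"
)

// getRequeueDuration reads FLUID_RUNTIME_RECONCILE_DURATION and reports
// whether periodic reconciliation is enabled and, if so, the requeue interval.
func getRequeueDuration() (enabled bool, d time.Duration) {
	val := os.Getenv("FLUID_RUNTIME_RECONCILE_DURATION")
	if val == "" {
		return true, 90 * time.Second // unset: fall back to the 90s default
	}
	seconds, err := strconv.Atoi(val)
	if err != nil {
		return true, 90 * time.Second // malformed value: keep the default
	}
	if seconds == -1 {
		return false, 0 // -1 disables periodic requeueing entirely
	}
	return true, time.Duration(seconds) * time.Second
}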

FLUID_RUNTIME_RECONCILE_DURATION_OFFSET is designed for scenarios where datasets/runtimes are created in batches. In such scenarios, all runtimes are enqueued simultaneously and, after processing, are re-enqueued after the same interval set by FLUID_RUNTIME_RECONCILE_DURATION, which leads to bursty and therefore suboptimal utilization of the reconcile workers. Configuring FLUID_RUNTIME_RECONCILE_DURATION_OFFSET together with FLUID_RUNTIME_RECONCILE_DURATION sets a random requeue interval within a specified range. For instance, the following configuration enqueues runtimes at a random interval between 80 and 160 seconds, staggering the re-enqueue times of batch-created datasets/runtimes and improving the utilization of reconcile workers.

env: 
- name: FLUID_RUNTIME_RECONCILE_DURATION
  value: "120"
- name: FLUID_RUNTIME_RECONCILE_DURATION_OFFSET
  value: "40"
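
For illustration, a minimal Go sketch of how such a randomized requeue interval can be derived from the two variables; the function name and the handling of a non-positive offset are assumptions, not necessarily the PR's exact code.

package utils

import (
	"math/rand"
	"time"
)

// randomRequeueDuration picks an interval uniformly from
// [duration-offset, duration+offset] seconds. With duration=120 and
// offset=40, the result lies between 80s and 160s.
func randomRequeueDuration(duration, offset int) time.Duration {
	if offset <= 0 {
		return time.Duration(duration) * time.Second
	}
	seconds := rand.Intn(2*offset+1) + duration - offset
	return time.Duration(seconds) * time.Second
}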

In addition, this PR optimizes the logic for updating a dataset's status during the sync process, preventing runtimes from being passively enqueued by frequent dataset updates.

Testing has shown that, after optimization, the time taken for a dataset to become bound in a cluster with thousands of datasets has been reduced from an average of over 40 seconds to just over 1 second. Furthermore, this performance is maintained even in larger cluster sizes.

@Syspretor Syspretor requested review from TrafalgarZZZ and cheyang and removed request for cheyang February 20, 2025 02:37
@Syspretor Syspretor force-pushed the enhancement/improve-runtime-controller-in-large-scale-scenarios branch from 50768b5 to bb2e883 on February 20, 2025 02:38
@Syspretor Syspretor force-pushed the enhancement/improve-runtime-controller-in-large-scale-scenarios branch 2 times, most recently from 3082e72 to 1f4e010 on February 21, 2025 01:58
Signed-off-by: jiuyu <guotongyu.gty@alibaba-inc.com>
@Syspretor Syspretor force-pushed the enhancement/improve-runtime-controller-in-large-scale-scenarios branch from 1f4e010 to bb0ebbd on February 21, 2025 02:12

@Syspretor Syspretor requested a review from cheyang February 21, 2025 02:36
return
}

duration, err := strconv.Atoi(RuntimeReconcileDurationEnvVal)

If this function is called frequently (such as during each Reconcile), repeatedly parsing the string wastes CPU. I suggest doing the parsing once, e.g. in an init function.
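
A minimal sketch of this suggestion, with illustrative names: the value is parsed once at package initialization and cached in a variable that later calls can read directly.

package utils

import (
	"os"
	"strconv"
)

// runtimeReconcileDuration caches the parsed value so that each Reconcile
// does not re-parse the environment variable.
var runtimeReconcileDuration = 90 // seconds, default

func init() {
	if v := os.Getenv("FLUID_RUNTIME_RECONCILE_DURATION"); v != "" {
		if d, err := strconv.Atoi(v); err == nil {
			runtimeReconcileDuration = d
		}
	}
}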

return finishTime.Sub(creationTime).Round(time.Second).String()
}

func GenerateRandomRequeueDurationFromEnv() (needReconcile bool, d time.Duration) {

GenerateRandomRequeueDurationFromEnv -> GenerateRandomRequeueDuration

return
}
r := rand.New(rand.NewSource(time.Now().UnixNano()))
randomDurationValue := (r.Intn(2*offset+1) + duration - offset)

Why not reuse the global math/rand instance directly? That would avoid repeatedly creating a rand.Source and rand.Rand.
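
Sketched with the package-level math/rand functions, which are safe for concurrent use and automatically seeded since Go 1.20, so no per-call rand.Source is needed (names follow the snippet above):

// Package-level rand; no new rand.Source / rand.Rand created per call.
randomDurationValue := rand.Intn(2*offset+1) + duration - offset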


cheyang commented Feb 21, 2025

@Syspretor Thank you for providing this useful solution. I think it would be better to also provide documentation for end users.

@cheyang cheyang left a comment

/lgtm
/approve

fluid-e2e-bot bot commented Feb 23, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheyang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cheyang cheyang merged commit 7e917c7 into fluid-cloudnative:master Feb 23, 2025
13 of 14 checks passed