Description
What happened + What you expected to happen
Ray Data currently lacks support for displaying execution plans and does not provide operator-level metrics within those plans.
Specifically:
- No Standard API for Execution Plans: Ray Data does not offer a standard API (e.g., an `explain()` method) to visualize the execution plan, covering both the logical and physical plans.
- Limited Plan Visibility: While `print(ds)` displays aspects of the logical plan, it does not reveal the physical execution plan.
- Insufficient Operator-Level Metrics: The execution plan view lacks detailed metrics for individual operators. This makes it difficult to assess critical performance aspects such as:
  - The volume of data processed by each operator.
  - Whether predicate pushdown optimizations have been successfully applied.
- Coarse-Grained Dataset Statistics: The metrics provided by the `Dataset.stats()` API are not sufficiently granular. They are reported at the block level and lack comprehensive, global statistics such as total input/output row counts.
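To make the last two points concrete, here is a minimal sketch of the kind of global, per-operator summary being requested. The `BlockStats` record and `summarize()` helper are purely illustrative and not part of the Ray Data API; they only show how block-level metrics could be rolled up into per-operator totals such as total input/output row counts:

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    """Hypothetical per-block record; Ray Data reports metrics at this granularity."""
    operator: str
    rows_in: int
    rows_out: int
    size_bytes: int


def summarize(blocks):
    """Aggregate block-level stats into global, per-operator totals.

    This is the kind of summary (e.g., total input/output rows per operator)
    that Dataset.stats() does not currently expose.
    """
    summary = {}
    for b in blocks:
        op = summary.setdefault(
            b.operator, {"rows_in": 0, "rows_out": 0, "size_bytes": 0}
        )
        op["rows_in"] += b.rows_in
        op["rows_out"] += b.rows_out
        op["size_bytes"] += b.size_bytes
    return summary


# Toy data loosely mirroring the stats dump below: 200 blocks, 1 surviving row.
blocks = [
    BlockStats("ReadLance->Filter", 100, 1, 18327),
    BlockStats("ReadLance->Filter", 100, 0, 0),
    BlockStats("limit=20", 1, 1, 18327),
]
print(summarize(blocks))
```

With totals like these, it would be immediately visible that the filter discarded nearly all input rows, which is exactly the signal needed to judge whether predicate pushdown took effect.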
As a concrete example, here is the `Dataset.stats()` output for a `ReadLance->Filter` pipeline with `limit=20`:

```
Operator 1 ReadLance->Filter(NoneType): 200 tasks executed, 200 blocks produced in 29967.49s
* Remote wall time: 92.49ms min, 2.92s max, 1.22s mean, 243.21s total
* Remote cpu time: 7.36ms min, 418.41ms max, 208.63ms mean, 41.73s total
* UDF time: 305.39us min, 1.28s max, 150.0ms mean, 30.0s total
* Peak heap memory usage (MiB): 407.04 min, 683.52 max, 527 mean
* Output num rows per block: 0 min, 1 max, 0 mean, 1 total
* Output size bytes per block: 0 min, 18327 max, 91 mean, 18327 total
* Output rows per task: 0 min, 1 max, 0 mean, 200 tasks used
* Tasks per node: 14 min, 25 max, 20 mean; 10 nodes used
* Operator throughput:
    * Ray Data throughput: 3.336949276338555e-05 rows/s
    * Estimated single node throughput: 0.00411168768564564 rows/s

Operator 2 limit=20: 200 tasks executed, 200 blocks produced in 29967.49s
* Remote wall time: 92.49ms min, 2.92s max, 1.22s mean, 243.21s total
* Remote cpu time: 7.36ms min, 418.41ms max, 208.63ms mean, 41.73s total
* UDF time: 305.39us min, 1.28s max, 150.0ms mean, 30.0s total
* Peak heap memory usage (MiB): 407.04 min, 683.52 max, 527 mean
* Output num rows per block: 0 min, 1 max, 0 mean, 1 total
* Output size bytes per block: 0 min, 18327 max, 91 mean, 18327 total
* Output rows per task: 0 min, 1 max, 0 mean, 200 tasks used
* Tasks per node: 14 min, 25 max, 20 mean; 10 nodes used
* Operator throughput:
    * Ray Data throughput: 3.336949276338555e-05 rows/s
    * Estimated single node throughput: 0.00411168768564564 rows/s

Dataset iterator time breakdown:
* Total time overall: 7.87s
* Total time in Ray Data iterator initialization code: 713.05ms
* Total time user thread is blocked by Ray Data iter_batches: 7.15s
* Total execution time for user thread: 1.65ms
* Batch iteration time breakdown (summed across prefetch threads):
    * In ray.get(): 150.7us min, 6.41ms max, 636.5us avg, 127.3ms total
    * In batch creation: 4.48us min, 4.48us max, 4.48us avg, 4.48us total
    * In batch formatting: 26.3us min, 26.3us max, 26.3us avg, 26.3us total

Dataset throughput:
* Ray Data throughput: 3.336949276338555e-05 rows/s
* Estimated single node throughput: 0.00205584384282282 rows/s
```
Is this because detailed execution plans and execution metrics were not considered in the design of Ray Data, or are they simply not supported yet?
Versions / Dependencies
ray 2.40.0
Reproduction script
None
Issue Severity
None