[Data] Added tracking of block serialization time #60574
[Data] Added tracking of block serialization time #60574alexeykudinkin merged 19 commits intomasterfrom
Conversation
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Updated `task_completion_time_excl_backpressure_s` to include block serde time Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
…des serde time; Updated tests; Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
There was a problem hiding this comment.
Code Review
This pull request introduces tracking for block serialization time, which is a valuable addition for performance monitoring in Ray Data. The core change involves modifying the generator execution flow in _raylet.pyx to feed back serialization duration to the caller using gen.send(). This new metric is then integrated throughout the data stack, including BlockExecStats, OpRuntimeMetrics, and relevant operators like MapOperator and HashShuffleOperator. The tests have also been updated to cover these new metrics.
The implementation is clean and effective. The use of yield expressions to pass data back into generators is a good pattern for this use case. I have one minor suggestion to improve code clarity.
| def udf_time_s(self, reset: bool) -> float: | ||
| cur_time_s = self._udf_time_s | ||
| self._udf_time_s = 0 | ||
| return cur_time_s |
There was a problem hiding this comment.
The reset parameter in udf_time_s is not used in the function body; the timer self._udf_time_s is always reset to 0. Since the only call site in map_operator.py passes reset=True, the behavior is correct for now. However, to improve clarity and prevent potential misuse in the future, I suggest removing the reset parameter from the method signature and updating the call in map_operator.py:772 to map_transformer.udf_time_s().
| def udf_time_s(self, reset: bool) -> float: | |
| cur_time_s = self._udf_time_s | |
| self._udf_time_s = 0 | |
| return cur_time_s | |
| def udf_time_s(self) -> float: | |
| cur_time_s = self._udf_time_s | |
| self._udf_time_s = 0 | |
| return cur_time_s |
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
edoakes
left a comment
There was a problem hiding this comment.
From offline discussion, core change looks ok to me. Let's consider it a private API for now. I will follow up to add a disclaimer & context in code comments.
Defer to others to review the data changes. You may want to consider an alternative name to "serialization time" since it is not purely serialization, but also includes memory allocation & copy time. Perhaps "block_write_time_s" or "block_output_time_s"
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
|
|
||
| @dataclass(frozen=True) | ||
| class StreamingGeneratorStats: | ||
| object_creation_dur_s: float |
There was a problem hiding this comment.
Can we rename this to object_serialization_time_s
There was a problem hiding this comment.
See the conversation above. This is not just serialization time, it includes object serialization time
| self._init_fn = init_fn if init_fn is not None else lambda: None | ||
| self._output_block_size_option_override = output_block_size_option_override | ||
| self._udf_time = 0 | ||
| self._udf_time_s = 0 |
There was a problem hiding this comment.
Nit: Maybe a type var to denote secs?
There was a problem hiding this comment.
That's what _s prefix is for
| # Yield block and retrieve its Ray object serialization timing | ||
| stats: StreamingGeneratorStats = yield block | ||
| if stats: | ||
| exec_stats.block_ser_time_s = stats.object_creation_dur_s |
There was a problem hiding this comment.
when operators are fused, can we assert in that test this value is 0?
There was a problem hiding this comment.
That won't be 0
Changes --- 1. Modified Ray Core's generator handling sequence to inject back object creation & serialization durations 2. Updated `task_completion_time_excl_backpressure_s` to track both UDF block generation time AND block serialization overhead ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: 400Ping <jiekaichang@apache.org>
Changes --- 1. Modified Ray Core's generator handling sequence to inject back object creation & serialization durations 2. Updated `task_completion_time_excl_backpressure_s` to track both UDF block generation time AND block serialization overhead ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: Sirui Huang <ray.huang@anyscale.com>
Changes --- 1. Modified Ray Core's generator handling sequence to inject back object creation & serialization durations 2. Updated `task_completion_time_excl_backpressure_s` to track both UDF block generation time AND block serialization overhead ## Related issues > Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Changes --- 1. Modified Ray Core's generator handling sequence to inject back object creation & serialization durations 2. Updated `task_completion_time_excl_backpressure_s` to track both UDF block generation time AND block serialization overhead ## Related issues > Link related issues: "Fixes #1234", "Closes #1234", or "Related to #1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com>
Changes --- 1. Modified Ray Core's generator handling sequence to inject back object creation & serialization durations 2. Updated `task_completion_time_excl_backpressure_s` to track both UDF block generation time AND block serialization overhead ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: Adel Nour <ans9868@nyu.edu>
Changes --- 1. Modified Ray Core's generator handling sequence to inject back object creation & serialization durations 2. Updated `task_completion_time_excl_backpressure_s` to track both UDF block generation time AND block serialization overhead ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: peterxcli <peterxcli@gmail.com>
Changes --- 1. Modified Ray Core's generator handling sequence to inject back object creation & serialization durations 2. Updated `task_completion_time_excl_backpressure_s` to track both UDF block generation time AND block serialization overhead ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Alexey Kudinkin <ak@anyscale.com> Signed-off-by: peterxcli <peterxcli@gmail.com>
Changes
task_completion_time_excl_backpressure_sto track both UDF block generation time AND block serialization overheadRelated issues
Additional information