[Data] Fix ProgressBar with use_ray_tqdm #59996
Conversation
Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com>
Code Review
This pull request correctly fixes the ProgressBar to honor the use_ray_tqdm setting from DataContext, especially in non-interactive terminals. The logic change is sound and the accompanying test updates are thorough. I've added a few comments to further improve code quality and maintainability, including a suggestion to move an import to the top level for better code organization, adding a missing assertion in a test for completeness, and refactoring duplicated logic in the tests to make them more maintainable.
Reviewed snippet:

```python
self._use_logging = False
```

```python
from ray.data.context import DataContext
```
```python
    assert pb._use_logging is False
else:
    # Without tqdm_ray, falls back to logging
    assert pb._bar is None
```
This assertion is incomplete. When `use_ray_tqdm` is False in a non-interactive terminal, `_use_logging` is set to True in the `ProgressBar` constructor. You should also assert this behavior to make the test more robust.
Suggested change:

```python
assert pb._bar is None
assert pb._use_logging is True
```
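The fallback behavior this comment describes can be sketched as a small decision function. This is a hypothetical illustration, not Ray's actual `ProgressBar` constructor; `select_progress_mode` and its return shape are made up for clarity.

```python
# Hypothetical sketch of the fallback the review comment describes; the real
# ProgressBar constructor in Ray differs, and these names are illustrative.
def select_progress_mode(use_ray_tqdm: bool, interactive: bool) -> dict:
    """Decide which progress mechanism a bar would use."""
    if use_ray_tqdm:
        # tqdm_ray reports progress via structured updates to the driver,
        # so a bar exists even without an interactive terminal.
        return {"bar": "tqdm_ray", "use_logging": False}
    if interactive:
        # Plain tqdm is fine when stdout is a real TTY.
        return {"bar": "tqdm", "use_logging": False}
    # Non-interactive terminal without tqdm_ray: fall back to logging.
    return {"bar": None, "use_logging": True}

# Both assertions the comment asks for hold in this sketch:
mode = select_progress_mode(use_ray_tqdm=False, interactive=False)
assert mode["bar"] is None
assert mode["use_logging"] is True
```
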
```python
if enable_tqdm_ray:
    # tqdm_ray works in non-interactive contexts
    assert pb._bar is not None
    assert pb._use_logging is False
else:
    # Without tqdm_ray, falls back to logging
    assert pb._bar is None
    assert pb._use_logging is True
```
```python
# Reset mock to clear the "progress bar disabled" log call
mock_logger.info.reset_mock()

# Update progress - should log
pb.update(5)
# Verify logger.info was called exactly twice with expected messages
assert mock_logger.info.call_count == 2
mock_logger.info.assert_any_call("=== Ray Data Progress {test} ===")
mock_logger.info.assert_any_call("test: Progress Completed 5 / 10")
```
This test logic is very similar to `test_progress_bar_logging_in_non_interactive_terminal_without_total`. There's a lot of duplicated code between these two tests due to the if/else structure for `enable_tqdm_ray`. To improve maintainability and reduce redundancy, consider refactoring this shared logic. You could either combine them into a single test parametrized with `total` and the expected log string, or extract the common testing logic into a helper function.
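The helper-function variant of this suggestion could look roughly like the sketch below. `make_progress_bar` is a placeholder stand-in for constructing a `ProgressBar` in a non-interactive terminal, not Ray's actual test code.

```python
# Illustrative refactor: extract the shared if/else assertions into one
# helper. make_progress_bar is a hypothetical stand-in, not Ray's API.
from types import SimpleNamespace

def make_progress_bar(enable_tqdm_ray: bool) -> SimpleNamespace:
    """Stand-in for building a ProgressBar in a non-interactive terminal."""
    if enable_tqdm_ray:
        return SimpleNamespace(_bar=object(), _use_logging=False)
    return SimpleNamespace(_bar=None, _use_logging=True)

def check_non_interactive_state(pb, enable_tqdm_ray: bool) -> None:
    """Shared assertions extracted from both tests."""
    if enable_tqdm_ray:
        # tqdm_ray works in non-interactive contexts
        assert pb._bar is not None
        assert pb._use_logging is False
    else:
        # Without tqdm_ray, falls back to logging
        assert pb._bar is None
        assert pb._use_logging is True

for flag in (True, False):
    check_non_interactive_state(make_progress_bar(flag), flag)
```

The same table of expected states could instead feed `@pytest.mark.parametrize`, which is the other option the comment mentions.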
bveeramani left a comment
I think this PR will reintroduce the issue where Ray Data writes garbled outputs to log files.
Rather than determining what type of progress bar to use in `ProgressBar` (which could be constructed in an actor), an alternative could be to choose the progress bar type on the driver and propagate it. I'm not sure how complicated that would currently be to implement, though.
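The driver-side alternative described here could be sketched as follows. The enum and function names are hypothetical; the point is only that the TTY check happens once, where a real terminal can actually be detected, and the result travels to workers.

```python
# Hypothetical sketch of choosing the bar type on the driver and
# propagating it; BarType and choose_bar_type_on_driver are not Ray APIs.
import sys
from enum import Enum

class BarType(Enum):
    RAY_TQDM = "ray_tqdm"
    PLAIN_TQDM = "tqdm"
    LOGGING = "logging"

def choose_bar_type_on_driver(use_ray_tqdm: bool) -> BarType:
    """Decide once, on the driver, which bar every worker should build."""
    if use_ray_tqdm:
        return BarType.RAY_TQDM
    if sys.stdout.isatty():
        # The driver has a real terminal, so plain tqdm renders cleanly.
        return BarType.PLAIN_TQDM
    # Driver stdout is redirected (e.g., to a log file): log progress
    # instead of emitting tqdm control sequences.
    return BarType.LOGGING

# The chosen value would then be shipped with the execution plan to any
# actor (e.g., a SplitCoordinator) that constructs a ProgressBar.
```
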
```python
# Exception: tqdm_ray is designed to work in non-interactive contexts
# (workers/actors) by sending JSON progress updates to the driver.
```
I don't think this is true.
`tqdm_ray` is designed to avoid spamming the terminal when you use a task or actor. It doesn't work with non-interactive terminals (e.g., log files).
For example, here's what I see when I write the output to a log file using this PR:
2026-01-09 10:16:07,305 INFO worker.py:2001 -- Started a local Ray instance.
/Users/balaji/ray/python/ray/_private/worker.py:2040: FutureWarning: Tip: In future versions of Ray, Ray will no longer override accelerator visible devices env var if num_gpus=0 or num_gpus=None (default). To enable this behavior and turn off this error message, set RAY_ACCEL_ENV_VAR_OVERRIDE_ON_ZERO=0
warnings.warn(
2026-01-09 10:16:08,726 INFO logging.py:397 -- Registered dataset logger for dataset dataset_2_0
2026-01-09 10:16:08,734 INFO streaming_executor.py:182 -- Starting execution of Dataset dataset_2_0. Full logs are in /tmp/ray/session_2026-01-09_10-16-05_275216_89693/logs/ray-data
2026-01-09 10:16:08,734 INFO streaming_executor.py:183 -- Execution plan of Dataset dataset_2_0: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadRange->Map(sleep)->Project] -> AggregateNumRows[AggregateNumRows]
2026-01-09 10:16:08,865 WARNING resource_manager.py:134 -- ⚠️ Ray's object store is configured to use only 4.7% of available memory (2.0GiB out of 42.6GiB total). For optimal Ray Data performance, we recommend setting the object store to at least 50% of available memory. You can do this by setting the 'object_store_memory' parameter when calling ray.init() or by setting the RAY_DEFAULT_OBJECT_STORE_MEMORY_PROPORTION environment variable.
2026-01-09 10:16:08,865 INFO streaming_executor.py:661 -- [dataset]: A new progress UI is available. To enable, set `ray.data.DataContext.get_current().enable_rich_progress_bars = True` and `ray.data.DataContext.get_current().use_ray_tqdm = False`.
Running Dataset dataset_2_0.: 0.00 row [00:00, ? row/s]
- ReadRange->Map(sleep)->Project: 0%| | 0.00/1.00 [00:00<?, ? row/s]�[A
- AggregateNumRows: 0%| | 0.00/1.00 [00:00<?, ? row/s]�[A�[A2026-01-09 10:16:08,883 WARNING resource_manager.py:791 -- Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadRange->Map(sleep)->Project]. The job may hang forever unless the cluster scales up.
2026-01-09 10:16:08,905 WARNING utils.py:33 -- Truncating long operator name to 100 characters. To disable this behavior, set `ray.data.DataContext.get_current().DEFAULT_ENABLE_PROGRESS_BAR_NAME_TRUNCATION = False`.
Running Dataset: dataset_2_0. Active & requested resources: 2/10 CPU, 768.0MiB/1.0GiB object store: : 0.00 row [00:01, ? row/s]
Running Dataset: dataset_2_0. Active & requested resources: 2/10 CPU, 768.0MiB/1.0GiB object store: : 0.00 row [00:01, ? row/s]
- ReadRange->...->Project: Tasks: 2 [backpressured:tasks(ConcurrencyCap)]; Actors: 0; Queued blocks: 98 (0.0B); Resources: 2.0 CPU, 768.0MiB object store: 0%| | 0.00/1.00 [00:01<?, ? row/s]�[A
- ReadRange->...->Project: Tasks: 2 [backpressured:tasks(ConcurrencyCap)]; Actors: 0; Queued blocks: 98 (0.0B); Resources: 2.0 CPU, 768.0MiB object store: 0%| | 0.00/1.00 [00:01<?, ? row/s]�[A
- AggregateNumRows: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 0%| | 0.00/1.00 [00:01<?, ? row/s]�[A�[A
- AggregateNumRows: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 0%| | 0.00/1.00 [00:01<?, ? row/s]�[A�[A
Running Dataset: dataset_2_0. Active & requested resources: 10/10 CPU, 80.0B/1.0GiB object store: : 0.00 row [00:02, ? row/s]
Running Dataset: dataset_2_0. Active & requested resources: 10/10 CPU, 80.0B/1.0GiB object store: : 0.00 row [00:02, ? row/s]
- ReadRange->...->Project: Tasks: 10 [backpressured:tasks(ResourceBudget)]; Actors: 0; Queued blocks: 88 (0.0B); Resources: 10.0 CPU, 80.0B object store: 0%| | 0.00/1.00 [00:02<?, ? row/s] �[A
- ReadRange->...->Project: Tasks: 10 [backpressured:tasks(ResourceBudget)]; Actors: 0; Queued blocks: 88 (0.0B); Resources: 10.0 CPU, 80.0B object store: 0%| | 0.00/100 [00:02<?, ? row/s] �[A
- ReadRange->...->Project: Tasks: 10 [backpressured:tasks(ResourceBudget)]; Actors: 0; Queued blocks: 88 (0.0B); Resources: 10.0 CPU, 80.0B object store: 2%|▏ | 2.00/100 [00:02<01:43, 1.06s/ row]�[A
- ReadRange->...->Project: Tasks: 10 [backpressured:tasks(ResourceBudget)]; Actors: 0; Queued blocks: 88 (0.0B); Resources: 10.0 CPU, 80.0B object store: 2%|▏ | 2.00/100 [00:02<01:43, 1.06s/ row]�[A
Ah, the issue was that the progress bar broke when dataset execution was driven from a SplitCoordinator. With this PR, it works now.
ec2-user@ip-172-31-41-154$ pytest -s test_data_integration.py::test_parquet_file_shuffle_with_executions
======================================================== test session starts ========================================================
platform linux -- Python 3.10.19, pytest-8.3.3, pluggy-1.5.0
rootdir: /home/ec2-user/myworkspace/ray
configfile: pytest.ini
plugins: anyio-4.6.0, lazy-fixtures-1.4.0
collected 2 items
test_data_integration.py 2026-01-10 00:48:24,254 INFO worker.py:1992 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
Parquet dataset sampling: 100%|████████████████████████████████████████████████████████████████| 2.00/2.00 [00:00<00:00, 5.01 file/s]
2026-01-10 00:48:25,039 INFO parquet_datasource.py:1046 -- Estimated parquet encoding ratio is 0.235.
2026-01-10 00:48:25,039 INFO parquet_datasource.py:1106 -- Estimated parquet reader batch size at 4007393 rows
Parquet dataset sampling: 100%|█████████████████████████████████████████████████████████████████| 2.00/2.00 [00:00<00:00, 501 file/s]
2026-01-10 00:48:25,459 INFO parquet_datasource.py:1046 -- Estimated parquet encoding ratio is 0.235.
2026-01-10 00:48:25,459 INFO parquet_datasource.py:1106 -- Estimated parquet reader batch size at 4007393 rows
(TrainController pid=2204180) Attempting to start training worker group of size 2 with the following resources: [{'CPU': 1}] * 2
(TrainController pid=2204180) Started training worker group of size 2:
(TrainController pid=2204180) - (ip=172.31.41.154, pid=2204315) world_rank=0, local_rank=0, node_rank=0
(TrainController pid=2204180) - (ip=172.31.41.154, pid=2204316) world_rank=1, local_rank=1, node_rank=0
(SplitCoordinator pid=2204466) Registered dataset logger for dataset train1_2_0
(SplitCoordinator pid=2204466) Starting execution of Dataset train1_2_0. Full logs are in /tmp/ray/session_2026-01-10_00-48-21_356543_2203409/logs/ray-data
(SplitCoordinator pid=2204466) Execution plan of Dataset train1_2_0: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2204466) ⚠️ Ray's object store is configured to use only 42.9% of available memory (17.6GiB out of 41.1GiB total). For optimal Ray Data performance, we recommend setting the object store to at least 50% of available memory. You can do this by setting the 'object_store_memory' parameter when calling ray.init() or by setting the RAY_DEFAULT_OBJECT_STORE_MEMORY_PROPORTION environment variable.
(SplitCoordinator pid=2204466) [dataset]: A new progress UI is available. To enable, set `ray.data.DataContext.get_current().enable_rich_progress_bars = True` and `ray.data.DataContext.get_current().use_ray_tqdm = False`.
(SplitCoordinator pid=2204466) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2204466) ✔️ Dataset train1_2_0 execution finished in 0.40 seconds: 100%|█████████████████████| 150/150 [00:00<00:00, 369 row/s]
(pid=2204466) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204466) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 972.0B object store; [localit
(SplitCoordinator pid=2204466) ✔️ Dataset train1_2_0 execution finished in 0.40 seconds
(SplitCoordinator pid=2204466) Registered dataset logger for dataset train1_2_1
(SplitCoordinator pid=2204466) Starting execution of Dataset train1_2_1. Full logs are in /tmp/ray/session_2026-01-10_00-48-21_356543_2203409/logs/ray-data
(SplitCoordinator pid=2204466) Execution plan of Dataset train1_2_1: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2204466) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2204466) ✔️ Dataset train1_2_1 execution finished in 0.11 seconds: 100%|███████████████████| 135/135 [00:00<00:00, 1.39k row/s]
(pid=2204466) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204466) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 3.5KiB object store; [localit
(SplitCoordinator pid=2204466) ✔️ Dataset train1_2_1 execution finished in 0.11 seconds
(SplitCoordinator pid=2204466) Registered dataset logger for dataset train1_2_2
(SplitCoordinator pid=2204466) Starting execution of Dataset train1_2_2. Full logs are in /tmp/ray/session_2026-01-10_00-48-21_356543_2203409/logs/ray-data
(SplitCoordinator pid=2204466) Execution plan of Dataset train1_2_2: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2204466) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2204466) ✔️ Dataset train1_2_2 execution finished in 0.12 seconds: 100%|███████████████████| 150/150 [00:00<00:00, 1.68k row/s]
(pid=2204466) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204466) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 982.0B object store; [localit
(SplitCoordinator pid=2204466) ✔️ Dataset train1_2_2 execution finished in 0.12 seconds
(SplitCoordinator pid=2204466) Registered dataset logger for dataset train1_2_3
(SplitCoordinator pid=2204466) Starting execution of Dataset train1_2_3. Full logs are in /tmp/ray/session_2026-01-10_00-48-21_356543_2203409/logs/ray-data
(SplitCoordinator pid=2204466) Execution plan of Dataset train1_2_3: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2204466) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2204466) ✔️ Dataset train1_2_3 execution finished in 0.48 seconds: 100%|█████████████████████| 150/150 [00:00<00:00, 302 row/s]
(pid=2204466) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204466) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 1.6KiB object store; [localit
(SplitCoordinator pid=2204466) ✔️ Dataset train1_2_3 execution finished in 0.48 seconds
(SplitCoordinator pid=2204466) Registered dataset logger for dataset train1_2_4
(SplitCoordinator pid=2204466) Starting execution of Dataset train1_2_4. Full logs are in /tmp/ray/session_2026-01-10_00-48-21_356543_2203409/logs/ray-data
(SplitCoordinator pid=2204466) Execution plan of Dataset train1_2_4: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2204466) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2204466) ✔️ Dataset train1_2_4 execution finished in 0.06 seconds: 100%|███████████████████| 75.0/75.0 [00:00<00:00, 795 row/s]
(pid=2204466) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204466) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 3.5KiB object store; [localit
(SplitCoordinator pid=2204466) ✔️ Dataset train1_2_4 execution finished in 0.06 seconds
(pid=2204465) ✔️ Dataset train2_4_0 execution finished in 0.09 seconds: 100%|███████████████████| 150/150 [00:00<00:00, 1.46k row/s]
(pid=2204465) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204465) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 972.0B object store; [localit
(pid=2204465) ✔️ Dataset train2_4_1 execution finished in 0.07 seconds: 100%|███████████████████| 75.0/75.0 [00:00<00:00, 784 row/s]
(pid=2204465) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204465) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
(pid=2204465) ✔️ Dataset train2_4_2 execution finished in 0.07 seconds: 100%|███████████████████| 135/135 [00:00<00:00, 1.38k row/s]
(pid=2204465) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204465) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 3.5KiB object store; [localit
(pid=2204465) ✔️ Dataset train2_4_3 execution finished in 0.07 seconds: 100%|██████████████████| 75.0/75.0 [00:00<00:00, 113k row/s]
(pid=2204465) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204465) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
(pid=2204465) ✔️ Dataset train2_4_4 execution finished in 0.07 seconds: 100%|████████████████████| 135/135 [00:00<00:00, 174k row/s]
(pid=2204465) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2204465) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
.(SplitCoordinator pid=2204465) Registered dataset logger for dataset train2_4_4 [repeated 5x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
(SplitCoordinator pid=2204465) Starting execution of Dataset train2_4_4. Full logs are in /tmp/ray/session_2026-01-10_00-48-21_356543_2203409/logs/ray-data [repeated 5x across cluster]
(SplitCoordinator pid=2204465) Execution plan of Dataset train2_4_4: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)] [repeated 5x across cluster]
(SplitCoordinator pid=2204465) ⚠️ Ray's object store is configured to use only 42.9% of available memory (17.6GiB out of 41.1GiB total). For optimal Ray Data performance, we recommend setting the object store to at least 50% of available memory. You can do this by setting the 'object_store_memory' parameter when calling ray.init() or by setting the RAY_DEFAULT_OBJECT_STORE_MEMORY_PROPORTION environment variable.
(SplitCoordinator pid=2204465) [dataset]: A new progress UI is available. To enable, set `ray.data.DataContext.get_current().enable_rich_progress_bars = True` and `ray.data.DataContext.get_current().use_ray_tqdm = False`.
(SplitCoordinator pid=2204465) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up. [repeated 5x across cluster]
(SplitCoordinator pid=2204465) ✔️ Dataset train2_4_4 execution finished in 0.07 seconds [repeated 5x across cluster]
2026-01-10 00:48:37,551 INFO worker.py:1992 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8265
Parquet dataset sampling: 100%|████████████████████████████████████████████████████████████████| 2.00/2.00 [00:00<00:00, 5.07 file/s]
2026-01-10 00:48:38,325 INFO parquet_datasource.py:1046 -- Estimated parquet encoding ratio is 0.235.
2026-01-10 00:48:38,326 INFO parquet_datasource.py:1106 -- Estimated parquet reader batch size at 4007393 rows
Parquet dataset sampling: 100%|█████████████████████████████████████████████████████████████████| 2.00/2.00 [00:00<00:00, 491 file/s]
2026-01-10 00:48:38,733 INFO parquet_datasource.py:1046 -- Estimated parquet encoding ratio is 0.235.
2026-01-10 00:48:38,733 INFO parquet_datasource.py:1106 -- Estimated parquet reader batch size at 4007393 rows
(TrainController pid=2205761) Attempting to start training worker group of size 2 with the following resources: [{'CPU': 1}] * 2
(TrainController pid=2205761) Started training worker group of size 2:
(TrainController pid=2205761) - (ip=172.31.41.154, pid=2205896) world_rank=0, local_rank=0, node_rank=0
(TrainController pid=2205761) - (ip=172.31.41.154, pid=2205895) world_rank=1, local_rank=1, node_rank=0
(SplitCoordinator pid=2206046) Registered dataset logger for dataset train1_2_0
(SplitCoordinator pid=2206046) Starting execution of Dataset train1_2_0. Full logs are in /tmp/ray/session_2026-01-10_00-48-34_686687_2203409/logs/ray-data
(SplitCoordinator pid=2206046) Execution plan of Dataset train1_2_0: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2206046) ⚠️ Ray's object store is configured to use only 42.9% of available memory (17.6GiB out of 41.1GiB total). For optimal Ray Data performance, we recommend setting the object store to at least 50% of available memory. You can do this by setting the 'object_store_memory' parameter when calling ray.init() or by setting the RAY_DEFAULT_OBJECT_STORE_MEMORY_PROPORTION environment variable.
(SplitCoordinator pid=2206046) [dataset]: A new progress UI is available. To enable, set `ray.data.DataContext.get_current().enable_rich_progress_bars = True` and `ray.data.DataContext.get_current().use_ray_tqdm = False`.
(SplitCoordinator pid=2206046) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2206046) ✔️ Dataset train1_2_0 execution finished in 0.46 seconds: 100%|█████████████████████| 150/150 [00:00<00:00, 368 row/s]
(pid=2206046) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206046) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 982.0B object store; [localit
(SplitCoordinator pid=2206046) ✔️ Dataset train1_2_0 execution finished in 0.46 seconds
(SplitCoordinator pid=2206046) Registered dataset logger for dataset train1_2_1
(SplitCoordinator pid=2206046) Starting execution of Dataset train1_2_1. Full logs are in /tmp/ray/session_2026-01-10_00-48-34_686687_2203409/logs/ray-data
(SplitCoordinator pid=2206046) Execution plan of Dataset train1_2_1: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2206046) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2206046) ✔️ Dataset train1_2_1 execution finished in 0.12 seconds: 100%|███████████████████| 150/150 [00:00<00:00, 1.79k row/s]
(pid=2206046) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206046) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 982.0B object store; [localit
(SplitCoordinator pid=2206046) ✔️ Dataset train1_2_1 execution finished in 0.12 seconds
(SplitCoordinator pid=2206046) Registered dataset logger for dataset train1_2_2
(SplitCoordinator pid=2206046) Starting execution of Dataset train1_2_2. Full logs are in /tmp/ray/session_2026-01-10_00-48-34_686687_2203409/logs/ray-data
(SplitCoordinator pid=2206046) Execution plan of Dataset train1_2_2: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2206046) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2206046) ✔️ Dataset train1_2_2 execution finished in 0.49 seconds: 100%|█████████████████████| 135/135 [00:00<00:00, 332 row/s]
(pid=2206046) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206046) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
(SplitCoordinator pid=2206046) ✔️ Dataset train1_2_2 execution finished in 0.49 seconds
(SplitCoordinator pid=2206046) Registered dataset logger for dataset train1_2_3
(SplitCoordinator pid=2206046) Starting execution of Dataset train1_2_3. Full logs are in /tmp/ray/session_2026-01-10_00-48-34_686687_2203409/logs/ray-data
(SplitCoordinator pid=2206046) Execution plan of Dataset train1_2_3: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2206046) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2206046) ✔️ Dataset train1_2_3 execution finished in 0.07 seconds: 100%|███████████████████| 75.0/75.0 [00:00<00:00, 789 row/s]
(pid=2206046) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206046) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
(SplitCoordinator pid=2206046) ✔️ Dataset train1_2_3 execution finished in 0.07 seconds
(SplitCoordinator pid=2206046) Registered dataset logger for dataset train1_2_4
(SplitCoordinator pid=2206046) Starting execution of Dataset train1_2_4. Full logs are in /tmp/ray/session_2026-01-10_00-48-34_686687_2203409/logs/ray-data
(SplitCoordinator pid=2206046) Execution plan of Dataset train1_2_4: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)]
(SplitCoordinator pid=2206046) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up.
(pid=2206046) ✔️ Dataset train1_2_4 execution finished in 0.07 seconds: 100%|███████████████████| 75.0/75.0 [00:00<00:00, 763 row/s]
(pid=2206046) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206046) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
(SplitCoordinator pid=2206046) ✔️ Dataset train1_2_4 execution finished in 0.07 seconds
(pid=2206045) ✔️ Dataset train2_4_0 execution finished in 0.08 seconds: 100%|███████████████████| 150/150 [00:00<00:00, 1.46k row/s]
(pid=2206045) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206045) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 982.0B object store; [localit
(pid=2206045) ✔️ Dataset train2_4_1 execution finished in 0.07 seconds: 100%|██████████████████| 75.0/75.0 [00:00<00:00, 128k row/s]
(pid=2206045) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206045) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
(pid=2206045) ✔️ Dataset train2_4_2 execution finished in 0.07 seconds: 100%|███████████████████| 135/135 [00:00<00:00, 1.30k row/s]
(pid=2206045) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206045) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
(pid=2206045) ✔️ Dataset train2_4_3 execution finished in 0.07 seconds: 100%|██████████████████| 75.0/75.0 [00:00<00:00, 105k row/s]
(pid=2206045) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206045) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
(pid=2206045) ✔️ Dataset train2_4_4 execution finished in 0.07 seconds: 100%|████████████████████| 150/150 [00:00<00:00, 199k row/s]
(pid=2206045) - ReadParquet: Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 0.0B object store: 100%|█| 150/150 [00
(pid=2206045) - split(2, equal=True): Tasks: 0; Actors: 0; Queued blocks: 0 (0.0B); Resources: 0.0 CPU, 2.9KiB object store; [localit
.(SplitCoordinator pid=2206045) Registered dataset logger for dataset train2_4_4 [repeated 5x across cluster]
(SplitCoordinator pid=2206045) Starting execution of Dataset train2_4_4. Full logs are in /tmp/ray/session_2026-01-10_00-48-34_686687_2203409/logs/ray-data [repeated 5x across cluster]
(SplitCoordinator pid=2206045) Execution plan of Dataset train2_4_4: InputDataBuffer[Input] -> TaskPoolMapOperator[ReadParquet] -> OutputSplitter[split(2, equal=True)] [repeated 5x across cluster]
(SplitCoordinator pid=2206045) ⚠️ Ray's object store is configured to use only 42.9% of available memory (17.6GiB out of 41.1GiB total). For optimal Ray Data performance, we recommend setting the object store to at least 50% of available memory. You can do this by setting the 'object_store_memory' parameter when calling ray.init() or by setting the RAY_DEFAULT_OBJECT_STORE_MEMORY_PROPORTION environment variable.
(SplitCoordinator pid=2206045) [dataset]: A new progress UI is available. To enable, set `ray.data.DataContext.get_current().enable_rich_progress_bars = True` and `ray.data.DataContext.get_current().use_ray_tqdm = False`.
(SplitCoordinator pid=2206045) Cluster resources are not enough to run any task from TaskPoolMapOperator[ReadParquet]. The job may hang forever unless the cluster scales up. [repeated 5x across cluster]
(SplitCoordinator pid=2206045) ✔️ Dataset train2_4_4 execution finished in 0.07 seconds [repeated 5x across cluster]
======================================================== 2 passed in 26.76s =========================================================
ec2-user@ip-172-31-41-154$
cc @kyuds since this is adjacent to some code you have context on
@bveeramani for my part: the part about accessing
- Fix ProgressBar to honor `use_ray_tqdm` in `DataContext`.
- Note that `tqdm_ray` is designed to work in non-interactive contexts (workers/actors) by sending JSON progress updates to the driver.

Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com>
Description
[Data] Fix ProgressBar with use_ray_tqdm
Fix ProgressBar to honor `use_ray_tqdm` in `DataContext`. `tqdm_ray` is designed to work in non-interactive contexts (workers/actors) by sending JSON progress updates to the driver.

Related issues
Additional information