Skip to content

[Data] Fix limit operator#34800

Merged
raulchen merged 6 commits intoray-project:masterfrom
raulchen:fix-limit
Apr 27, 2023
Merged

[Data] Fix limit operator#34800
raulchen merged 6 commits intoray-project:masterfrom
raulchen:fix-limit

Conversation

@raulchen
Copy link
Copy Markdown
Contributor

@raulchen raulchen commented Apr 26, 2023

Why are these changes needed?

Fixes a bug introduced by #34705

Related issue number

#34234

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Hao Chen <chenh1024@gmail.com>
@raulchen raulchen requested review from c21, ericl and scv119 as code owners April 26, 2023 21:14
@raulchen raulchen changed the title [Data] Fix limit operator [WIP][Data] Fix limit operator Apr 26, 2023
@ericl
Copy link
Copy Markdown
Contributor

ericl commented Apr 26, 2023

It seems like the limit tests are still broken:


FAILED python/ray/data/tests/test_consumption.py::test_limit_no_redundant_read[10-1]
  | FAILED python/ray/data/tests/test_consumption.py::test_limit_no_redundant_read[20-2]
  | FAILED python/ray/data/tests/test_consumption.py::test_limit_no_redundant_read[30-3]
  | FAILED python/ray/data/tests/test_consumption.py::test_limit_no_redundant_read[60-6]
  | FAILED python/ray/data/tests/test_consumption.py::test_limit_no_num_row_info


@ericl ericl added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Apr 26, 2023
Signed-off-by: Hao Chen <chenh1024@gmail.com>
BlockMetadata(
num_rows=None,
size_bytes=None,
size_bytes=0,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I don't change this. I get the following error.
I believe it is a bug of the test itself.
The previous implemented streaming execution for "limit" and just triggered this bug.
size_bytes isn't supposed to be None in the first place.

@ericl @c21 could you help confirm?

python/ray/data/datastream.py:2276: in take
    for row in self.iter_rows():
python/ray/data/iterator.py:234: in iter_rows
    for batch in self.iter_batches(**iter_batch_args):
python/ray/data/iterator.py:167: in iter_batches
    block_iterator, stats, blocks_owned_by_consumer = self._to_block_iterator()
python/ray/data/_internal/iterator/iterator_impl.py:33: in _to_block_iterator
    block_iterator, stats, executor = ds._plan.execute_to_iterator()
python/ray/data/_internal/plan.py:512: in execute_to_iterator
    block_iter = itertools.chain([next(gen)], gen)
python/ray/data/_internal/execution/legacy_compat.py:51: in execute_to_legacy_block_iterator
    bundle_iter = execute_to_legacy_bundle_iterator(
python/ray/data/_internal/execution/legacy_compat.py:80: in execute_to_legacy_bundle_iterator
    dag, stats = _get_execution_dag(
python/ray/data/_internal/execution/legacy_compat.py:141: in _get_execution_dag
    dag, stats = _to_operator_dag(plan, allow_clear_input_blocks)
python/ray/data/_internal/execution/legacy_compat.py:174: in _to_operator_dag
    operator = _blocks_to_input_buffer(blocks, owns_blocks)
python/ray/data/_internal/execution/legacy_compat.py:235: in _blocks_to_input_buffer
    output = _block_list_to_bundles(blocks, owns_blocks=owns_blocks)
python/ray/data/_internal/execution/legacy_compat.py:359: in _block_list_to_bundles
    RefBundle(
<string>:7: in __init__
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = RefBundle(blocks=[(ObjectRef(00ffffffffffffffffffffffffffffffffffffff0100000001000000), BlockMetadata(num_rows=None, s...None, schema=None, input_files=[], exec_stats=None))], owns_blocks=False, output_split_idx=None, _cached_location=None)

    def __post_init__(self):
        for b in self.blocks:
            assert isinstance(b, tuple), b
            assert len(b) == 2, b
            assert isinstance(b[0], ray.ObjectRef), b
            assert isinstance(b[1], BlockMetadata), b
            if b[1].size_bytes is None:
>               raise ValueError(
                    "The size in bytes of the block must be known: {}".format(b)
                )
E               ValueError: The size in bytes of the block must be known: (ObjectRef(00ffffffffffffffffffffffffffffffffffffff0100000001000000), BlockMetadata(num_rows=None, size_bytes=None, schema=None, input_files=[], exec_stats=None))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know why the limit operator would trigger the bug? Is it because the previous is not using streaming executor, and right now we are using it?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so - https://github.com/ray-project/ray/blob/master/python/ray/data/_internal/plan.py#L764 . So the theory makes sense.

Can we put a meaningful number here? such as 4 * n?

Copy link
Copy Markdown
Contributor

@c21 c21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@raulchen
Copy link
Copy Markdown
Contributor Author

test_limit_no_redundant_read is still failing. This test checks that ray.data.read_datasource(...).limit(1) will limit the number of read tasks being run. The reason why it could pass before is also because this test goes to the old execution backend. For the streaming backend, this functionality is not implemented yet.

I think I can skip this test for now and re-enable it after implementing the optimization for streaming executor?

@c21
Copy link
Copy Markdown
Contributor

c21 commented Apr 26, 2023

This looks a regresson now with new limit operator impl.
Ok from me to skip, as long as we work on implementing the optimization for streaming executor as P0 ASAP. Thanks.

Signed-off-by: Hao Chen <chenh1024@gmail.com>
Signed-off-by: Hao Chen <chenh1024@gmail.com>
Signed-off-by: Hao Chen <chenh1024@gmail.com>
@raulchen
Copy link
Copy Markdown
Contributor Author

This looks a regresson now with new limit operator impl.
Ok from me to skip, as long as we work on implementing the optimization for streaming executor as P0 ASAP. Thanks.

Sure. Will prioritize it.

@ericl
Copy link
Copy Markdown
Contributor

ericl commented Apr 27, 2023

Just one lint error.

Signed-off-by: Hao Chen <chenh1024@gmail.com>
@raulchen raulchen changed the title [WIP][Data] Fix limit operator [Data] Fix limit operator Apr 27, 2023
@raulchen
Copy link
Copy Markdown
Contributor Author

CI failures should be unrelated. Merging.

@raulchen raulchen merged commit f419788 into ray-project:master Apr 27, 2023
@raulchen raulchen deleted the fix-limit branch April 27, 2023 20:55
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request May 4, 2023
## Why are these changes needed?

Fixes a bug introduced by ray-project#34705

## Related issue number

ray-project#34234

Signed-off-by: Jack He <jackhe2345@gmail.com>
architkulkarni pushed a commit to architkulkarni/ray that referenced this pull request May 16, 2023
## Why are these changes needed?

Fixes a bug introduced by ray-project#34705 

## Related issue number

ray-project#34234
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

@author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants