Skip to content

Add configuration context to dataset#19907

Merged
ericl merged 17 commits intoray-project:masterfrom
ericl:ds-context
Nov 4, 2021
Merged

Add configuration context to dataset#19907
ericl merged 17 commits intoray-project:masterfrom
ericl:ds-context

Conversation

@ericl
Copy link
Copy Markdown
Contributor

@ericl ericl commented Oct 29, 2021

Why are these changes needed?

This adds a globally propagated configuration context to Dataset, where configs can be set on the driver and workers can retrieve them via singleton pattern. This avoids redundant plumbing of configuration / resources like the block owner through developer / public APIs, which may make API stability difficult to maintain.

@ericl ericl requested a review from scv119 as a code owner October 29, 2021 23:48
@ericl ericl changed the title [WIP] Add singleton context to dataset Add configuration context to dataset Oct 30, 2021
@ericl ericl added the do-not-merge Do not merge this PR! label Oct 30, 2021
@ericl ericl changed the title Add configuration context to dataset [WIP] Add configuration context to dataset Oct 30, 2021
@ericl ericl removed the do-not-merge Do not merge this PR! label Nov 3, 2021
@ericl ericl changed the title [WIP] Add configuration context to dataset Add configuration context to dataset Nov 3, 2021
Copy link
Copy Markdown
Contributor

@jjyao jjyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait for clark's approval as well.

Comment on lines +33 to +34
if _default_context is None:
_default_context = DatasetContext(None, 500 * 1024 * 1024)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to consider thread safety there?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah making all Dataset APIs thread safe is a good idea, fixed.

target_max_block_size: int):
"""Private constructor (use get_current() instead)."""
self.block_owner = block_owner
self.target_max_block_size = target_max_block_size
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be in the next PR (can leave it for now).

@ericl ericl added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Nov 4, 2021
@ericl ericl merged commit 585d472 into ray-project:master Nov 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

tests-ok The tagger certifies test failures are unrelated and assumes personal liability.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants