[runtime env] runtime env inheritance refactor by SongGuyang · Pull Request #24538 · ray-project/ray

SongGuyang · 2022-05-06T08:10:21Z

Why are these changes needed?

Recreate for the reverted PR for 2.0.
#22244
#22620

Runtime Environments is already GA in Ray 1.6.0. The latest doc is here. And now, we already supported a inheritance behavior as follows (copied from the doc):

The runtime_env["env_vars"] field will be merged with the runtime_env["env_vars"] field of the parent. This allows for environment variables set in the parent’s runtime environment to be automatically propagated to the child, even if new environment variables are set in the child’s runtime environment.
Every other field in the runtime_env will be overridden by the child, not merged. For example, if runtime_env["py_modules"] is specified, it will replace the runtime_env["py_modules"] field of the parent.

We think this runtime env merging logic is so complex and confusing to users because users can't know the final runtime env before the jobs are run.

Current PR tries to do a refactor and change the behavior of Runtime Environments inheritance. Here is the new behavior:

If there is no runtime env option when we create actor, inherit the parent runtime env.
Otherwise, use the optional runtime env directly and don't do the merging.

You could get the parent runtime env and modify this dict by yourself. Like:
Actor.options(runtime_env=ray.get_runtime_context().runtime_env.update({"X": "Y"}))

Related issue number

#21818

Runtime Environments is already GA in Ray 1.6.0. The latest doc is [here](https://docs.ray.io/en/master/ray-core/handling-dependencies.html#runtime-environments). And now, we already supported a [inheritance](https://docs.ray.io/en/master/ray-core/handling-dependencies.html#inheritance) behavior as follows (copied from the doc): - The runtime_env["env_vars"] field will be merged with the runtime_env["env_vars"] field of the parent. This allows for environment variables set in the parent’s runtime environment to be automatically propagated to the child, even if new environment variables are set in the child’s runtime environment. - Every other field in the runtime_env will be overridden by the child, not merged. For example, if runtime_env["py_modules"] is specified, it will replace the runtime_env["py_modules"] field of the parent. We think this runtime env merging logic is so complex and confusing to users because users can't know the final runtime env before the jobs are run. Current PR tries to do a refactor and change the behavior of Runtime Environments inheritance. Here is the new behavior: - **If there is no runtime env option when we create actor, inherit the parent runtime env.** - **Otherwise, use the optional runtime env directly and don't do the merging.** Add a new API named `ray.runtime_env.get_current_runtime_env()` to get the parent runtime env and modify this dict by yourself. Like: ```Actor.options(runtime_env=ray.runtime_env.get_current_runtime_env().update({"X": "Y"}))``` This new API also can be used in ray client.

SongGuyang · 2022-05-06T09:48:47Z

Hey @xwjiang2010, we will make this change again. Do you already have one UT for the issue #22595?

SongGuyang · 2022-05-06T09:53:19Z

@edoakes @rkooo567 @architkulkarni please help to check if we can make this change for 2.0.

rkooo567 · 2022-05-06T14:25:42Z

cc @edoakes Sorry that i forgot some context. did it pass the API review btw? Should we add an api-review-required tag?

architkulkarni · 2022-05-06T17:21:34Z

cc @edoakes Sorry that i forgot some context. did it pass the API review btw? Should we add an api-review-required tag?

This was the API review issue: #21818 There wasn't any pushback. We also asked for opinions on OSS Slack and I think I remember there were a few thumbs up and no pushback (unfortunately I think the thread is lost because we have limited history.)

architkulkarni

The two PRs mentioned were already approved, so this PR should be approved too. It includes the warning message for users in the case where the behavior is different. The Ray 1.13 branch cut has already happened, so if I understand correctly, it's safe to merge this now and this will only appear in Ray 2.0.

I'll try to run train_pytorch_linear_test directly on this PR, since now we can run release tests for PRs on buildkite. (running here)

edoakes

According to @zhe-thoughts we should wait for the branch-1 to be cut before merging breaking changes, so holding for that.

rkooo567 · 2022-05-12T11:47:36Z

Btw, cc @jjyao this will be one of API changes from 2.0. We should probably track it?

rkooo567

Approving the PR assuming I already approved the same PR before!

jovany-wang · 2022-05-19T02:05:39Z

java/test/src/main/java/io/ray/test/RuntimeEnvTest.java

    }
  }

+  public void testPerActorEnvVars() {


Due to there is no case that one process holds multiple workers, it could be simplified to:

public void testPerActorEnvVars() { try { Ray.init(); { RuntimeEnv runtimeEnv = new RuntimeEnv.Builder() .addEnvVar("KEY1", "A") .addEnvVar("KEY2", "B") .addEnvVar("KEY1", "C") .build(); ActorHandle<A> actor1 = Ray.actor(A::new).setRuntimeEnv(runtimeEnv).remote(); String val = actor1.task(A::getEnv, "KEY1").remote().get(); Assert.assertEquals(val, "C"); val = actor1.task(A::getEnv, "KEY2").remote().get(); Assert.assertEquals(val, "B"); } { /// Because we didn't set them for actor2 , all should be null. ActorHandle<A> actor2 = Ray.actor(A::new).remote(); String val = actor2.task(A::getEnv, "KEY1").remote().get(); Assert.assertNull(val); val = actor2.task(A::getEnv, "KEY2").remote().get(); Assert.assertNull(val); } } finally { Ray.shutdown(); } }

Let's merge this PR after addressing this? @jovany-wang we can just change the code this way here?

This change had been committed at 867ef68

So feel free to merge this PR

SongGuyang · 2022-05-19T03:32:16Z

We will merge this PR because the "ray-1.x" branch has been cut.

ericl · 2022-06-04T21:33:58Z

This PR has caused a P0 regression in master for Tune: #25484

Thinking about the new behavior, it's also fundamentally inconsistent with how Jobs works. When the user specifies a working_dir implicitly via the jobs API, it's very surprising that overriding env_vars also deletes the working dir. The user might not even know about the working dir concept if they're using only jobs.

I think we need to change this to either:

Raise an exception if any fields are dropped (such as working_dir), or
Revert to the previous behavior.

Basically, the user should have to explicitly set {"working_dir": None} to unset the working_dir. Maybe we can also provide a utility to make it easier to generate an updated runtime_env, such as ray.get_runtime_env().with_updates({"env_vars": ...}).

…4538)" This reverts commit eb2692c.

…)" (#25487) This reverts commit eb2692c. This is a temporary mitigation for #25484

SongGuyang added 5 commits May 6, 2022 15:53

remove the API ray.runtime_env.get_current_runtime_env

f5f22a2

add doc

70225e1

fix

2912da2

lint

03bf96e

SongGuyang requested a review from edoakes May 6, 2022 08:10

SongGuyang assigned architkulkarni, edoakes and rkooo567 May 6, 2022

SongGuyang changed the title ~~[runtime env] URI reference refactor~~ [runtime env] runtime env inheritance refactor May 6, 2022

SongGuyang added 2 commits May 6, 2022 16:17

revert

30fd5ad

fix

8c5c62b

SongGuyang added 2 commits May 6, 2022 17:50

fix

e6962ef

fix

faa65a9

architkulkarni approved these changes May 6, 2022

View reviewed changes

edoakes requested changes May 6, 2022

View reviewed changes

SongGuyang added 3 commits May 7, 2022 10:01

fix

2450b70

fix java test

13dd491

move test

72d49d3

rkooo567 approved these changes May 12, 2022

View reviewed changes

Merge branch 'master' into dev_runtime_env_inherit

4ee782f

jovany-wang reviewed May 19, 2022

View reviewed changes

fix

867ef68

jovany-wang approved these changes May 19, 2022

View reviewed changes

edoakes approved these changes May 19, 2022

View reviewed changes

raulchen merged commit eb2692c into ray-project:master May 20, 2022

raulchen deleted the dev_runtime_env_inherit branch May 20, 2022 02:53

ericl mentioned this pull request Jun 4, 2022

Tune / Jobs execution model broken due to change in runtime_env option behavior #25484

Closed

ericl added a commit to ericl/ray that referenced this pull request Jun 5, 2022

Revert "[runtime env] runtime env inheritance refactor (ray-project#2…

b226419

…4538)" This reverts commit eb2692c.

ericl added a commit that referenced this pull request Jun 5, 2022

[hotfix] Revert "[runtime env] runtime env inheritance refactor (#24538…

48acbf0

…)" (#25487) This reverts commit eb2692c. This is a temporary mitigation for #25484

Conversation

SongGuyang commented May 6, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why are these changes needed?

Related issue number

Uh oh!

SongGuyang commented May 6, 2022

Uh oh!

SongGuyang commented May 6, 2022

Uh oh!

rkooo567 commented May 6, 2022

Uh oh!

architkulkarni commented May 6, 2022

Uh oh!

architkulkarni left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

edoakes left a comment

Choose a reason for hiding this comment

Uh oh!

rkooo567 commented May 12, 2022

Uh oh!

rkooo567 left a comment

Choose a reason for hiding this comment

Uh oh!

jovany-wang May 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rkooo567 May 19, 2022

Choose a reason for hiding this comment

Uh oh!

jovany-wang May 19, 2022

Choose a reason for hiding this comment

Uh oh!

SongGuyang commented May 19, 2022

Uh oh!

ericl commented Jun 4, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

SongGuyang commented May 6, 2022 •

edited

Loading

architkulkarni left a comment •

edited

Loading

jovany-wang May 19, 2022 •

edited

Loading

ericl commented Jun 4, 2022 •

edited

Loading