[RLlib] Fix device check in `Learner`. by simonsays1980 · Pull Request #53706 · ray-project/ray

simonsays1980 · 2025-06-10T17:40:43Z

Why are these changes needed?

The Learner checks whether the training batch is already on the correct device. To do so, it checks whether

- the "obs" column of the batch is a numpy array, or
- if the column is already a tensor (no other cases should be possible at this stage of the pipeline), whether that tensor is on the correct device.

This can throw an exception if the observations are not of an array type (e.g. gym.spaces.Dict, gym.spaces.Tuple). This PR fixes this by switching the batch column that is checked to a simpler one, namely "rewards" (which always has a simple vector format).
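To illustrate the failure mode described above, here is a minimal, framework-free sketch. It is not the RLlib code: `FakeTensor` and `is_on_device` are hypothetical stand-ins introduced only for this example, and the device string is arbitrary. It shows why a single-tensor device check breaks on a Dict-style "obs" column while the same check on "rewards" works.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a framework tensor; it only carries a device tag."""

    device: str


def is_on_device(column, device):
    # Naive check that assumes `column` is a single tensor-like object.
    return column.device == device


# With a Dict observation space, "obs" is a mapping of tensors, not one tensor.
batch = {
    "obs": {"image": FakeTensor("cuda:0"), "state": FakeTensor("cuda:0")},
    "rewards": FakeTensor("cuda:0"),
}

try:
    is_on_device(batch["obs"], "cuda:0")
    obs_check_failed = False
except AttributeError:
    # A plain dict has no `.device` attribute, so the check raises.
    obs_check_failed = True

# "rewards" is a single flat tensor, so the same check succeeds.
rewards_on_device = is_on_device(batch["rewards"], "cuda:0")
```

In this toy version the error is an `AttributeError`; the exact exception raised inside RLlib may differ.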

Related issue number

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

Copilot

Pull Request Overview

This PR fixes the device check in the Learner by switching the validation from the "obs" column to the "rewards" column, ensuring that the correct device property is checked for training data.

Updated the import to include the Columns constant.
Replaced "obs" with Columns.REWARDS in the device check condition.

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

sven1977

LGTM! Thanks for the fix @simonsays1980 !

Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>

Changed column obs to column rewards in device check.

f8666ff

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

simonsays1980 marked this pull request as ready for review June 10, 2025 17:41

Copilot AI review requested due to automatic review settings June 10, 2025 17:41

simonsays1980 requested a review from a team as a code owner June 10, 2025 17:41

Merge branch 'master' into fix-device-check-in-learner

1cb968d

Copilot AI reviewed Jun 10, 2025

simonsays1980 added 2 commits June 11, 2025 17:41

Changed check in learner to use 'tree.flatten'.

52b33b1

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>
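The commit above mentions `tree.flatten`, but the merged code is not shown in this conversation. As a rough sketch of the idea only, the check can inspect the first leaf of the flattened batch instead of a fixed column. The `flatten` helper below is a hypothetical pure-Python stand-in for `tree.flatten`, and `FakeTensor` is a stand-in for a framework tensor; neither is taken from the PR.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a framework tensor; it only carries a device tag."""

    device: str


def flatten(structure):
    """Stand-in for `tree.flatten`: collect the leaves of nested dicts, lists, tuples."""
    if isinstance(structure, dict):
        # Keys are sorted only to make this sketch deterministic.
        return [leaf for key in sorted(structure) for leaf in flatten(structure[key])]
    if isinstance(structure, (list, tuple)):
        return [leaf for item in structure for leaf in flatten(item)]
    return [structure]


def batch_on_device(batch, device):
    # Inspect one leaf; this works no matter how deeply the columns are nested.
    first_leaf = flatten(batch)[0]
    return getattr(first_leaf, "device", None) == device


nested_batch = {
    "obs": {"image": FakeTensor("cuda:0"), "state": (FakeTensor("cuda:0"),)},
    "rewards": FakeTensor("cuda:0"),
}

on_gpu = batch_on_device(nested_batch, "cuda:0")
on_cpu = batch_on_device(nested_batch, "cpu")
```

Checking a leaf rather than a named column avoids any assumption about which columns exist or how the observation space is structured.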

Removed unused imports.

84c6d9c

Signed-off-by: simonsays1980 <simon.zehnder@gmail.com>

simonsays1980 requested a review from sven1977 June 11, 2025 16:49

sven1977 enabled auto-merge (squash) June 13, 2025 11:50

github-actions bot added the "go" label (add ONLY when ready to merge, run all tests) Jun 13, 2025

sven1977 approved these changes Jun 13, 2025

sven1977 merged commit c7436b7 into ray-project:master Jun 13, 2025
7 checks passed

elliot-barn pushed a commit that referenced this pull request Jun 18, 2025

[RLlib] Fix device check in Learner. (#53706)

bef19f2

Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>

elliot-barn pushed a commit that referenced this pull request Jul 2, 2025

[RLlib] Fix device check in Learner. (#53706)

f82298f

Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>

sven1977 merged 4 commits into ray-project:master from simonsays1980:fix-device-check-in-learner

3 participants