Search before asking
- I searched the issues and found no similar feature request.
Description
There's no documentation that explains how to serve an RLlib policy with Ray Serve when the observation space is a Tuple. The example provided uses the CartPole model, whose observation is a single array, but there are no examples with Tuple observations.
Use case
We are training an RL model using Ray. The model trains without any issues using PPOTrainer, but when we try to serve the policy following the tutorial, the JSON response is empty and the status code is 500. I believe the issue is that we are not passing the observation in the correct format. The model's observation space is a Tuple, and I cannot find any documentation that explains how to serve a model with a Tuple observation space.
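One possible contributing factor, not confirmed here, is that a JSON round trip does not preserve Python container types. The minimal sketch below uses only the standard library and an illustrative stand-in observation (five short binary vectors, not the real Wordle observation) to show that a tuple sent as JSON arrives on the server as a list of lists.

```python
import json

# Illustrative stand-in for a Tuple observation: five small binary vectors.
# The vector length (3) is an assumption chosen for brevity.
obs = tuple([0, 1, 0] for _ in range(5))

# Simulate what the client serializes and what the deployment decodes.
payload = json.dumps({"observation": obs})
received = json.loads(payload)["observation"]

# The outer tuple comes back as a list, and each component is a plain list.
print(type(obs).__name__, "->", type(received).__name__)  # tuple -> list
```

If the policy expects a tuple of arrays, the decoded payload would need to be converted back before it is passed on.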
This is the code we are using to serve the policy:
import gym
from starlette.requests import Request
import requests
from main import WordleEnv
import ray
import ray.rllib.agents.ppo as ppo
from ray import serve

@serve.deployment(route_prefix="/test")
class ServePPOModel:
    def __init__(self, checkpoint_path) -> None:
        self.trainer = ppo.PPOTrainer(
            env="my_env",
        )
        self.trainer.restore(checkpoint_path)

    async def __call__(self, request: Request):
        json_input = await request.json()
        obs = json_input["observation"]
        action = self.trainer.compute_action(obs)
        return {"action": int(action)}

if __name__ == '__main__':
    from ray import tune
    tune.register_env("my_env", lambda config: WordleEnv())
    serve.start()
    ServePPOModel.deploy("my_checkpoint")
    # That's it! Let's test it
    for _ in range(10):
        env = WordleEnv()
        obs = env.reset()
        print(f"-> Sending observation {obs}")
        obs_ = [o.tolist() for o in obs]
        print(obs_)
        resp = requests.get(
            "http://localhost:8000/test", json={"observation": obs_}
        )
        print(f"<- Received response {resp.json()}")
This is how the observation space looks:
self.observation_space = Tuple([
    MultiBinary(len(list(WordleEnv.letter_dict.values()))),
    MultiBinary(len(list(WordleEnv.letter_dict.values()))),
    MultiBinary(len(list(WordleEnv.letter_dict.values()))),
    MultiBinary(len(list(WordleEnv.letter_dict.values()))),
    MultiBinary(len(list(WordleEnv.letter_dict.values())))
])
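For a space shaped like this, one approach that might be worth trying is to rebuild the tuple of arrays on the server before calling `compute_action`. The sketch below is an assumption-laden illustration, not a confirmed fix: `restore_tuple_obs` is a hypothetical helper, the component size of 26 is a guess at `len(WordleEnv.letter_dict)`, and `np.int8` is assumed to match the dtype of the `MultiBinary` components.

```python
import numpy as np


def restore_tuple_obs(raw, component_sizes, dtype=np.int8):
    """Rebuild a tuple of 1-D arrays from a JSON-decoded list of lists.

    Hypothetical helper: `component_sizes` lists the expected length of
    each Tuple component, in order.
    """
    if len(raw) != len(component_sizes):
        raise ValueError(
            f"expected {len(component_sizes)} components, got {len(raw)}"
        )
    restored = []
    for i, (component, size) in enumerate(zip(raw, component_sizes)):
        arr = np.asarray(component, dtype=dtype)
        if arr.shape != (size,):
            raise ValueError(
                f"component {i} has shape {arr.shape}, expected ({size},)"
            )
        restored.append(arr)
    return tuple(restored)


# Assumed sizes: five components of 26 entries each (illustrative only).
n_letters = 26
raw = [[0] * n_letters for _ in range(5)]  # what request.json() might yield
obs = restore_tuple_obs(raw, [n_letters] * 5)
```

Inside `__call__`, the decoded `json_input["observation"]` could be passed through such a helper before `self.trainer.compute_action(obs)`. Validating the shape up front also gives a readable error message rather than an opaque 500 when the payload does not match the space.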
Related issues
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!