[rllib] bug in rllib.bc.policy.py #1972

@Emily0219

Description

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • Ray installed from (source or binary): pip
  • Ray version: 0.4.0
  • Python version: 2.7.12
  • Exact command to reproduce:

Describe the problem

When I use my custom env with the BC algorithm, I get an error about the action space. This is my action space:

self.action_space = Box(np.array([0.0, -1.0]), np.array([1.0, 1.0]), dtype=np.float32)

Running the Python API raises the following error:

Traceback (most recent call last):
File "/home/ran/PycharmProjects/untitled/bctest_P.py", line 302, in
agent = BCAgent(config, 'gazebocar')
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/ray/rllib/agent.py", line 93, in init
Trainable.init(self, config, registry, logger_creator)
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/ray/tune/trainable.py", line 90, in init
self._setup()
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/ray/rllib/agent.py", line 116, in _setup
self._init()
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/ray/rllib/bc/bc.py", line 66, in _init
self.registry, self.env_creator, self.config, self.logdir)
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/ray/rllib/bc/bc_evaluator.py", line 22, in init
env.action_space, config)
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/ray/rllib/bc/policy.py", line 25, in init
self.setup_loss(action_space)
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/ray/rllib/bc/policy.py", line 43, in setup_loss
log_prob = self.curr_dist.logp(self.ac)
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/ray/rllib/models/action_dist.py", line 86, in logp
0.5 * np.log(2.0 * np.pi) * tf.to_float(tf.shape(x)[1]) -
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/tensorflow/python/ops/math_ops.py", line 962, in binary_op_wrapper
y = ops.convert_to_tensor(y, dtype=x.dtype.base_dtype, name="y")
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 950, in convert_to_tensor
as_ref=False)
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1040, in internal_convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/home/ran/.virtualenvs/gym_gazebo/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 883, in _TensorTensorConversionFunction
(dtype.name, t.dtype.name, str(t)))
ValueError: Tensor conversion requested dtype int64 for Tensor with dtype float32: 'Tensor("local/split:0", shape=(?, 2), dtype=float32, device=/job:localhost/replica:0/task:0/device:CPU:0)'

Source code / logs

agent = BCAgent(config, 'gazebocar')
for i in range(10):
    result = agent.train()

I think this code should be changed from:

def setup_loss(self, action_space):
    self.ac = tf.placeholder(tf.int64, [None], name="ac")

to:

def setup_loss(self, action_space):
    self.ac = tf.placeholder(tf.float32, [None] + list(action_space.shape), name="ac")
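The shape arithmetic behind the proposed fix can be checked without TensorFlow. A `Box` whose `low`/`high` arrays have length 2 has shape `(2,)`, so a batch of its actions needs a `float32` placeholder of shape `[None, 2]`, not an `int64` scalar per step as the current discrete-action code assumes. A minimal sketch using only numpy (the `low`/`high` values mirror the action space above; everything else is illustrative):

```python
import numpy as np

# Mimic the custom env's continuous action space:
# Box(low=[0.0, -1.0], high=[1.0, 1.0]) has shape (2,).
low = np.array([0.0, -1.0], dtype=np.float32)
high = np.array([1.0, 1.0], dtype=np.float32)
action_shape = low.shape  # (2,)

# The placeholder shape the proposed fix would build:
# [None] + list(action_space.shape) -> [None, 2]
placeholder_shape = [None] + list(action_shape)

# A batch of sampled continuous actions is float32 and 2-D,
# which is why feeding them into an int64 [None] placeholder
# (meant for discrete Categorical actions) fails.
batch = np.random.uniform(low, high, size=(4,) + action_shape).astype(np.float32)
print(placeholder_shape)        # [None, 2]
print(batch.dtype, batch.shape)
```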
