Slower performance when passing numpy array into task versus list. #1813
Closed
Description
import numpy as np
import ray
import time

ray.init()

@ray.remote
class Foo(object):
    def method(self, x):
        return x

a = Foo.remote()
x = np.random.rand(10, 10).tolist()
time.sleep(1)  # Wait for the actor to start.
start = time.time()
for i in range(1000):
    ray.get(a.method.remote(x))
print("Using list: ", time.time() - start)

x = np.random.rand(10, 10)
start = time.time()
for i in range(1000):
    ray.get(a.method.remote(x))
print("Using numpy array: ", time.time() - start)

On my laptop, this prints
Using list: 0.6015908718109131
Using numpy array: 0.895500898361206
The numpy array case is slower (presumably because the array does not get inlined in the task specification and goes through the object store instead).
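To make the presumed difference concrete, here is a toy model of the two argument paths. It is only a sketch and not Ray's implementation: `object_store`, `put`, `make_task_spec`, and `resolve_arg` are hypothetical names, a plain dict stands in for the object store, and `pickle` stands in for whatever serialization is actually used.

```python
import pickle
import uuid

# Hypothetical stand-in for the object store: maps object IDs to serialized bytes.
object_store = {}

def put(value):
    """Serialize a value into the toy store and return its ID."""
    object_id = uuid.uuid4().hex
    object_store[object_id] = pickle.dumps(value)
    return object_id

def make_task_spec(arg, inline):
    """Build a toy task spec carrying the argument by value or by reference."""
    if inline:
        return {"kind": "value", "payload": pickle.dumps(arg)}
    return {"kind": "ref", "payload": put(arg)}

def resolve_arg(spec):
    """Recover the argument on the worker side."""
    if spec["kind"] == "value":
        return pickle.loads(spec["payload"])
    # By-reference arguments need an extra store lookup before deserializing.
    return pickle.loads(object_store[spec["payload"]])

x = [[float(i * 10 + j) for j in range(10)] for i in range(10)]
inline_spec = make_task_spec(x, inline=True)
ref_spec = make_task_spec(x, inline=False)
```

In this model the by-reference path does one extra write and one extra read per call, which is the kind of overhead the timing above would be consistent with.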
Proposal:
- Allow small numpy arrays to be inlined in the tasks.
- Allow larger objects (not just small arrays) to be inlined in the tasks.
Potential Issues:
- The bigger the tasks are, the sooner Redis will run out of memory (at least until we start flushing keys from Redis).
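The trade-off between the proposal and the memory concern can be sketched as a size-threshold rule. This is an illustration under assumptions, not an existing Ray setting: `MAX_INLINE_BYTES`, `should_inline`, and `task_table_bytes` are hypothetical, and `pickle` is used only to get a rough serialized size.

```python
import pickle

MAX_INLINE_BYTES = 4096  # hypothetical threshold, not a real Ray option

def should_inline(arg, max_inline_bytes=MAX_INLINE_BYTES):
    """Inline an argument only if its serialized size is under the threshold."""
    return len(pickle.dumps(arg)) <= max_inline_bytes

def task_table_bytes(args, max_inline_bytes=MAX_INLINE_BYTES):
    """Total inlined payload bytes if each argument is submitted as one task."""
    total = 0
    for arg in args:
        size = len(pickle.dumps(arg))
        if size <= max_inline_bytes:
            total += size  # inlined payloads stay in the task table
    return total

small = [[0.5] * 10 for _ in range(10)]    # comparable to a 10x10 array
large = [[0.5] * 100 for _ in range(100)]  # well above the threshold

print(should_inline(small), should_inline(large))
print(task_table_bytes([small] * 1000))
```

Raising the threshold moves more arguments onto the fast path, but the second number grows linearly with both the number of tasks and the payload size, which is the Redis memory concern listed above.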
cc @jsuarez5341