max and min for float32 cupy array may be slow  #2085

@xu3kev

  • Conditions (you can just paste the output of python -c 'import cupy; cupy.show_config()')
    • CuPy version = commit 2146ce2
    • OS/Platform = Ubuntu 16.04/ V100
    • CUDA version = 10.0
  • Code to reproduce
import cupy as cp
from contextlib import contextmanager


@contextmanager
def sync_time(name):
    # Time the enclosed block with CUDA events and print the elapsed milliseconds.
    start = cp.cuda.Event()
    end = cp.cuda.Event()
    start.record()
    start.synchronize()
    yield
    end.record()
    end.synchronize()  # wait until all queued GPU work has finished
    t = cp.cuda.get_elapsed_time(start, end)
    print("{} : {} ms".format(name, t))


x = cp.random.normal(size=(400, 32, 28, 28)).astype(cp.float32)

with sync_time("cupy"):
    for i in range(1000):
        x.max()

x = cp.asnumpy(x)  # move to CPU

with sync_time("numpy"):
    for i in range(1000):
        x.max()
  • Results
cupy : 8457.2451171875 ms
numpy : 3005.8154296875 ms

In this case, CuPy is slower than NumPy: about 8.5 s versus 3.0 s for 1000 calls to max(), i.e. roughly 2.8x slower.
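To help narrow this down, here is a minimal sketch of a per-call timing helper that excludes the first call (which may include one-time setup cost such as kernel compilation) and compares float32 with float64. The helper name time_max and its parameters are hypothetical and not part of CuPy or NumPy. It uses a wall-clock timer plus an explicit synchronization callback instead of CUDA events, so the same function can be run against either library. It is not claimed to explain the slowdown, only to make the measurement easier to repeat.

```python
import time

import numpy as np


def time_max(xp, dtype, shape=(400, 32, 28, 28), repeats=100, sync=lambda: None):
    """Return the mean wall-clock time, in milliseconds, of one x.max() call.

    xp is the array module (numpy or cupy). sync is a callable that blocks
    until queued device work has finished; for NumPy it is a no-op.
    time_max is a hypothetical helper written for this report.
    """
    x = xp.random.normal(size=shape).astype(dtype)
    x.max()  # warm-up call, excluded from the measurement
    sync()
    start = time.perf_counter()
    for _ in range(repeats):
        x.max()
    sync()  # wait for the device before stopping the clock
    return (time.perf_counter() - start) * 1000.0 / repeats


if __name__ == "__main__":
    backends = [("numpy", np, lambda: None)]
    try:
        import cupy as cp
        backends.append(("cupy", cp, cp.cuda.Stream.null.synchronize))
    except ImportError:
        pass  # CuPy not installed: only the NumPy timings are printed
    for name, xp, sync in backends:
        for dtype in ("float32", "float64"):
            ms = time_max(xp, dtype, sync=sync)
            print("{} {} : {:.4f} ms per call".format(name, dtype, ms))
```

If the float64 timing on CuPy turns out to be much lower than the float32 timing, that would point at the float32 reduction path specifically rather than at launch or synchronization overhead.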

Labels: cat:performance (Performance in terms of speed or memory consumption), pr-ongoing
