CuPy has no mechanism that gives the user (thread-local) control over its behavior, e.g. how to trade off performance against debuggability/NumPy compatibility at the routine level. This is not an issue for most routines, but there are cases where it matters:
Allow synchronous CUDA kernel calls to report proper errors (e.g. check ndarray values or device status codes and report errors similar to NumPy) instead of writing special values or invoking undefined behavior. Related: Missing error checks for cuSOLVER calls #2414
Allow slower execution in exchange for deterministic results (e.g. CUB).
How about introducing a configuration mechanism to CuPy? It would add some overhead to existing code but would make CuPy more usable.
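To make the proposal concrete, here is a minimal sketch of what a thread-local configuration could look like. Everything in it is an assumption for discussion: the names `config`, `using_config`, `strict_errors` and `deterministic` are hypothetical and are not existing CuPy APIs.

```python
import contextlib
import threading

# Hypothetical flags and defaults; neither name is an existing CuPy option.
_DEFAULTS = {"strict_errors": False, "deterministic": False}


class _ThreadLocalConfig(threading.local):
    """Per-thread settings; every thread starts from the defaults."""

    def __init__(self):
        # threading.local calls __init__ once in each thread on first access.
        self.__dict__.update(_DEFAULTS)


config = _ThreadLocalConfig()


@contextlib.contextmanager
def using_config(**overrides):
    """Temporarily override settings for the current thread only."""
    unknown = set(overrides) - set(_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    # Remember the current values so they can be restored on exit.
    saved = {name: getattr(config, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(config, name, value)
        yield config
    finally:
        for name, value in saved.items():
            setattr(config, name, value)
```

Under this sketch, a routine would read a flag such as `config.strict_errors` to decide whether to synchronize and check status codes, so the overhead for existing code would be roughly one attribute lookup per call. A user would opt in with `with using_config(strict_errors=True): ...`, and other threads would keep their own settings.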