implement rescaleData_blocked #1648
Conversation
|
@j9ac9k, could I trouble you to run benchmarks/time_rescale.py on macOS? Once without changes, and once more after adding 'darwin' to fast_float_clip (within the file benchmarks/time_rescale.py) |
Hope this helps! |
From the RDB float32 values, macOS benefits from using umath.clip over minmax. (also seen in #1641 (comment)) |
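For anyone wanting to reproduce the comparison outside pyqtgraph, here is a minimal standalone sketch (the array shape and clip limits are illustrative) contrasting a single np.clip call with the equivalent minimum/maximum chain:

```python
import numpy as np
import timeit

# illustrative float32 image-sized array
a = np.random.random((3072, 3072)).astype(np.float32)

def clip_way():
    return np.clip(a, 0.25, 0.75)

def minmax_way():
    return np.minimum(np.maximum(a, 0.25), 0.75)

# the two formulations are numerically identical...
assert np.array_equal(clip_way(), minmax_way())

# ...but their relative speed can differ by platform (e.g. macOS per this thread)
print('clip  :', timeit.timeit(clip_way, number=10))
print('minmax:', timeit.timeit(minmax_way, number=10))
```

The absolute timings are machine-dependent; only the ratio between the two is interesting here.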
I saw that in the files you added, you pasted just a copy of the console output; if you'd like it formatted (so you don't have to hand-edit the markdown syntax used to generate tables), let me know. |
|
so... I may have come across a bug here in the most random of manners... I happened to be on this branch while evaluating some potentially old/stale issues to see if I could close them, and I came across issue #590, which has a really short and sweet example:

```python
import pyqtgraph as pg
from pyqtgraph.Qt import QtCore, QtGui
import numpy

plt = pg.image(numpy.random.normal(size=(100, 100)) * 10**50)

if __name__ == '__main__':
    import sys
    if sys.flags.interactive != 1 or not hasattr(pg.QtCore, 'PYQT_VERSION'):
        pg.QtGui.QApplication.exec_()
```

which generated the following exception. To be fair, an exception was generated on the master branch as well... but the image displayed. While experimenting here, an error is generated on both the master and rescale_blocked branches when the exponent is 20, but no error occurs when the exponent is 19. Not sure this is actually an error with this implementation, but it is something that caught my attention. |
|
change `plt = pg.image(numpy.random.normal(size=(100,100))*10**50)` to `plt = pg.image(numpy.random.normal(size=(100,100))*1e50)` and it will work.

```python
In [89]: (np.array([1., 2.])*10**50).dtype
Out[89]: dtype('O')

In [90]: (np.array([1., 2.])*1e50).dtype
Out[90]: dtype('float64')

In [94]: (np.array([1., 2.])*(2**64-1)).dtype
Out[94]: dtype('float64')

In [95]: (np.array([1., 2.])*(2**64)).dtype
Out[95]: dtype('O')
```

This probably has to do with the multiplicand being an integer that no longer fits within 64 bits, at which point some implementation details leak out. |
|
dtype('O') is 'object'. Numpy is desperate enough to pack oversized python ints into the array. Do you think dtype('O') should maybe raise a helpful error message? |
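If such a guard were added, it might look something like this minimal sketch (`check_dtype` is a hypothetical helper for illustration, not pyqtgraph API):

```python
import numpy as np

def check_dtype(data):
    # hypothetical guard: reject object-dtype arrays with a clear message
    # instead of letting downstream code fail with a cryptic error
    if data.dtype.kind == 'O':
        raise TypeError(
            "object-dtype array not supported; did an oversized Python int "
            "promote the array to dtype('O')? Use a float literal (e.g. 1e50)."
        )
    return data

check_dtype(np.array([1., 2.]) * 1e50)      # fine: float64 passes through
# check_dtype(np.array([1., 2.]) * 10**50)  # would raise TypeError
```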
|
For info, I can get the same behavior as master by adding the "refs_ok" flag to nditer, i.e. the image displays. |
|
If that doesn't break anything for the rest of your code, then adding the "refs_ok" flag seems reasonable to me. |
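For context, here is a standalone illustration of what the "refs_ok" flag changes for np.nditer (pure numpy, independent of pyqtgraph internals):

```python
import numpy as np

# an array of oversized Python ints ends up with dtype('O')
arr = np.array([1, 2]) * 10**50
assert arr.dtype.kind == 'O'

# by default, nditer refuses arrays holding Python object references
try:
    np.nditer(arr)
except TypeError:
    print("nditer rejects object dtype without refs_ok")

# with refs_ok, iterating over the object array is permitted
total = 0
for x in np.nditer(arr, flags=['refs_ok']):
    total += x.item()
print(total)  # 3 * 10**50
```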
|
@pijyoi if you ever decide to teach some numpy training course sometime, please let me know, I would sign up. I would have never thought to differentiate between In terms of |
|
It is now a fair comparison between Note: if testing on a laptop, remember to plug in the power. |
|
@pijyoi since you're not on |
|
Timings for time_rescale.py, executed on the same machine so that the timings are directly comparable.
|
|
I have removed benchmarks/time_rescale.py.

```python
import time
import numpy as np
from pyqtgraph.functions import rescaleData, rescaleData_blocked

shape = 3072, 3072
data_uint8 = np.random.randint(256, size=shape, dtype=np.uint8)
data_float32 = np.random.random(shape).astype(np.float32)
niters = 10

for data in [data_uint8, data_float32]:
    for alpha in [0.1, 0.25]:
        if data.dtype.kind == 'u':
            minval, maxval = alpha * 255, (1 - alpha) * 255
        else:
            minval, maxval = alpha, 1 - alpha
        scale = 255 / (maxval - minval)
        offset = minval
        params = data, scale, offset, np.uint8

        t0 = time.perf_counter()
        for _ in range(niters):
            output = rescaleData(*params)
        t1 = time.perf_counter()
        print('RD ', data.dtype, alpha, t1 - t0)

        t2 = time.perf_counter()
        for _ in range(niters):
            output = rescaleData_blocked(*params)
        t3 = time.perf_counter()
        print('RDB', data.dtype, alpha, t3 - t2)
``` |
|
@pijyoi this PR now has some merge conflicts; would you mind resolving them? |
clip limits should be int if data is int
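The commit title can be illustrated with plain numpy (standalone sketch of numpy's promotion behavior, independent of pyqtgraph): float clip limits promote an integer array's result to a floating dtype, while integer limits preserve the input dtype.

```python
import numpy as np

data = np.arange(10, dtype=np.uint8)

# float clip limits promote the result to a floating dtype
print(np.clip(data, 0.5, 8.5).dtype.kind)   # 'f'

# integer clip limits keep the original integer dtype
print(np.clip(data, 1, 8).dtype)            # uint8
```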
|
This LGTM, I'm going to give @outofculture some time to look over this. |
|
Of course this is broken for CuPy (ugh; the lack of cupy CI is once again felt). I looked into it, and it's not trivial to fix (see this bug). Best we can do for now is reference that bug and use the old code with CuPy. |
outofculture left a comment
I looked into it, and it's not trivial to fix
Though feel free to prove me wrong in this respect!
pyqtgraph/functions.py (Outdated)

```diff
 return data
+
+
+def rescaleData_blocked(data_in, scale, offset, dtype=None, clip=None):
```
Other than the need for CuPy compatibility, this new version should always be used, yes? This code is meant to completely replace the old version, for all behaviors. Let's make that more explicit, and name things accordingly. We should bring over the docstring at the same time, and we should give some guidance as to which version of rescaleData a user should use.
Actually, np.nditer would hardly be worth implementing for CuPy. Processing block by block would be an anti-pattern for CUDA.
During the initial implementation, it was indeed meant to be a replacement. However, I can guess from the optional arguments that many functions in pyqtgraph must have grown from their original implementation to add functionality, e.g. for rescaleData(), the clip argument must have been added for the combined levels + lut usage in ImageItem.py. Having to support all these optional features makes it more difficult to optimize for the specific use-cases that do have bottlenecks.
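For readers unfamiliar with the block-by-block idea being discussed, here is a hedged standalone sketch of rescaling via np.nditer buffering (`rescale_blocked_sketch` and the block size are illustrative, not the actual pyqtgraph implementation):

```python
import numpy as np

def rescale_blocked_sketch(data, scale, offset, dtype=np.uint8):
    # process the array chunk by chunk so intermediates stay cache-sized
    out = np.empty(data.shape, dtype=dtype)
    it = np.nditer(
        [data, out],
        flags=['external_loop', 'buffered'],
        op_flags=[['readonly'], ['writeonly']],
        buffersize=32768,  # illustrative block size
    )
    with it:  # context manager flushes buffered writes back into `out`
        for chunk_in, chunk_out in it:
            chunk_out[...] = np.clip((chunk_in - offset) * scale, 0, 255)
    return out

data = np.random.random((512, 512)).astype(np.float32)
blocked = rescale_blocked_sketch(data, scale=255.0, offset=0.1)

# whole-array reference computation for comparison
whole = np.clip((data - 0.1) * 255.0, 0, 255).astype(np.uint8)
assert np.array_equal(blocked, whole)
```

The blocked form trades a little per-chunk overhead for much smaller temporary arrays, which is where the speedup on large images comes from.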
pyqtgraph/functions.py (Outdated)

```diff
 rng = maxVal-minVal
 rng = 1 if rng == 0 else rng
-data = rescaleData(data, scale/rng, minVal, dtype=dtype)
+data = rescaleData_dispatch(xp, data, scale/rng, minVal, dtype, nanMask is not None)
```
You didn't also use the new function for pyqtgraph/graphicsItems/ImageItem.py:431. Is this a case where the new code is legitimately not appropriate? I don't see how that's the case, seeing as it's nearly identical data to what applyLookupTable/makeARGB is dealing with.
I definitely agree that protecting narrow use-case optimizations is something we need to do in this library. Your new version of the code should be the current first choice of any user trying to rescale data. It's completely functionally equivalent, so far as I can tell, and generally performs better. Trying to preemptively protect the code from later pessimizations is still a good idea, but let's do it while still giving the users a best first choice.
Long-term, I think the right way to protect (here and elsewhere) is with continuous benchmarking.
So I guess more explicitly, I recommend you take over the rescaleData function with rescaleData_dispatch (probably also restoring the signature and just copying the xp = ... code).
So I guess more explicitly, I recommend you take over the rescaleData function with rescaleData_dispatch
I have taken over rescaleData() and made _rescaleData_nditer() a private helper function.
You didn't also use the new function for pyqtgraph/graphicsItems/ImageItem.py:431
That's the case where only 256 elements get scaled. But due to the above change, all uses of rescaling within pyqtgraph now go through the same rescaleData().
This PR continues the rescaleData_blocked() function implementation spun off from #1617.
A standalone test file depending only on numpy has been put temporarily in benchmarks/time_rescale.py.
Some improvements have been made to the benchmarking script:
- A slowdown in the fits_int32 codepath was found
- From #1641 (comment), it was found that:

These findings have not been incorporated yet, pending more verification.