implement rescaleData_blocked by pijyoi · Pull Request #1648 · pyqtgraph/pyqtgraph

pijyoi · 2021-03-21T09:04:38Z

This PR continues the rescaleData_blocked() function implementation spun off from #1617.
A standalone test file depending only on numpy has been put temporarily in benchmarks/time_rescale.py.

Some improvements have been made to the benchmarking script:

tests both uint8 and float32
ensures that there are values outside both clipping bounds
- this exercises any branch prediction that may be taking place

A slowdown in the fits_int32 codepath was found

for int32 data types, the clip limits provided to the clip function should be integer-valued. otherwise, it will be slower, particularly so on Windows. Probably different versions of the ufunc get selected based on its input arguments.

From #1641 (comment), it was found that:

on Linux, numpy 1.20 runs faster than 1.17, probably due to SIMD implemented on 1.20.
on macOS, numpy 1.20 and 1.17 are equally fast
using umath.minimum(umath.maximum()) instead of umath.clip() is only an improvement on Windows. It is in fact a pessimization to use it on Linux and macOS. (For Linux, rescaleData_blocked() already takes umath.clip() codepath)
These findings have not been incorporated yet, pending more verification.

pijyoi · 2021-03-21T09:08:45Z

@j9ac9k , could I trouble you to run benchmarks/time_rescale.py on macOS? Once without changes and once more after making 'darwin' to belong to fast_float_clip (within the file benchmarks/time_rescale.py)

j9ac9k · 2021-03-21T15:27:13Z

@j9ac9k , could I trouble you to run benchmarks/time_rescale.py on macOS? Once without changes and once more after making 'darwin' to belong to fast_float_clip (within the file benchmarks/time_rescale.py)

f()	dtype	alpha	runtime (standard)	runtime (darwin in fast_float_clip)
RD	uint8	0.1	0.436254924	0.44594586799999997
RDB	uint8	0.1	0.2032192060000001	0.19190403199999995
RD	uint8	0.25	0.42308679299999996	0.41383199300000006
RDB	uint8	0.25	0.18000498899999995	0.19028552100000007
RD	float32	0.1	0.40843685100000027	0.41462677700000006
RDB	float32	0.1	0.452220552	0.39034072399999964
RD	float32	0.25	0.3910386429999999	0.40077267699999997
RDB	float32	0.25	0.45443687899999974	0.39704211099999975

Hope this helps!

pijyoi · 2021-03-21T21:37:42Z

Hope this helps!

From the RDB float32 values, macOS benefits from using umath.clip over minmax. (also seen in #1641 (comment))

j9ac9k · 2021-03-21T21:40:41Z

Hope this helps!

From the RDB float32 values, macOS benefits from using umath.clip over minmax. (also seen in #1641 (comment))

I saw in the files you added you put just a copy/paste of the console output, if you'd like that (so you don't have to manipulate the markdown syntax used to generate tables) let me know.

j9ac9k · 2021-03-22T04:54:29Z

so... I may have come across a bug here in the most random of manners... I happened to be on this branch when evaluating some potentially old/stale issues to see if I can close them, and then I came across issue #590 which has a really short and sweet example

import pyqtgraph as pg
from pyqtgraph.Qt import QtCore, QtGui
import numpy

plt = pg.image(numpy.random.normal(size=(100,100))*10**50) 

if __name__ == '__main__':
    import sys
    if sys.flags.interactive != 1 or not hasattr(pg.QtCore, 'PYQT_VERSION'):
        pg.QtGui.QApplication.exec_()

which generated the following exception:

Traceback (most recent call last):
  File "/Users/ogi/Developer/pyqtgraph/pyqtgraph/graphicsItems/HistogramLUTItem.py", line 224, in imageChanged
    h = self.imageItem().getHistogram()
  File "/Users/ogi/Developer/pyqtgraph/pyqtgraph/graphicsItems/ImageItem.py", line 515, in getHistogram
    mn = self._xp.nanmin(stepData).item()
AttributeError: 'float' object has no attribute 'item'
Traceback (most recent call last):
  File "/Users/ogi/Developer/pyqtgraph/pyqtgraph/graphicsItems/ImageItem.py", line 462, in paint
    self.render()
  File "/Users/ogi/Developer/pyqtgraph/pyqtgraph/graphicsItems/ImageItem.py", line 451, in render
    fn.makeARGB(image, lut=lut, levels=levels, output=self._processingBuffer)
  File "/Users/ogi/Developer/pyqtgraph/pyqtgraph/functions.py", line 1287, in makeARGB
    data = rescaleData_blocked(data, scale/rng, minVal, dtype=dtype)
  File "/Users/ogi/Developer/pyqtgraph/pyqtgraph/functions.py", line 1106, in rescaleData_blocked
    it = np.nditer([data_in, data_out],
TypeError: Iterator operand or requested dtype holds references, but the REFS_OK flag was not enabled

To be fair, an exception was generated on the master branch as well... but the image displayed...

Traceback (most recent call last):
  File "/Users/ogi/Developer/pyqtgraph/pyqtgraph/graphicsItems/HistogramLUTItem.py", line 224, in imageChanged
    h = self.imageItem().getHistogram()
  File "/Users/ogi/Developer/pyqtgraph/pyqtgraph/graphicsItems/ImageItem.py", line 515, in getHistogram
    mn = self._xp.nanmin(stepData).item()
AttributeError: 'float' object has no attribute 'item'

While experimenting here, an error is generated on the master and rescale_blocked branches when the exponent is 20, but no error occurs if the exponential value is 19.

Not sure this is actually an error here with this implementation, but something that caught my attention.

pijyoi · 2021-03-22T07:57:08Z

change

plt = pg.image(numpy.random.normal(size=(100,100))*10**50)

to

plt = pg.image(numpy.random.normal(size=(100,100))*1e50)

and it will work.

In [89]: (np.array([1., 2.])*10**50).dtype
Out[89]: dtype('O')

In [90]: (np.array([1., 2.])*1e50).dtype
Out[90]: dtype('float64')

In [94]: (np.array([1., 2.])*(2**64-1)).dtype
Out[94]: dtype('float64')

In [95]: (np.array([1., 2.])*(2**64)).dtype
Out[95]: dtype('O')

Probably has to do with the multiplicand being an integer that no longer fits within 64-bits and then some implementation details leak out.

NilsNemitz · 2021-03-22T08:53:20Z

dtype('O') is 'object'. Numpy is desperate enough to pack oversized python ints into the array.

Do you think dtype('O') should maybe raise a helpful error message?
Because (np.array([1., 2.])*(2**64)) gives
array([1.8446744073709552e+19, 3.6893488147419103e+19], dtype=object)
which looks fine (except for the dtype), and it would be annoying to debug why the array prints, but doesn't plot.

pijyoi · 2021-03-22T10:46:34Z

For info, I can get the same behavior as master by adding the "refs_ok" flag to nditer. i.e. Image displays.
Whether that causes any side-effects, I wouldn't know.

NilsNemitz · 2021-03-22T15:13:54Z

If that doesn't break anything for the rest of your code, then adding the "refs_ok" flag seems reasonable to me.

j9ac9k · 2021-03-22T16:34:54Z

@pijyoi if you ever decide to teach some numpy training course sometime, please let me know, I would sign up.

I would have never thought to differentiate between 10**50 and 1e50

In terms of refs_ok if it were any other part of the library I would say we should probably enable it, but this part of the library we make some decisions based on numpy dtypes here, and I'm not sure having it work with object dtypes is in our best interest.

pijyoi · 2021-03-25T03:03:37Z

It is now a fair comparison between rescaleData() and rescaleData_blocked().
Previously, rescaleData() was being penalized by using a slower clipping path.

Note: if testing on a laptop, remember to plug in the power.

j9ac9k · 2021-03-25T20:26:34Z

@pijyoi since you're not on ~~your~~ our slack, I couldn't thank you directly there, so instead I'm going to post some completely off-topic to this ~~issue post~~ pull request; to explicitly thank you for your contributions that have made the 0.12.0 release pretty awesome. If you're interested in hoping on our slack, let me know and I'll send you an invite.

pijyoi · 2021-03-28T13:27:44Z

Timings for time_rescale.py executed on the same machine, so the timings are directly comparable.
Numpy version is relevant.
Overall, nditer implementation is a win.

	WSL2 numpy 1.20.2	Windows numpy 1.20.2
RD uint8 0.1	0.276985	0.579306
RDB uint8 0.1	0.106553	0.299663
RD uint8 0.25	0.26142	0.865689
RDB uint8 0.25	0.107774	0.302405
RD float32 0.1	0.250691	0.537989
RDB float32 0.1	0.123221	0.441421
RD float32 0.25	0.250870	0.826956
RDB float32 0.25	0.120945	0.746094

	WSL2 numpy 1.17.4	Windows numpy 1.17.5
RD uint8 0.1	0.367519	0.573480
RDB uint8 0.1	0.157658	0.389136
RD uint8 0.25	0.478767	0.861280
RDB uint8 0.25	0.154246	0.481840
RD float32 0.1	0.351139	0.532844
RDB float32 0.1	0.232889	0.439551
RD float32 0.25	0.463790	0.821704
RDB float32 0.25	0.347609	0.733802

pijyoi · 2021-03-29T01:08:11Z

I have removed benchmarks/time_rescale.py.
It is included below (although no longer standalone.)

import time
import numpy as np
from pyqtgraph.functions import rescaleData, rescaleData_blocked

shape = 3072, 3072
data_uint8 = np.random.randint(256, size=shape, dtype=np.uint8)
data_float32 = np.random.random(shape).astype(np.float32)
niters = 10

for data in [data_uint8, data_float32]:
    for alpha in [0.1, 0.25]:
        if data.dtype.kind == 'u':
            minval, maxval = alpha * 255, (1 - alpha) * 255
        else:
            minval, maxval = alpha, 1 - alpha
        scale = 255 / (maxval - minval)
        offset = minval
        params = data, scale, offset, np.uint8

        t0 = time.perf_counter()
        for _ in range(niters):
            output = rescaleData(*params)
        t1 = time.perf_counter()
        print('RD ', data.dtype, alpha, t1 - t0)

        t2 = time.perf_counter()
        for _ in range(niters):
            output = rescaleData_blocked(*params)
        t3 = time.perf_counter()
        print('RDB', data.dtype, alpha, t3 - t2)

j9ac9k · 2021-04-02T03:37:44Z

@pijyoi this PR now has some merge conflicts; would you mind resolving them?

clip limits should be int if data is int

j9ac9k · 2021-04-02T15:37:54Z

This LGTM, I'm going to give @outofculture some time to look over this.

outofculture · 2021-04-02T19:20:17Z

Of course this is broken for CuPy (ugh; the lack of cupy CI is once again felt).

I looked into it, and it's not trivial to fix ( see this bug ). Best we can do for now is reference that bug and use the old code with CuPy.

outofculture

I looked into it, and it's not trivial to fix

Though feel free to prove me wrong in this respect!

outofculture · 2021-04-02T19:27:53Z

pyqtgraph/functions.py

    return data


+def rescaleData_blocked(data_in, scale, offset, dtype=None, clip=None):


Other than the need for CuPy compatibility, this new version should always be used, yes? This code is meant to completely replace the old version, for all behaviors. Let's make that more explicit, and name things accordingly. We should bring over the docstring at the same time, and we should give some guidance as to which version of rescaleData a user should use.

Actually, np.nditer would be hardly worth implementing for Cupy. It would be an anti-pattern for CUDA to process block by block.

During the initial implementation, it was indeed meant to be a replacement. However, I can guess from the optional arguments that many functions in pyqtgraph must have grown from their original implementation to add additional functionality. e.g. for rescaleData(), clip argument must have been added for the levels + lut combine usage in ImageItem.py. Having to add support for all these optional functionalities makes it more difficult to optimize for specific use-cases that do have bottlenecks.

outofculture · 2021-04-02T22:44:04Z

pyqtgraph/functions.py

                rng = maxVal-minVal
                rng = 1 if rng == 0 else rng
-                data = rescaleData(data, scale/rng, minVal, dtype=dtype)
+                data = rescaleData_dispatch(xp, data, scale/rng, minVal, dtype, nanMask is not None)


You didn't also use the new function for pyqtgraph/graphicsItems/ImageItem.py:431. Is this a case where the new code is legitimately not appropriate? I don't see how that's the case, seeing as it's nearly identical data to what applyLookupTable/makeARGB is dealing with.

I definitely agree that protecting narrow use-case optimizations is something we need to do in this library. Your new version of the code should be the current first choice of any user trying to rescale data. It's completely functionally equivalent, so far as I can tell, and generally performs better. Trying to preemptively protect the code from later pessimizations is still a good idea, but let's do it while still giving the users a best first choice.

Long-term, I think the right way to protect ( here and elsewhere ) is with continuous benchmarking.

So I guess more explicitly, I recommend you take over the rescaleData function with rescaleData_dispatch ( probably also restoring the signature and just copying the xp = ... code ).

So I guess more explicitly, I recommend you take over the rescaleData function with rescaleData_dispatch

I have taken over rescaleData() and made _rescaleData_nditer() a private helper function.

You didn't also use the new function for pyqtgraph/graphicsItems/ImageItem.py:431

That's the case where only 256 elements get scaled. But due to the above change, all uses of rescaling within pyqtgraph now go through the same rescaleData().

pijyoi mentioned this pull request Mar 21, 2021

refactor code to replace np.clip #1641

Merged

pijyoi mentioned this pull request Mar 21, 2021

Fix functions.clip_array() #1649

Merged

pijyoi mentioned this pull request Mar 23, 2021

Optimize makeARGB for ubyte images #1617

Merged

pijyoi mentioned this pull request Mar 25, 2021

HistogramLUTItem sigLevelsChanged problem #1512

Open

pijyoi force-pushed the rescale_blocked branch from 9344b33 to 04b6958 Compare March 25, 2021 20:23

pijyoi mentioned this pull request Mar 28, 2021

make ImageItem combine levels + lut not be a pessimization #1668

Closed

pijyoi force-pushed the rescale_blocked branch from 04b6958 to e46146d Compare March 28, 2021 12:49

pijyoi marked this pull request as ready for review March 29, 2021 01:08

pijyoi added 2 commits April 2, 2021 12:50

implement rescaleData_blocked

ba4c812

clip limits should be int if data is int

add test for rescaleData_blocked

11ad335

pijyoi force-pushed the rescale_blocked branch from 29c9870 to 11ad335 Compare April 2, 2021 04:52

outofculture reviewed Apr 2, 2021

View reviewed changes

dispatch to different versions depending on numpy or cupy

27038c0

outofculture reviewed Apr 2, 2021

View reviewed changes

make rescaleData() the only entry-point

45f1f24

outofculture merged commit aa57c7a into pyqtgraph:master Apr 6, 2021

pijyoi deleted the rescale_blocked branch April 7, 2021 08:55

		return data


		def rescaleData_blocked(data_in, scale, offset, dtype=None, clip=None):

Uh oh!

Conversation

pijyoi commented Mar 21, 2021

Uh oh!

pijyoi commented Mar 21, 2021

Uh oh!

j9ac9k commented Mar 21, 2021

Uh oh!

pijyoi commented Mar 21, 2021

Uh oh!

j9ac9k commented Mar 21, 2021

Uh oh!

j9ac9k commented Mar 22, 2021

Uh oh!

pijyoi commented Mar 22, 2021

Uh oh!

NilsNemitz commented Mar 22, 2021

Uh oh!

pijyoi commented Mar 22, 2021

Uh oh!

NilsNemitz commented Mar 22, 2021

Uh oh!

j9ac9k commented Mar 22, 2021

Uh oh!

pijyoi commented Mar 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

j9ac9k commented Mar 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pijyoi commented Mar 28, 2021

Uh oh!

pijyoi commented Mar 29, 2021

Uh oh!

j9ac9k commented Apr 2, 2021

Uh oh!

j9ac9k commented Apr 2, 2021

Uh oh!

outofculture commented Apr 2, 2021

Uh oh!

outofculture left a comment

Choose a reason for hiding this comment

Uh oh!

outofculture Apr 2, 2021

Choose a reason for hiding this comment

Uh oh!

pijyoi Apr 2, 2021

Choose a reason for hiding this comment

Uh oh!

outofculture Apr 2, 2021

Choose a reason for hiding this comment

Uh oh!

outofculture Apr 2, 2021

Choose a reason for hiding this comment

Uh oh!

pijyoi Apr 2, 2021

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

pijyoi commented Mar 25, 2021 •

edited

Loading

j9ac9k commented Mar 25, 2021 •

edited

Loading