
Make auto downsample factor calculation more robust #2253

Merged

j9ac9k merged 4 commits into pyqtgraph:master from StSav012:patch-8 on Aug 28, 2022

Conversation

@StSav012
Contributor

The proposed changes enable ignoring the infinite and NaN values of `x` and avoiding a division by zero if `self.opts['autoDownsampleFactor'] == 0`.
@j9ac9k
Member

j9ac9k commented Apr 20, 2022

Hi @StSav012

Thanks for the PR, this will definitely make an improvement. I'm a little concerned with the equality check at the bottom of your diff

```python
if width*self.opts['autoDownsampleFactor'] != 0.0 and np.isfinite(x1-x0):
```

I know the code you're replacing was `if width != 0.0`, but I'm wondering if you think using something like `not math.isclose()` would be more appropriate here. I'm ok with merging as is, but wanted to get your thoughts on the equality check first.

@StSav012
Contributor Author

StSav012 commented Apr 20, 2022

Hi @j9ac9k

It's a good point. I'm not confident how to tell a sufficient tolerance from a bad one. After your question, I'd rather test that `(x1-x0) / (width*self.opts['autoDownsampleFactor'])` is still finite. This way there is no need for any tolerance. Kind of like this:

```python
if width * self.opts['autoDownsampleFactor'] != 0.0:
    ds_float = (x1 - x0) / (width * self.opts['autoDownsampleFactor'])
    if np.isfinite(ds_float):
        ds = int(max(1, int(ds_float)))
```

And do we require both `int` conversions on the last line above?

@j9ac9k
Member

j9ac9k commented Apr 20, 2022

`width` should always be of type `float`, and `autoDownsampleFactor` can be `float` or `int`. `autoDownsampleFactor` should be > 1.0, so we shouldn't run into any weirdness; this is likely a non-issue.

Also, for singular values I generally prefer to use `math.isfinite` vs. `numpy.isfinite`, but there was an issue some time ago where it turned out the two behave differently (I should try and search the issue/PR tracker for more specifics).

```python
width = self.getViewBox().width()
if width != 0.0:
    ds = int(max(1, int((x1-x0) / (width*self.opts['autoDownsampleFactor']))))
if width != 0.0:  # autoDownsampleFactor _should_ be > 1.0
```
Member

Hi @StSav012

I think for this equality comparison to 0.0 we should use `math.isclose(width, 0.0, abs_tol=1e-9)`.

Contributor

With a `not` in front of it.

Contributor Author

Hi, @j9ac9k @outofculture

I can't agree with the approach you propose. Not yet, at least.

The check for infinity takes care of the cases when `width` is not exactly zero but close to it. I'm afraid that with `isclose` the behavior would not be straightforward; it would depend on a tolerance picked essentially at random. Say, if the x span is huge (like 1e308), we might still get infinity even if `width` is significantly non-zero. However, if `width` is exactly 0.0, the function fails with `ZeroDivisionError`, and the rest of the function gets ignored. On the other hand, if `view_range.right() - view_range.left()` (why do we need `x0` and `x1` at all?) is small (say, equal to the `isclose` tolerance), the picked tolerance might be far too rough.
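The two failure modes described above can be seen with plain Python floats (illustrative values, not taken from the PR):

```python
import math

span = 1e308            # a huge x span, as in the example above
width = 0.5             # clearly non-zero, yet...
ratio = span / width
print(ratio)            # inf: the quotient overflows even though width != 0
print(math.isfinite(ratio))   # False, so a finiteness check catches it

try:
    span / 0.0          # an exactly-zero divisor raises instead of returning inf
except ZeroDivisionError as exc:
    print(exc)          # float division by zero
```

So the finiteness check covers the overflow case without any hand-picked tolerance, while exact zero must still be guarded separately.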

Member

My understanding is that `width` here isn't in view-range coordinates, but corresponds to `QGraphicsWidget.geometry().width()`, which returns the `QRectF` that represents the position and size of the widget, not the view space.

That said, the tolerance is picked effectively at random, so you're not wrong there...

Contributor

Hmm. Looking further, `QRect.width()` should be an integer, so I don't think this protection will ever matter either way. We could even wrap it in a `try/except ZeroDivisionError: pass`, if that's all we're guarding against, with the added possible bonus of protecting against `self.opts['autoDownsampleFactor'] == 0`, too. That leaves the question of whether we need to treat e.g. `1 / ((0.3 - 0.2) - 0.1)` as a divide-by-zero, which we would normally want to do. In this case, if an application sets `self.opts['autoDownsampleFactor']` to e.g. 1e-15, they're going to get bad behavior ("please up-sample my data"?), so we can probably safely ignore this aspect of the problem. Our knee-jerk resistance to comparing floats to specific values likely doesn't matter either way here, but the sentiment is a good one, so it might be worthwhile to put in the `isclose` just to uphold that principle.
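The `(0.3 - 0.2) - 0.1` example can be checked directly; this standalone demo (not code from the PR) shows why an exact comparison to 0.0 misses the case while `math.isclose` with the proposed tolerance catches it:

```python
import math

d = (0.3 - 0.2) - 0.1     # algebraically zero, but a tiny residue in binary floats
print(d)                  # on the order of 1e-17, not 0.0
print(1 / d)              # a huge but finite quotient, no ZeroDivisionError raised

print(d != 0.0)                            # True: exact comparison says "non-zero"
print(math.isclose(d, 0.0, abs_tol=1e-9))  # True: isclose treats it as zero
```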

Member

> Hmm. Looking further, QRect.width() should be an integer, so I don't think this protection will ever matter either way.

`QRect.width()` should absolutely be an integer, but we're dealing w/ a `QRectF` here ;)

https://doc-snapshots.qt.io/qt6-dev/qgraphicslayoutitem.html#geometry

> We could wrap it in a try/except ZeroDivisionError: pass, even, if that's all we're guarding against, with the added possible bonus of protecting against self.opts['autoDownsampleFactor'] == 0, too.

nit-pick level 100; if we do that we should use

```python
import contextlib

...

with contextlib.suppress(ZeroDivisionError):
    # do the division here
```

> That leaves the question of whether we need to treat e.g. 1 / ((0.3 - 0.2) - 0.1) as a divide-by-zero, which we would normally want to do. In this case, if an application sets self.opts['autoDownsampleFactor'] to e.g. 1e-15, they're going to get bad behavior ("please up-sample my data"?), so we can probably safely ignore this aspect of the problem. Our knee-jerk resistance to comparing floats to specific values likely doesn't matter either way here, but the sentiment is a good one, so it might be worthwhile to put in the isclose just to uphold that principle.

Yeah, @StSav012 is modifying existing code here that has some code I absolutely hate, and I'm requesting they change something else that was already in the library; probably not the most fair ask. I'm good w/ leaving it for this PR, and addressing the float equality comparisons in another PR.
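For reference, the `contextlib.suppress` idiom suggested above behaves like this (a standalone demo, not code from the PR; the variable names are placeholders):

```python
import contextlib

ds = 1                # default downsample factor
denominator = 0.0     # e.g. width * autoDownsampleFactor came out as exactly 0.0

with contextlib.suppress(ZeroDivisionError):
    ds = max(1, int(10.0 / denominator))  # raises, so the assignment is skipped

print(ds)  # 1: the error was swallowed and the default kept
```

Note that this only helps for an exactly-zero denominator; a tiny non-zero one still divides "successfully", which is the float-residue concern raised above.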

@j9ac9k
Member

j9ac9k commented May 2, 2022

Hi @StSav012

I chatted w/ @outofculture today, and we agreed the equality comparison to 0.0 is problematic and settled on a tolerance we felt was appropriate. If you don't mind making that change, I'll merge this right after 👍🏻

@StSav012
Contributor Author

Consider the following lines:

```python
                    x0 = (view_range.left()-finite_x[0]) / dx
                    x1 = (view_range.right()-finite_x[0]) / dx
```

And later,

```python
                        ds_float = max(1.0, (x1 - x0) / (width * self.opts['autoDownsampleFactor']))
```

See that `(x1 - x0)` holds the width of `view_range`, divided by `dx`. The values of `x0` and `x1` are not used anywhere else. The values of `x0` and `x1` might be infinite after the division by `dx`, whereas the difference might be finite. So, firstly, I calculate the difference `view_range.right() - view_range.left()`, and then I divide it by `dx`.

The initial code relies implicitly on the direction of the axis, with `(x1 - x0)` always non-negative. I enclose the expression in `abs` to ensure the same behavior.
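Put together, the reworked calculation described above might look like this sketch (the function and its parameters are stand-ins for PlotDataItem internals, named after the snippets quoted in the thread, not the actual merged code):

```python
import math

def downsample_factor(view_left, view_right, dx, width, auto_ds_factor):
    # Take the view-range width first and divide by dx once: the individual
    # x0/x1 quotients could overflow to inf while their difference is finite.
    # abs() keeps the span non-negative for reversed axes, matching the old
    # (x1 - x0) behaviour.
    span = abs(view_right - view_left) / dx
    denom = width * auto_ds_factor
    if denom != 0.0:
        ds_float = span / denom
        if math.isfinite(ds_float):
            return max(1, int(ds_float))
    return 1  # fall back to no downsampling

print(downsample_factor(0.0, 100.0, 1.0, 10.0, 5.0))  # 2
print(downsample_factor(0.0, 100.0, 1.0, 0.0, 5.0))   # 1 (zero denominator)
```

The finiteness check makes the zero-width and huge-span cases degrade gracefully to "no downsampling" instead of raising or producing a nonsensical factor.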
@j9ac9k
Member

j9ac9k commented Aug 28, 2022

Closing/reopening to trigger a new CI run.

@j9ac9k j9ac9k closed this Aug 28, 2022
@j9ac9k j9ac9k reopened this Aug 28, 2022
@j9ac9k
Member

j9ac9k commented Aug 28, 2022

CI all green, merging, sorry for the wait @StSav012 thanks for the PR!

@j9ac9k j9ac9k merged commit befa5de into pyqtgraph:master Aug 28, 2022
@Mriv31

Mriv31 commented May 3, 2023

`np.isfinite` is actually extremely slow and is needlessly called each time we recompute the `ds` parameter (i.e. update the view range).

If you have a very large dataset (in my case 10^7 points) and are navigating small portions of it using `clipToView` and `autoDownsample`, you do not want to recompute `np.isfinite` over the whole dataset every time you change the x range.

In my case, removing the `isfinite` check saves 0.4 s when updating the region of the dataset I am looking at. Replacing the two occurrences of `searchsorted` over the whole dataset removes another 0.4 s (but that only works if x is proportional to the index).

Those two functions are the main performance problem when zooming in and navigating very large datasets.

Maybe `isfinite` could be checked only once for a given dataset, upon loading, and only the finite values kept for what happens next? What do you think?
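One way to realize that suggestion, sketched with hypothetical names (this is not pyqtgraph API, just an illustration of filtering non-finite values once at load time):

```python
import numpy as np

class FiniteCachedCurve:
    """Sketch: drop non-finite x values once when the data is set, so every
    later view-range update skips the repeated np.isfinite scan."""

    def set_data(self, x, y):
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        mask = np.isfinite(x)        # paid once per dataset
        self.x = x[mask]
        self.y = y[mask]

    def visible(self, x0, x1):
        # Cheap per-update work; assumes x is sorted, as searchsorted requires.
        i0 = np.searchsorted(self.x, x0, side='left')
        i1 = np.searchsorted(self.x, x1, side='right')
        return self.x[i0:i1], self.y[i0:i1]

curve = FiniteCachedCurve()
curve.set_data([0.0, 1.0, np.nan, 2.0, np.inf, 3.0], [0, 1, 2, 3, 4, 5])
print(curve.visible(1.0, 2.0))  # only the finite points in [1, 2]
```

The trade-off is that the cached arrays must be rebuilt whenever the data changes, but for static datasets the per-update cost drops to two binary searches.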
