Closed
Labels
optimisation: Increases performance (hopefully)
Description
hist uses bin when more than one dimension is involved. If there are many auxiliary coordinates that do not participate in the bin step or the subsequent hist operation, bin still has to handle them, i.e., copy all of their elements. This can become costly:
import scipp as sc
da = sc.data.table_xyz(100_000_000)
da.variances = da.values
da.masks["mask1"] = da.coords["y"] > 0.5 * sc.Unit("m")
da.masks["mask2"] = da.coords["z"] > 0.5 * sc.Unit("m")
dummy = [f"dummy{i}" for i in range(10)]
for name in dummy:
    da.coords[name] = da.coords["x"].copy()
x = sc.linspace('x', 0, 1, 14*32+1, unit='m')
da.hist(x=x, y=100) # 2 s
da.drop_coords(dummy).hist(x=x, y=100) # 1 s

It should be simple to avoid this by changing the implementation of hist on the Python side.
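
A rough sketch of what such a Python-side change might look like, generalising the manual drop_coords call from the timing example. The helper name hist_dropping_unused_coords is hypothetical and not part of scipp; it only relies on the coords, drop_coords and hist members used above, and it assumes a 1-D table like the one from sc.data.table_xyz, where coordinates not named in the histogram arguments are not needed in the result.

```python
def hist_dropping_unused_coords(da, **edges):
    """Histogram `da` after dropping coordinates that are not histogrammed.

    Hypothetical helper, not part of scipp. Assumes `da` is a 1-D table
    whose coordinates are named like the keyword arguments passed to hist,
    and that coordinates not named in `edges` are not needed in the output.
    Masks are left untouched.
    """
    # Coordinates that take no part in the histogram, e.g. "z" and "dummy0".
    unused = [name for name in da.coords.keys() if name not in edges]
    if unused:
        # Dropping them first means the internal bin step need not copy them.
        da = da.drop_coords(unused)
    return da.hist(**edges)
```

With the table above, hist_dropping_unused_coords(da, x=x, y=100) would take the place of da.hist(x=x, y=100). Whether the real fix should filter by keyword name like this, or by inspecting which coordinates depend on the histogrammed dimension, is left open here.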
Status
Done