Description
According to the docstring:
When histogramming a dimension with an existing dimension-coord, the binning for
the dimension is modified, i.e., the input and the output will have the same
dimension labels.

When histogramming by non-dimension-coords, the output will have new dimensions
given by the names of these coordinates. These new dimensions replace the
dimensions the input coordinates depend on.
In practice this means that:
- A prior use of transform_coords, with or without the rename_dims option, affects the outcome of a subsequent hist.
- It is possible to indirectly control which dimensions are removed, as shown in the example below, by renaming and/or flattening dimensions:
import scipp as sc
table = sc.data.table_xyz(1000)
binned = table.bin(x=3, y=4)  # sizes {'x': 3, 'y': 4}
binned.rename_dims(y='z').hist(z=5)  # sizes {'x': 3, 'z': 5}
binned.flatten(to='z').hist(z=5)  # sizes {'z': 5}

The mechanism was introduced since it allows the algorithm to either add a new dimension or replace an existing dimension. But is it too confusing when working with multi-dimensional data?
Would it suffice to improve the docstring (I am thinking of adding concrete examples on how to control the behavior), or do we need to think of something else?
Note that related functions such as bin are also affected.