Skip to content

Conversation

@jl-wynen
Copy link
Member

Fixes #3648

I disallowed masks in all operands because supporting them is complicated. I think the output result.masks.keys() should be the union of all input masks. But the masks of x and y should not be or-ed. Instead, their elements should be selected based on the condition and then or-ed with the condition's masks.

)

# condition is missing coord
with pytest.raises(sc.DatasetError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest using

Suggested change
with pytest.raises(sc.DatasetError):
with pytest.raises(sc.DatasetError, match='<something>'):

especially since we are trying to detect multiple different exceptions.

apply_where(c_data, x_data, y_data, c_coords, x_coords, y_coords);

if (x_coords.has_value()) {
return py::cast(DataArray(new_data, x_coords.value(), {}));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I missed something, but it is not clear to me why we require c_coords to be a superset, but then drop those coords and use only value coords?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this either. I thought that we want to make sure that the condition's coords match the data. But it seemed too restrictive to require completely identical coords. Did I get it the wrong way around? Should the data's coords be a superset of the condition's?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just return the union? Or require that the condition is a variable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just return the union?

I guess we could. It's not exactly cheap to do a union of 3 inputs but it can be done.

Or require that the condition is a variable?

I wouldn't do that because it encourages discarding coords from the inputs when building the condition.

Copy link
Member

@SimonHeybrock SimonHeybrock Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just return the union?

I guess we could. It's not exactly cheap to do a union of 3 inputs but it can be done.

I thought we should require that value coords are the same (like you do already), but return the union of condition and value coords? Similar to how regular binary ops do it (there should be helpers for that somewhere).

Comment on lines 129 to 135
# differing values
with pytest.raises(sc.DatasetError):
sc.where(condition, x, y)

# differing values
with pytest.raises(sc.DatasetError):
sc.where(condition, x, y)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate? And missing match.

@jl-wynen jl-wynen merged commit 1a801ce into main Mar 31, 2025
4 checks passed
@jl-wynen jl-wynen deleted the where-for-dataarrays branch March 31, 2025 14:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support DataArray in scipp.where.

3 participants