Bug in array assignment when the mask has been hardened

Hello,

**What happened**:

This needs quite a particular set of circumstances: for an integer array with a hardened mask, assigning the numpy masked constant (`np.ma.masked`) to a range of two or more elements that already includes a masked value raises a datatype casting error.

Everything is fine if instead the mask is soft and/or the array is of floats.

**What you expected to happen**:

The "re-masking" of the already masked element should have worked silently.

**Minimal Complete Verifiable Example**:

```python
import dask.array as da
import numpy as np

# Numpy works:
a = np.ma.array([1, 2, 3, 4], dtype=int)
a.harden_mask()
a[0] = np.ma.masked
a[0:2] = np.ma.masked
print('NUMPY:', repr(a))
# NUMPY: masked_array(data=[--, --, 3, 4],
#              mask=[ True,  True, False, False],
#        fill_value=999999,
#             dtype=int64)

# The equivalent dask operations don't work:
a = np.ma.array([1, 2, 3, 4], dtype=int)
a.harden_mask()
x = da.from_array(a)
x[0] = np.ma.masked
x[0:2] = np.ma.masked
print('DASK :', repr(x.compute()))
# Traceback
#     ...
# TypeError: Cannot cast scalar from dtype('float64') to dtype('int64') according to the rule 'same_kind'
```


**Anything else we need to know?**:

I don't fully understand what numpy is doing here, other than that, when these particular circumstances are met, it appears to be sensitive to whether it is assigning a scalar or a 0-d array to the data beneath the mask. A mismatch in data type between the array and the assignment value only seems to matter if the value is a 0-d array rather than a scalar:

```python
a = np.ma.array([1, 2, 3, 4], dtype=int)
a.harden_mask()
a[0] = np.ma.masked_all(())    # 0-d float, different to dtype of a
a[0:2] = np.ma.masked_all(())  # 0-d float, different to dtype of a
print('NUMPY :', repr(a))
# Traceback
#     ...
# TypeError: Cannot cast scalar from dtype('float64') to dtype('int64') according to the rule 'same_kind'
```

```python
a = np.ma.array([1, 2, 3, 4], dtype=int)
a.harden_mask()
a[0] = np.ma.masked_all((), dtype=int)    # 0-d int, same dtype as a
a[0:2] = np.ma.masked_all((), dtype=int)  # 0-d int, same dtype as a
print('NUMPY :', repr(a))
# NUMPY: masked_array(data=[--, --, 3, 4],
#              mask=[ True,  True, False, False],
#        fill_value=999999,
#             dtype=int64)
```
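
The dtype mismatch behind the two snippets above can be checked directly. This is only a small sketch using numpy; the variable names are mine, chosen for illustration:

```python
import numpy as np

# np.ma.masked is a float64 masked constant
masked_dtype = np.ma.masked.dtype

# masked_all() defaults to a float dtype, so this 0-d value is float64 too
default_dtype = np.ma.masked_all(()).dtype

# Passing dtype explicitly gives a 0-d masked value that matches an integer array
typed_dtype = np.ma.masked_all((), dtype=int).dtype

print(masked_dtype, default_dtype, typed_dtype)
```
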

This is relevant because dask doesn't actually assign `np.ma.masked` (which is a float); rather, it replaces it with `np.ma.masked_all(())` (https://github.com/dask/dask/blob/2022.05.0/dask/array/core.py#L1816-L1818), when it _should_ replace it with `np.ma.masked_all((), dtype=self.dtype)` instead:

```python
# Dask works here:
a = np.ma.array([1, 2, 3, 4], dtype=int)
a.harden_mask()
x = da.from_array(a)
x[0] = np.ma.masked_all((), dtype=int)
x[0:2] = np.ma.masked_all((), dtype=int)
print('DASK :', repr(x.compute()))
# DASK : masked_array(data=[--, --, 3, 4],
#              mask=[ True,  True, False, False],
#        fill_value=999999)
```
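
To illustrate the proposed substitution in isolation, here is a rough numpy-only sketch. The helper name `as_typed_masked` is hypothetical and this is not dask's actual code; it only mimics the idea of swapping `np.ma.masked` for a 0-d masked value of the target array's dtype before assigning:

```python
import numpy as np

def as_typed_masked(value, dtype):
    """Hypothetical helper: replace np.ma.masked with a 0-d masked
    array of the given dtype; pass any other value through unchanged."""
    if value is np.ma.masked:
        return np.ma.masked_all((), dtype=dtype)
    return value

a = np.ma.array([1, 2, 3, 4], dtype=int)
a.harden_mask()

# Both assignments use a 0-d masked value whose dtype matches the array,
# so re-masking the already masked element does not need a cast.
a[0] = as_typed_masked(np.ma.masked, a.dtype)
a[0:2] = as_typed_masked(np.ma.masked, a.dtype)

print(repr(a))
```

Only the first two elements end up masked, and the unmasked data is left untouched.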

PR to implement this change to follow ...

**Environment**:

- Dask version: 2022.05.0
- Python version: 3.9.5
- Operating System: Linux
- Install method (conda, pip, source): pip

Bug in array assignment when the mask has been hardened #9026