Skip to content

Fix auto-expansion of string columns when dealing with masked values#9623

Merged
taldcroft merged 3 commits intoastropy:masterfrom
mhvk:table-auto-expand-strings-correction
Nov 19, 2019
Merged

Fix auto-expansion of string columns when dealing with masked values#9623
taldcroft merged 3 commits intoastropy:masterfrom
mhvk:table-auto-expand-strings-correction

Conversation

@mhvk
Copy link
Contributor

@mhvk mhvk commented Nov 18, 2019

When inserting masked values, #9559 caused any string column to expand to allow "N/A" to be added. But this breaks iers. Here, we ensure that masked values will never cause an expansion of string columns. Writing tests, it also seemed to make sense that to use the mask of whatever is inserted. I think this makes more sense (and seemingly there was no test for this).

fixes #9600

@mhvk mhvk added table Bug Affects-dev PRs and issues that do not impact an existing Astropy release utils.iers labels Nov 18, 2019
@mhvk mhvk requested review from pllim and taldcroft November 18, 2019 20:59
@mhvk mhvk added this to the v4.0 milestone Nov 18, 2019
@astropy-bot astropy-bot bot added the utils label Nov 18, 2019
Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to @taldcroft but I would like to note that one should inspect the numpy-dev job on Travis CI that is allowed to fail.

Thanks for the fix, @mhvk !

@mhvk
Copy link
Contributor Author

mhvk commented Nov 18, 2019

The allowed-to-fail job passed, so all looks OK. And of course both @taldcroft and I should have paid attention to that job in #9559...

@taldcroft
Copy link
Member

Writing tests, it also seemed to make sense that to use the mask of whatever is inserted.

Yikes, that was pretty bad!

@taldcroft
Copy link
Member

I think that the two cases below should give consistent results and that the output dtype should be U3 (str3) in both cases. For MaskedColumn we do support unmasking, so the data underneath any masks should be considered for the auto-expansion:

In [2]: c = MaskedColumn(['a', 'b'])
In [3]: c2 = c.insert(1, ['ccc', 'dd'], mask=[True, False])
In [4]: c2.mask[1] = False
In [5]: c2
Out[5]: 
<MaskedColumn dtype='str3' length=4>
  a
ccc
 dd
  b

In [6]: c2 = c.insert(1, np.ma.MaskedArray(['ccc', 'dd'], mask=[True, False]))
In [8]: c2.mask[1] = False
In [9]: c2
Out[9]: 
<MaskedColumn dtype='str2' length=4>
 a
cc
dd
 b

if values is np.ma.masked:
values = ''
else:
values = values.filled('')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
values = values.filled('')
values = values.data

Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the one suggestion, and with that I would approve this.

@mhvk
Copy link
Contributor Author

mhvk commented Nov 19, 2019

Yes, that is indeed better; only for np.ma.masked do we really not have any information on what the underlying data could be. It also meant the changes could be simplified beyond what you suggested, making this a very small change.

@taldcroft
Copy link
Member

When inserting masked values, #9559 caused any string column to expand to allow "N/A" to be added.

I think this is good, but I can't figure out how to reproduce this problem on master. Hint please?

@mhvk
Copy link
Contributor Author

mhvk commented Nov 19, 2019

@pllim had set the IERS test to xfail just to get master to pass again...

@pllim
Copy link
Member

pllim commented Nov 19, 2019

Yes, sorry. You need to remove this line here, which this PR has already done.

@pytest.mark.xfail(reason='See issue 9600')

@mhvk
Copy link
Contributor Author

mhvk commented Nov 19, 2019

In this PR, that line is already removed - so the fact that tests pass is relevant 😄

Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@taldcroft taldcroft merged commit 5a8132a into astropy:master Nov 19, 2019
@taldcroft
Copy link
Member

When inserting masked values, #9559 caused any string column to expand to allow "N/A" to be added.

BTW I finally understand... (slow!). I think the "N/A" was a red-herring for me, but now I see that the original IERS problem is simply because of directly inserting np.ma.masked into a MaskedArray of strings. And of course this is covered exactly in the new test so all good.

@mhvk mhvk deleted the table-auto-expand-strings-correction branch November 19, 2019 20:40
@mhvk
Copy link
Contributor Author

mhvk commented Nov 19, 2019

No worries - I had exactly the same misunderstanding.

@taldcroft
Copy link
Member

BTW, I'm wondering if this needs a bug fix entry. It actually fixes a real bug in (previous) master unrelated to strings and #9559:

>>> c = MaskedColumn([1, 2])
>>> c.insert(1, np.ma.masked)
<MaskedColumn dtype='int64' length=3>
1
0  <-- Should be -- masked
2

@mhvk
Copy link
Contributor Author

mhvk commented Nov 19, 2019

Yes, you're right. Will create a quick PR with a changelog entry.

mhvk added a commit to mhvk/astropy that referenced this pull request Nov 19, 2019
@mhvk
Copy link
Contributor Author

mhvk commented Nov 19, 2019

See #9635

taldcroft added a commit that referenced this pull request Nov 19, 2019
…on-changelog

Belated changelog entry for #9623 [ci skip]
bsipocz pushed a commit that referenced this pull request Dec 1, 2019
Fix auto-expansion of string columns when dealing with masked values
bsipocz pushed a commit that referenced this pull request Dec 1, 2019
…on-changelog

Belated changelog entry for #9623 [ci skip]
maxnoe pushed a commit to maxnoe/astropy that referenced this pull request Jan 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Affects-dev PRs and issues that do not impact an existing Astropy release Bug table utils.iers utils

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TST: TestIERS_Auto.test_simple fails with ValueError

3 participants