Skip to content

Conversation

@barentsen
Copy link
Contributor

(Fixes #10859.)

Description

This PR proposes to make the Table API more user-friendly by adding Column.value as an alias for Column.data, so that the underlying array can be accessed in a way that is consistent with the Time and Quantity classes, which provide Time.value and Quantity.value properties.

Example

This change would particularly benefit the user experience when dealing with tables which contain a mix of Time, Quantity, Column and/or MaskedColumn fields. This will be common in TimeSeries objects. For example, I frequently make the following mistake at present:

In [1]: from astropy.timeseries import TimeSeries
   ...: from astropy.time import Time
   ...: from astropy.units import Quantity
   ...: from astropy.table import Column
   ...:
   ...: ts = TimeSeries({'time': Time([50000, 50001, 50002], format='mjd'),
   ...:                  'flux': Quantity([1., 2., 3.], unit='electron/s'),
   ...:                  'flag': Column(['clear', 'clear', 'cloudy'])})

In [2]: ts['time'].value
Out[2]: array([50000., 50001., 50002.])

In [3]: ts['flux'].value
Out[3]: array([1., 2., 3.])

In [4]: ts['flag'].value
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-7973ebc06f9d> in <module>
----> 1 ts['flag'].value

AttributeError: 'Column' object has no attribute 'value'

If we merge this PR, the new behavior will be:

In [4]: ts['flag'].value
Out[4]: array(['clear', 'clear', 'cloudy'], dtype='<U6')

Do we really need this?

I am aware that, in an ideal world, users should not have to worry about accessing .value. Indeed it looks like Column objects work seamlessly with ~all numpy/astropy functions these days (thank you whoever made this happen!!). The situation is different for Time however, e.g. np.median(Time) does not work. And, while Quantity objects are seamless, I believe its .value property will always be a popular (if dangerous) shortcut for silencing conversion errors.

/cc @taldcroft @mhvk @pllim @orionlee

@astropy-bot astropy-bot bot added the table label Oct 29, 2020
@pllim pllim requested review from mhvk and taldcroft October 29, 2020 00:28
@pllim pllim added this to the v4.3 milestone Oct 29, 2020
@mhvk
Copy link
Contributor

mhvk commented Oct 29, 2020

@barentsen - I like this, and as I mentioned, think @taldcroft and I actually discussed it. One thing that needs a slight bit of thought it what should happen for MaskedColumn - should .value return a MaskedArray?

Also, added your (implicit) request for np.median(time) to #8610 - we know how to do it, but just have to get around to it!

@barentsen
Copy link
Contributor Author

barentsen commented Oct 29, 2020

what should happen for MaskedColumn - should .value return a MaskedArray?

This is the current behavior in this PR.

I spent 15 minutes trying to identify an example application where np.array works and np.MaskedArray breaks, e.g. using various scipy functions, and failed to find an example. Admittedly, it's equally difficult to find examples where passing a Column instead of an np.array matters these days.

@mhvk
Copy link
Contributor

mhvk commented Oct 29, 2020

Ah, yes, since you just make it an alias for .data, it of course works the same way. I think this is a good approach. Something discussed with @taldcroft would be whether eventually we'd want to deprecate .data for MaskedColumn, so it could become what it is for MaskedArray, i.e., return a non-masked item (Column in this case). But even if so I think that should be a separate issue/PR - this one is clean and simple as is.

So, I'm in favour and think the PR looks good! But @taldcroft definitely should have a look too.

Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @barentsen, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consider adding __array__ method to TimeBase to return value?

4 participants