Refactoring NDData#2905
Conversation
|
@mwcraig |
There was a problem hiding this comment.
At some point I think we're going to have to reconsider how units in NDData are handled or if it's even appropriate for an NDData object to have a unit. If NDData is going to be a common data container object it's going to also have to support cases where different axes represented by the array dimensions have different units. The obvious example is, say, long slit spectra where one axis will have some dimensions either in pixels or in length (depending on how the WCS is used) while another axis will have some spectral unit (possibly also length in the case of wavelength but possibly also frequency, etc.).
There was a problem hiding this comment.
I'll add--of course an equivalency could be possible too, but I don't necessarily think it should be that every axis have equivalent units...
There was a problem hiding this comment.
@embray - i think that unit here refers to the unit of the data, not of the axes, so I think it should be appropriate for most situations to have a single unit.
There was a problem hiding this comment.
That is correct -- it is like the BUNIT from a FITS header.
There was a problem hiding this comment.
That's a little confusing, and maybe not useful for "generic" data. I don't know though.
There was a problem hiding this comment.
I think it's reasonable to assume that array data might have no unit or a single unit, right? What would it mean for the data to have more than one unit? I know we are trying to make sure NDData is generic, but I would say that saying that the values can have a unit is acceptable. Of course one could have a 3-d cube where each slice is a different thing that has different units, but in that case I would argue that a single NDData is not the right container. We don't want to be that generic :)
There was a problem hiding this comment.
Agreed. Units should be consistent for the whole array.
|
@mwcraig - regarding FlagCollection - I guess it depends if we want to assume anything about flags now we are moving to a more generic data container. It sounded like we agreed that flags doesn't even have to be a property in the base class? |
|
Basically I think we had agreed that we should strip everything out of NDData apart from the property definitions, correct? That is, there should basically be no functionality in the core NDData (apart from using data.mask and data.unit if available). |
|
That was my recollection too -- I'll update tonight or tomorrow. |
|
Other than eventually supporting slicing, I would agree. |
|
Finally got around to pulling out I've left in slicing and the methods (like An alternative is to moving slicing and those properties into |
|
@mwcraig - are you also planning on simplifying the content of the uncertainty and mask properties? I think we had decided that we are going to strip these down to something like: so that Then for slicing I would simply make it that the For clarity, it might be nice to move the arithmetic classes to a separate |
|
@astrofrog -- For The bits about The bulk of Looking over the code that handles that I think it could be simplified and the restriction that the I'll update PR this evening, then we can continue the discussion from there. |
|
@mwcraig - thanks for the comments. Just to be clear, what I'm suggesting here and in the draft APE that you saw is that we completely remove all uncertainty code from the base class. Of course, we could have a mix-in class that enables uncertainty stuff but in the default NDData class, this would not be present (according to the APE). We should discuss these points as part of the APE since the APE makes the implementation path very clear for the base class. |
|
@mwcraig - regarding the shape of the mask, again it depends. For our SpectralCube package, we actually are not using arrays for the mask but lazily evaluated functions - for example we can do |
|
@astrofrog -- that makes sense...I keep forgetting that in the new arrangement I'll include a subclass (maybe in |
|
I don't know that NDData by itself couldn't be used directly. I think it can and should still be useful as a container for arrays and metadata (and uncertainties? I forget if that is still on). It's just that it would be more useful as a base class for more sepcific data. |
|
@mwcraig - now that the APE has been accepted, can you finalize this PR to implement the decisions in the APE? On my side I'll finish up the implementation of the decorator stuff. |
|
@astrofrog - Working on it, will see how far I get today. |
|
@astrofrog -- mostly done, still need doc updates and an I/O mixin (though I suppose that could come later) |
|
Any suggestions for this? [Am currently thinking "wait til later" is the best approach for now] In [4]: class Foo(Quantity, NDDataBase):
...: pass
...:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-4-bf917ed358f2> in <module>()
----> 1 class Foo(Quantity, NDDataBase):
2 pass
3
TypeError: Error when calling the metaclass bases
metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases |
|
I think you need to define a metaclass for |
|
@mwcraig - I can work on the IO mix-in on Friday since I'm familiar with that part. I'll review the rest of what's here tomorrow. Thanks! |
There was a problem hiding this comment.
The try...except is a little confusing. Why not simply do:
if not hasattr(value, 'uncertainty_type') or not isinstance(value.uncertainty_type, six.string_types):
raise TypeError(...)
?
There was a problem hiding this comment.
Why not simply do:
Don't remember...done in any event.
|
@mwcraig - once that is fixed, an you also rename this issue to remove the WIP? |
|
@astrofrog -- fyi, am making progress on I/O mixin, will push today. |
|
@astrofrog -- my attempt at an i/o mixin has been pushed. Some caveats/obvious limitations:
Longer term, I think there is some refactoring to be done to reduce duplicate code between this and |
|
@mwcraig - thanks, I will do a PR to your branch to bring the I/O stuff in line with what Table does. |
|
@mwcraig - if you disagree with the changes in my pull request, then in order to make sure we merge this PR today, please remove the latest IO changes and instead simply add a very simple class NDIOMixin(object):
"""
Mixin class to connect NDData to the astropy input/output registry.
This mixin adds two methods to its subclasses, ``read`` and ``write``.
"""
@classmethod
def read(cls, *args, **kwargs):
"""
Read and parse gridded N-dimensional data and return as an
NDData-derived object.
This function provides the NDDataBase interface to the astropy unified
I/O layer. This allows easily reading a file in the supported data
formats.
"""
return io_registry.read(cls, *args, **kwargs)
def write(self, *args, **kwargs):
"""
Write a gridded N-dimensional data object out in specified format.
This function provides the NDDataBase interface to the astropy unified
I/O layer. This allows easily writing a file in the supported data
formats.
"""
io_registry.write(self, *args, **kwargs)and nothing else (no auto-registration). For Astropy 1.0, affiliated packages will then be able to define and register their own readers/writers for their NDData subclasses if needed. We can then postpone to later the decision of which built-in readers/writers to provide in the core, what their behavior should be, and whether |
|
Sorry for the multiple messages - actually I would suggest that regardless of whether you agree with mwcraig#1 you follow the suggestion in my previous comment and remove the FITS reader/writer to keep this PR simple. Then later today I can open a PR with the changes I am suggesting but we can decide to defer to 1.1 if needed. |
|
I agree with @astrofrog that if at all possible it would be best to bring in any FITS-specific stuff in a separate change. |
There was a problem hiding this comment.
This stuff does look a bit bungled (as you suspected), but I'll have to take a closer look later.
There was a problem hiding this comment.
Nevermind, looks like @astrofrog already addressed this.
|
@astrofrog -- just pushed the minimal version of the mixin you proposed, with no readers/writers registered. Agree with you and @embray that it makes sense to punt on this for now. I'm fine with the rearrange in mcraig#1 but didn't merge because we went simple here. Please ping me if you do a PR and I''ll take a closer look. |
|
@mwcraig - thanks for making the change back to a simpler I/O class. If anyone else has any comments on this, please review this now, otherwise I will merge this later this evening (I will do a final review now). |
There was a problem hiding this comment.
I think gridded should be removed here? (I think when writing the APE we agreed that some day this may be used as a parent class for ungridded data too?)
|
These are my only comments - other than that this is good to go. Thanks for your work on this @mwcraig! |
|
My pleasure; the comments are addressed. |
|
@mwcraig - thanks! I will merge this tomorrow morning if there are no other objections. This implements APE7 so the API has already been discussed, and the implementation looks good to me. |
This PR is a companion to #2855 and also came out of the NDData telecon. The goal is to simplify the base NDData class so that the base class becomes a simple container for data interchange.
The arithmetic methods on the current
NDDataare factored out into anNDarithmeticmixin class. The classNDDataArithmeticis include as a convenience and works as a replacement for the currentNDData(I've check that the current ccdproc tests pass if I make make one change to the code: subclass fromNDDataArithmeticinstead ofNDData).I've also moved the
NDDataarithmetic tests into a separate file.edit 9/30/2014: the
flagsproperty is also removed fromNDData.edit 10/2/2014: Uncertainty and mask implementation moved from
NDDatatoNDDataArithmeticand removed automatic conversion ofNDDatato a numpy array. Updated task list.A rendered version of the documentation changes is here.
Still not done:
FlagCollectionshape (should this be done?)Attempt to makemask,unitpoint to themaskof the data (if it has one) instead of separating the mask from the data, and the same for the unit.@Cadair -- please check to see whether this revision has
wcsworking the way you wanted.cc @astrofrog @cdeil @perrygreenfield @crawfordsm @larrybradley @adamginsburg @bsipocz