Skip to content

Conversation

@ryanfox
Copy link
Contributor

@ryanfox ryanfox commented Jan 14, 2014

Fix for issue #1942. Per http://www.mail-archive.com/numpy-discussion@scipy.org/msg02446.html, __reduce__ and __setstate__ must completely specify the state of the object to be (un)pickled. I overrode those methods to account for the extra attribute in Longitude. I also added a basic test class for pickling/unpickling coordinates.

@astrofrog
Copy link
Member

It looks like this should be combined with #1947, which doesn't include a fix for Longitude

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one should use super in cases like these, so that if people make a subclass like class myclass(Longitude, AnotherClass), it will continue to work. So, I would suggest replacing this with something along the lines of

object_state = list(super(Longitude, self).__reduce__())
object_state[2] = (object_state[2], self.wrap_angle)

Note that of course now one does not have to include self._unit any more in the pickle, since the super call goes to Quantity, which already includes it (and which should of course also do a super). In __setstate__, one would then have

super_state, own_state = state
super(Longitude, self).__setstate__(super_state)
self.wrap_angle = own_state

@mhvk
Copy link
Contributor

mhvk commented Jan 14, 2014

@ryanfox - as an alternative to the bits of hassle gone through here: It seems to me that all attributes that one might want to pickle are stored in self.__dict__. So, one could also simply pickle everything in there (in Quantity), which may just do the trick generally. However, I have to admit I understand neither __dict__ nor pickle well enough to be sure (in particular, why would this not be done by default...).

Example:

In [1]: from astropy.coordinates import Longitude

In [2]: l = Longitude([10.,11.,12.], 'deg')

In [3]: l
Out[3]: <Longitude [ 10., 11., 12.] deg>

In [4]: l.__dict__
Out[4]: {'_unit': Unit("deg"), '_wrap_angle': <Angle 360.0 deg>}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: probably best to pickle and restore _wrap_angle (i.e., the internal state); I don't think one wants to call a setter while restoring state.

@mhvk
Copy link
Contributor

mhvk commented Jan 14, 2014

p.s. Looking at the pickle documentation [1], I see that indeed by default the contents of __dict__ get pickled. Is it clear why this doesn't work for Quantity/Longitude? Is it perhaps overriden by numpy? I ask mostly because it suggests there might be a simpler solution, one that would require fewer additions of __reduce__ and __setstate__. Do we have an expert pickler? (@mdboom? @eteq?)

[1] http://docs.python.org/2/library/pickle.html#the-pickle-protocol

@ryanfox
Copy link
Contributor Author

ryanfox commented Jan 14, 2014

@mhvk, I originally implemented it with super, but thought it looked nicer without the de-tupling and re-tupling.

Also I would have thought using the setter would be better as it goes through _wrap_internal(). Does that have unintended side effects?

@mhvk
Copy link
Contributor

mhvk commented Jan 14, 2014

@ryanfox - I think one should try to always use super since the code should not presume the superclasses (their order is overridden in the example I gave). E.g., as part of trying to have table columns handle quantity (subclasses), I define new column classes like LongitudeColumn(Longitude, Column); the way this works is that if you then call super inside Longitude, it first tries to pass control to Column -- which may have to do its own work for pickling, in which case another super there would pass off to ndarray.

As for the setter, it is perhaps a matter of opinion, but my sense is that the pickler's goal should only be to be able to store and retrieve a given object, it should not do any sanity checks, especially not when unpickling (one really doesn't want to have to deal with an error in unpickling because the internal state was inconsistent when the object was pickled).

@eteq
Copy link
Member

eteq commented Jan 16, 2014

I agree wtih @mhvk that super is important here - it's one of the astropy code guidelines to use that where possible, for exactly the reasons @mhvk mentions.

@eteq
Copy link
Member

eteq commented Jan 16, 2014

Oh, but it would be nice to hear from @mdboom if there might be a "correct" way to pickle Quantitys?

@ryanfox
Copy link
Contributor Author

ryanfox commented Jan 17, 2014

I reworked my changes to utilize super. It appears numpy implements __reduce__, which wipes out any chance of pickling just __dict__, so we're stuck implementing __reduce__ and __getstate__.

About sanity checks: I'll buy you should get exactly the same object back out that you put in. Unless there's some compelling reason I don't know about, I think the internal state will suffice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The git hub view complains about the lack of a newline here (I have set up emacs to complain about such things; see http://docs.astropy.org/en/latest/development/codeguide_emacs.html)

@mhvk
Copy link
Contributor

mhvk commented Jan 17, 2014

@ryanfox - this looks good. I'm still slightly wondering whether it wouldn't be better to store a dictionary as additional information, perhaps even __dict__ itself. With that, setting back state becomes trivial (indeed, __reduce__ and __setstate__ might have to be defined for Quantity only). On the other hand, the way you have it has much less risk for surprises, so arguably that means it is better.

@mdboom, could you have a look as well?

@mdboom
Copy link
Contributor

mdboom commented Jan 17, 2014

I've made a PR against this one (ryanfox#1) that implements @mhvk's suggestion to just store the whole dict along with the Numpy-specific informaton in reduce. At least locally, the tests still pass. This means subclasses of Quantity don't have to worry about anything special to support pickling, unless they're doing something funky.

@mhvk
Copy link
Contributor

mhvk commented Jan 17, 2014

@mdboom - I was indeed thinking along those lines. One possible issue I see for the future is when I would, e.g., subclass from both Column and Quantity (as in #1839). Since both are ndarray subclasses, I worry slightly the different __reduce__ and __setstate__ may start to conflict, something I think may be less of a problem with the direct implementation by @ryanfox. On the other hand, I like the simplicity of just pickling __dict__. Also, supposing that Column just did exactly the same, I think there should not, in fact, be a problem. So, perhaps just do this for now.

(Alternatively, just as there is a case for having our own wrapped version of MaskedArray -- #1857 -- we could perhaps think of having our own general ndarray subclass, which would handle the pickling of __dict__ for Quantity, Column, and any other subclass we may think of later. But that's definitely outside of the scope here.)

@mdboom
Copy link
Contributor

mdboom commented Jan 17, 2014

@mhvk: That all makes sense, particularly the possibility of having a local astropy ndarray subclass to contain some of this sort of stuff. Maybe we should make a new issue for that?

@mhvk
Copy link
Contributor

mhvk commented Jan 17, 2014

@mdboom - maybe not quite worth its own issue, but I added it to #1857, so that it isn't forgotten.

For now, @ryanfox, if you agree @mdboom's patch is OK, could you merge it into your branch? This should trigger an automatic check by travis, and assuming that passes, this can be merged.

@embray
Copy link
Member

embray commented Jan 21, 2014

I haven't been following this closely so I'm not sure about all the ins and outs. But definitely use super() if you need to call any of the superclasses's methods. But note also if you need to override things like __reduce__ you can always just implement a __reduce__ that doesn't call the superclasses at all if conflicts arise. Though it means you need to reimplement any machinery to make it work.

mhvk added a commit that referenced this pull request Jan 21, 2014
Fix for #1942.  Added pickling test class for coordinates as well
@mhvk mhvk merged commit 5df4a9f into astropy:master Jan 21, 2014
@mhvk
Copy link
Contributor

mhvk commented Jan 21, 2014

@ryanfox - thanks! Looks all OK to me, so I'll merge it.

mhvk added a commit to mhvk/astropy that referenced this pull request Jan 21, 2014
mdboom added a commit that referenced this pull request Jan 21, 2014
CHANGES.rst entry for #1961 (Longitude keeping wrap_angle)
mhvk added a commit that referenced this pull request Feb 12, 2014
Fix for #1942.  Added pickling test class for coordinates as well
mdboom added a commit that referenced this pull request Feb 12, 2014
CHANGES.rst entry for #1961 (Longitude keeping wrap_angle)
mhvk added a commit that referenced this pull request Feb 12, 2014
Fix for #1942.  Added pickling test class for coordinates as well
mdboom added a commit that referenced this pull request Feb 12, 2014
CHANGES.rst entry for #1961 (Longitude keeping wrap_angle)
mhvk added a commit that referenced this pull request Feb 14, 2014
Fix for #1942.  Added pickling test class for coordinates as well
mdboom added a commit that referenced this pull request Feb 14, 2014
CHANGES.rst entry for #1961 (Longitude keeping wrap_angle)
ktchrn pushed a commit to ktchrn/astropy that referenced this pull request Oct 28, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants