Conversation
astropy/utils/misc.py
Outdated
| from .. import units as u | ||
| import numpy as np | ||
| if isinstance(obj, (np.ndarray, np.number)): | ||
| if isinstance(obj, np.number): |
There was a problem hiding this comment.
Is there a reason why you didn't just put Quantity as the first check and retain isinstance(obj, (np.ndarray, np.number)) as-is as the second check?
astropy/utils/misc.py
Outdated
| return json.JSONEncoder.default(self, obj) | ||
|
|
||
|
|
||
| def jsondumps(obj, **kwargs): |
There was a problem hiding this comment.
I would say not. This just obscures the calling code (effectively hiding the astropy JsonCustomEncoder).
| assert json.dumps({1: 2}, | ||
| cls=misc.JsonCustomEncoder) == '{"1": 2}' # default | ||
|
|
||
| assert json.dumps({1: u.m}, cls=misc.JsonCustomEncoder) == '{"1": "m"}' |
There was a problem hiding this comment.
What about composite unit (e.g., fnu) and log unit (e.g., abmag)?
|
Also, this will need a change log once the milestone is set. |
| if isinstance(obj, np.number): | ||
| return obj.tolist() | ||
| elif isinstance(obj, u.Quantity): | ||
| return dict(value=obj.value, unit=obj.unit.to_string()) |
There was a problem hiding this comment.
Is this dict called recursively? If so, there would seem to be no need to use to_string on the unit -- that would automatically get done in the recursion.
astropy/utils/misc.py
Outdated
| return obj.decode() | ||
| elif isinstance(obj, u.UnitBase): | ||
| return obj.name | ||
| elif obj is u.dimensionless_unscaled: |
There was a problem hiding this comment.
Since isinstance(u.dimensionless_unscaled, u.UnitBase) is true, this needs to be rewritten (and tested!). Since it also is not a unique unit, I would write:
elif isinstance(obj, (u.UnitBase, u.FunctionUnitBase)):
if obj == u.dimensionless_unscaled:
obj = 'dimensionless_unit'
else:
obj = obj.to_string()
(note the inclusion of function units here -- we do need #3650... -- as well as the use of to_string instead of name, which I think is more reliable).
|
Beyond the specific comments: is there something that reads in the |
|
As far as Astropy code base in concerned, |
|
@mhvk , for Cone Search, yes, what I write out can be loaded back simply using |
|
Before I dig into the detailed comments (which are most appreciated), let me explain I have a number of applications which generate nested dicts that include astropy Quantity's and similar objects. I have thus far used my own approach to the custom encoding but adding these few lines will achieve it, and may be quite useful to others. ps. I do have code that reads in the dumps output. Happy to add that in too. |
|
@profxj - I think this would be very useful to have more generally (which is indeed why I think it is important to think it through somewhat carefully). E.g., |
|
I actually don't know if it really applies to FITS or "big data" in any way because JSON (like ASCII) is quite inefficient compared to binary formats (FITS, HDF5, or even You just have to check what it does do numpy arrays: It would certainly be handy to have some more support in JSON-encoders and decoders but the really nice step forward would be to use something like HDF5 instead. But that's very subjective and I have to admit I don't even know what this |
|
@MSeifert04 - I mostly felt is was relevant since fits does store meta data in ascii headers. I'm not quite sure what hdf5 does, but at some point we'll need to address that too. Aside: I definitely don't want to hold this simple PR hostage to these discussions, just think it is good to have a few more eyes look over the general direction. |
|
Depending on how involved the decoding stuff is, this might actually need to be moved to somewhere less obscure than |
|
This is timely as I am actively working on lossless serialization of key astropy classes in support of serializing arbitrary Table objects (including mixins) to ECSV. This is done with YAML and includes SkyCoord, Time, np.ndarray, Quantity, Unit, Location (and whatever else might pop up as a dependency). The key thing here for lossless round-tripping is that you need to identify the class of the object. This is conveniently done with a YAML tag, e.g. So I think this will all work and be ready for 1.3 based on my initial proof of concept work. This particular solution using JSON does not allow unique round-tripping. I.e. a |
I wonder whether classes should maybe have a |
If we were to take this route then we might want to stick with load/dump, which are used (with slight variations) by pickle, yaml, and json. But there isn't quite a perfect symmetry. This operation is always meaningful: But in the loading step there is no reason to expect the input to contain only members of one class, and So you always need to load with an astropy-wide (not-class-specific) loader, e.g. This might have a |
|
@taldcroft - regarding |
|
@astrofrog - agreed that we should stick with the recommended Python API as you said. Sorry for the distraction of just making something up. In terms of this PR and the broader context of JSON support, should we impose a requirement that there be complementary and unambiguous encode / decode functionality for any supported object? Unfortunately the current code does not meet that requirement... |
|
@taldcroft , if by current code, you include the stuff I originally added for Cone Search, whatever you do, please make sure it is backward compatible. I don't feel like messing with what is already working for Cone Search. |
|
Sorry for the delay in responding to the extensive comments. I believe that JSON is not a good way to serialize astropy objects in a manner There are times, however, when one wants a simple ASCII file to view and If you decide it isn't worth the bother, that's ok. I just think it makes sense to |
|
@profxj - I agree that the new dump functionality in However, I would propose removing the |
astropy/utils/misc.py
Outdated
| return json.JSONEncoder.default(self, obj) | ||
|
|
||
|
|
||
| def convert_quantity_in_dict(idict): |
There was a problem hiding this comment.
Per comment in main section, remove this function and associated testing.
There was a problem hiding this comment.
Removed. I agree that it is quite specialized.
pllim
left a comment
There was a problem hiding this comment.
I would also recommend squashing all your commits into one. Otherwise, LGTM since remote tests passed. Thanks!
astropy/utils/tests/test_misc.py
Outdated
| assert json.dumps({1: u.m}, cls=misc.JsonCustomEncoder) == '{"1": "m"}' | ||
| # Quantities | ||
| tmp = json.dumps({'a': 5*u.cm}, cls=misc.JsonCustomEncoder) | ||
| assert '"unit": "cm"' in tmp # Order of dict keys is unknown here |
There was a problem hiding this comment.
Nitpicking: Two space between code and start of comment (PEP 8). You might want to use pycodestyle or pyflake or something similar to ensure PEP8 adherence. Not critical but would be nice, especially if someone decides to make code style check stricter on Travis CI in the future.
There was a problem hiding this comment.
Adding the white space.
If someone can teach me how to squash, I'll gladly do so. ;)
There was a problem hiding this comment.
http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html is a good resource for me. But if you have trouble with it, one of the maintainers can also do it for you as long as "allow edits from maintainers" box is checked.
astropy/utils/tests/test_misc.py
Outdated
| assert json.dumps({1: u.m}, cls=misc.JsonCustomEncoder) == '{"1": "m"}' | ||
| # Quantities | ||
| tmp = json.dumps({'a': 5*u.cm}, cls=misc.JsonCustomEncoder) | ||
| assert '"unit": "cm"' in tmp # Order of dict keys is unknown here |
There was a problem hiding this comment.
You should be able to do these tests in a more transparent and complete way, something like:
out = json.loads(json.dumps({'a': 5 * u.cm}, cls=misc.JsonCustomEncoder))
assert out == {'a': {'unit': 'cm', 'value': 5.0}}
Even though the key order might vary, two dicts can be compared like shown. So the test requirement is not that some particular text is emitted but rather that the emitted text represents the JSON encoding of a specific Python data structure.
There was a problem hiding this comment.
Done. Am committing without squashing as I am worried I'd
get that wrong. Need to practice a bit with my own work first..
|
I had to implement JSON encoding/decoding of That way, the decoder unambiguously knows to turn this back into a numpy array. Anyways, I implemented |
|
@gregreen - thanks for the offer! One question I have is if you absolutely need JSON instead of YAML. Serializing the core astropy objects to YAML is available in 1.3: http://astropy.readthedocs.io/en/latest/io/misc.html#module-astropy.io.misc.yaml To include the analogous JSON serialization in the astropy core, it should have a similar level of functionality to the YAML code. For instance a |
|
I don't think this is going to make it to v2. I'm removing the milestone. |
|
Hi humans 👋 - this pull request hasn't had any new commits for approximately 8 months. I plan to close this in a month if the pull request doesn't have any new commits by then. In lieu of a stalled pull request, please close this and open an issue instead to revisit in the future. Maintainers may also choose to add If you believe I commented on this issue incorrectly, please report this here. |
|
I think this discussion is veering towards "let's close this PR and try another implementation". Am I correct? Can we close this? |
|
Hi there @profxj 👋 - thanks for the pull request! I'm just a friendly 🤖 that checks for issues related to the changelog and making sure that this pull request is milestoned and labelled correctly. This is mainly intended for the maintainers, so if you are not a maintainer you can ignore this, and a maintainer will let you know if any action is required on your part 😃. Everything looks good from my point of view! 👍 If there are any issues with this message, please report them here |
|
At one point, the only reason this wasn't ingested was because I didn't It was never intended to have the same functionality as the YAML code |
|
@profxj , did you check "allow edit from maintainer" box here? If not, could you please check it? |
|
@pllim -- just have. |
|
Argh sorry, rebase problem. I'll fix it. |
|
@profxj -- I have squashed, rebased, and added a change log for you. Please review. @taldcroft , @mhvk -- Do you think this PR is acceptable as-is, moving any extra "nice to have" items to a new issue? Please re-review. |
|
All looks fine to me. Thanks. |
|
I'll leave the final call to @taldcroft, but my overall sense is that this is an easy improvement we might as well make; ideally, we'd use something more like the |
|
Agreed with @mhvk that this modest enhancement should be merged. But on the bigger point I would argue against any further move toward expanding JSON serialization for more astropy classes. YAML is working out quite well and so there would need to be a pretty strong argument for making two similar, but not exactly the same, serialization systems in astropy. |
|
Thanks @profxj ! |
|
My pleasure.
And I'll offer my final word as to why you might want JSON:
I have found for large YAML files (~10Mb) filled with mainly
boring (i.e. non-astropy) objects that the JSON equivalent reads from disk
~50x faster.
|
Adds support for several astropy-specific objects including
units and Quantity.
Also includes a simple method to use in place of json.dumps
for those less accustomed to its syntax: astropy.utils.misc.jsondumps
Includes new tests (passing tests on Travis)