Skip to content

Provide np.broadcast_arrays replacement that allows subclasses#2327

Merged
mhvk merged 8 commits intoastropy:masterfrom
mhvk:util-numpy-broadcast-arrays
Nov 15, 2014
Merged

Provide np.broadcast_arrays replacement that allows subclasses#2327
mhvk merged 8 commits intoastropy:masterfrom
mhvk:util-numpy-broadcast-arrays

Conversation

@mhvk
Copy link
Copy Markdown
Contributor

@mhvk mhvk commented Apr 16, 2014

In the work on the representation classes for coordinates [1], it was realised that np.broadcast_arrays casts all inputs as ndarray, leading to again questions whether we should provide "improved" versions of numpy routines (see #1274). In this particular case, there seemed to be no reason that subclasses would be a problem, so I submitted an enhancement PR to numpy [2]. Since even if merged that will not be available in older versions of numpy, I also made an astropy version, as part of a new astropy.utils.numpy package. I tried to implement a type of organisation that would only override what is not correct, and would also allow us to easily test whether the overrides were still necessary. Probably most needed are comments on this framework!

[1] eteq#10
[2] numpy/numpy#4622

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may have to include the full Numpy license up here, or in licenses/NUMPY_LICENSE.rst

@astrofrog
Copy link
Copy Markdown
Member

To play the devil's advocate - do we really need this here, or should we just implement a unit-aware broadcast_arrays like the one I originally wrote in eteq#10? I personally think that it would make more sense to just include the simple one I wrote in astropy/units/numpy_funtions.py (or similar), otherwise backporting numpy code like this will get a little out of hand (and it's not clear that we need it for any other classes at the moment).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of doing this check by version number, we could just check the condition we're trying to fix. For example:

if not isinstance(np.broadcast_arrays([] * u.m), u.Quantity):

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mdboom - yes, like that. Maybe a bit complicated for MaskedArray, but should be do-able. And then of course we can check for the lower version of numpy that we still support whether a given fix is still required.

@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Apr 17, 2014

@astrofrog - I'm thinking slightly ahead, to MaskedArray (#1857; hope to upload the work I've done soonish). This would of course be required outside of units. But am definitely open to suggestions.

@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Apr 18, 2014

@astrofrog - added the numpy license.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be an 'else' clause with the numpy imports?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function only tests whether the numpy function, which is already imported here, is OK. If it is not, then the subsequent if-statement, which calls this function, will cause broadcast_arrays to be overwritten. So if one imports from this module, net either one gets a correct numpy function, or one patched to be OK.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see now - sorry for being slow!

@astrofrog astrofrog modified the milestones: Future, v0.4.0 Apr 18, 2014
@astrofrog
Copy link
Copy Markdown
Member

@mhvk - ok, that makes sense. I think we should wait until the Numpy issue is resolved before we consider merging this though.

@eteq - what is your feeling on this? Should we include it?

@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Apr 18, 2014

As for the more general procedure: I think we should certainly only have items in numpy.utils for which a patch has been submitted. I'm fine with waiting for acceptance in this case, though am not sure we will always be able to (I'm thinking of my much more drastic MaskedArray work, which is at four numpy PRs right now -- and counting...).

Even more generally: is this layout good? My goal would be both to have a fairly obvious procedure of how to go about adding a patched numpy function (use same file & location, must have PR, define function with PR name that tests whether the numpy version is patched, etc.), and also to make sure that it will be relatively straightforward to test whether a patch is still necessary (this is why I export the PR to the top; the idea is that one can check with the lowest supported version of numpy whether a given PR is still False; if it is True, the patch can be removed).

I think the above will make it easy to use within astropy; indeed, a follow-up PR would be to use it to remove the fair number of cases where tests for numpy versions are made.

A bit less clear is how to use it outside of it. In principle, in __init__.py, one could do a from numpy import *, and then for code that wants to, one could have import astropy.utils.numpy as np and be done with it. Probably, this aspect is worth bringing up on astropy-dev (the user-facing side maybe is even worth an APE). Though first let me know what you think.

@eteq
Copy link
Copy Markdown
Member

eteq commented Apr 21, 2014

@mhvk - My first reaction is that I think it makes more sense to sequester the actual interface pieces in astropy.utils.compat if possible, because that's exactly what that sub-package is meant for: features that are in existing or future versions of dependencies (or Python itself), but that we don't want to wait for/require a new enough version. So maybe move this to astropy/utils/compat/numpy?

That attitude assumes all these features will be "compatibility", though, instead of "new feature numpy will never do". In the latter case, I'd advocate it living in an astropy-appropriate place, though: e.g., Quantity-related utilities should be in astropy.units. Or do you foresee upcoming similar pieces that don't fit into either of those categories?

Regardless, I agree with @astrofrog that waiting to hear from numpy on this first makes sense.

@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Apr 21, 2014

@eteq - Aha, didn't know about that! Yes, I think it makes a lot of sense to move it to utils.compat, so I moved it there. I'm fine with only putting things there that are accepted by numpy, as we should avoid at all cost to effectively start to fork it.

One question, perhaps mostly to @taldcroft, there also is a numpycompat.py in astropy.utils.compat. Is this actually used? It looks like it might be intended for MaskedColumn. The method, though, is different from mine, in that it patches the numpy version one is running. This seems riskier. It hopefully will no longer be necessary, however, with further work on getting a patched MaskedArray here (#1857).

@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Apr 21, 2014

@eteq, others - would you have a suggestion for where to document the procedure for including numpy compatibility code? I saw that we specifically exclude utils.compat from the normal documentation. But perhaps more relevant would be somewhere in the developer's instructions?

p.s. travis failure due to samp...

@eteq
Copy link
Copy Markdown
Member

eteq commented Apr 21, 2014

@mhvk - I'm fine with just adding a subsection in the astropy.utils docs to discuss that - it's already sort of a grab-bag that's not quite as polished as the other pages.

(restarted Travis, hopefully that'll do the trick)

@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Apr 23, 2014

@eteq, others - made a first cut at the documentation; please check whether this makes sense!

@eteq
Copy link
Copy Markdown
Member

eteq commented Apr 23, 2014

Seems fine to me - @mdboom and @astrofrog ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does PR4622 need to be exposed at the top level here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been going to and fro about this. Essentially, I would like to make it easy to tell whether a patch is still needed. But probably, it is just as well done by testing whether even for the lowest support numpy version broadcast_arrays is np.broadcast_arrays.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah -- I think it just clutters the API here. A user that really wants to know can import stride_tricks directly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I removed it, and updated the other files + docs to reflect that.

@embray
Copy link
Copy Markdown
Member

embray commented Aug 13, 2014

Any reason at this point not to move forward on this?

@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Sep 26, 2014

@embray - I had not done anything about the broadcast_arrays functionality since it was pending acceptance by numpy. That has now happened, so I rebased this using the final merged version, and also ensured it actually gets used in coordinates.representation. Please do check whether you think the changelog entry is OK. Also, I wonder if it is possible in some way to ensure a list of patched functions ends up in the documentation.

Indeed, more generally, would it make sense to import numpy somewhere below astropy, and overlay the changed versions from astropy.utils.compat.numpy? But perhaps that is for a future PR.

@mhvk mhvk added Affects-dev PRs and issues that do not impact an existing Astropy release Ready-for-final-review labels Sep 26, 2014
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this fix has been included in an actual Numpy release yet, but regardless can we just make this a version check instead?

Considering that we already provide a full replacement here I see no reason to add this additional complication to support the minority of users who would be using a dev build of Numpy.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a followup, then the as_strided and broadcast_arrays defined below don't need to be defined under an if statement block. Just define them normally and only import them into the numpy.compat namespace if the version check fails (otherwise just import them straight from numpy)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had a version check, but @mdboom suggested I replace it... (see his first comment; not sure how to link to it).
One thing I do like about it is that it helps make clear why the feature is there in the first place.

However, on your second point, it would seem to make more sense to import broadcast_arrays and as_strided only when needed, i.e., something like:

if PR4622():
    from numpy.lib.stride_tricks import as_strided, broadcast_arrays
else:
    def as_strided(...
        ...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sort of disagree there. It's much easier to just have a version check along with a comment explaining why it's there. But it's not too important--I won't insist you change it back now that you've already gone to the trouble :) Though I do think making the import conditional is the way to go.

@mhvk mhvk force-pushed the util-numpy-broadcast-arrays branch 3 times, most recently from 64cddf9 to 5348946 Compare October 28, 2014 19:06
@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Oct 28, 2014

@embray - it took longer than hoped to get back to this. I ended up leaving what is imported conditional on the function rather than using a version check, mostly because I felt the function should exist anyway, and if it does, one might as well use it... I did rename it to indicate the numpy version where it will be available, though (since the PR# can be found in the doc string anyway).

Assuming travis still passes, I think with this it is ready to go in...

@mhvk mhvk force-pushed the util-numpy-broadcast-arrays branch from 5348946 to 1e5c9d5 Compare November 6, 2014 14:00
@mhvk mhvk mentioned this pull request Nov 6, 2014
@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Nov 6, 2014

@embray - do you think this is ready to go in?

@embray embray mentioned this pull request Nov 7, 2014
@mhvk
Copy link
Copy Markdown
Contributor Author

mhvk commented Nov 12, 2014

@embray - I'd like to merge this. Would you like to have another look before I do so?

@embray
Copy link
Copy Markdown
Member

embray commented Nov 12, 2014

Not really; seems legit. :shipit:

mhvk added a commit that referenced this pull request Nov 15, 2014
Provide np.broadcast_arrays replacement that allows subclasses
@mhvk mhvk merged commit 9598a45 into astropy:master Nov 15, 2014
@mhvk mhvk deleted the util-numpy-broadcast-arrays branch November 15, 2014 17:13
@mhvk mhvk mentioned this pull request Jan 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Affects-dev PRs and issues that do not impact an existing Astropy release units utils

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants