Skip to content

Handling of array parameters and model sets#2634

Merged
embray merged 30 commits into
astropy:masterfrom
embray:modeling/model-sets
Jun 26, 2014
Merged

Handling of array parameters and model sets#2634
embray merged 30 commits into
astropy:masterfrom
embray:modeling/model-sets

Conversation

@embray

@embray embray commented Jun 16, 2014

Copy link
Copy Markdown
Member

The purpose of this PR is to address the basic issue raised in #2049, as well as a broader generalization of that issue, in a way that (I hope) is coherent.

Problem background

The basic issue raised in #2049 (and earlier, in different terms, by #1680) is that earlier prototypes of the modeling package were designed to support model parameters that can have array-like values. The example often raised is in what is now the AffineTransformation2D model, which takes a 2x2 matrix as one of its parameters. But it has also been requested that in principle parameter values in general should be allowed to be arrays or scalars. This is complicated somewhat by the fact that all parameter values must be compatible with each other for broadcasting; for example it has been requested that it be possible to do something like:

model = MyModel(param1=np.array([[1,2], [3.,4]]), param2=4.5)

which is fine, since here param2 is a scalar and is likely to broadcast with param1 when they are used together in any Numpy ufunc, for example. However if all parameters can be arrays, then in principle something like this is possible:

model = MyModel(param1=np.array([[1,2], [3.,4]]), param2=np.array([1, 2, 3])

except that the way most models are implemented, that won't work. On some level, if mixing parameter shapes, it's up to the user's discretion to know whether or not what they're doing is making sense. But we still want to make sure sensible error messages are raised if things go wrong.

That's the first of two major issues addressed by this PR. However, there is another factor that complicates matters:

Model sets (formerly, parameter sets)

The above issue was partially addressed by #1680. But that fix punted on the issue of supporting multiple "parameter sets" (i.e. models with param_dim=N where N > 1). The purpose of this feature is to allow a single Model instance to actually represent a whole collection of models of the same type, but with potentially different values for each fittable parameter. The use of this is primarily for fitting--for example it would enable creating a whole "set" of Gaussians, each one of which is fit to a different plane in a data cube in parallel. In many cases this would be more efficient than, for example, looping over each plan in Python in fitting separate Gaussian2D instances to them one by one.

Previously these were referred to as "parameter sets", but I've started referring to them as "model sets" since I think that makes clearer that we're really talking about a collection of related models. I don't think either terminology is more right or wrong than the other, and am open to other suggestions.

Prior to this PR there were two ways to instantiate a model with multiple model sets. The first, when instantiating a model, was to specify a keyword argument param_dim=N where N is an integer specifying the number of models in the set. Each parameter value was then required to be a list, the length of which should match the number of models:

model = MyModel(param1=[1, 2], param2=[3, 4], param_dim=2)

Equivalently, if the value given to a parameter was a Numpy array, the length of the first axis needed to match the value of param_dim:

model = MyModel(param1=np.array([[1, 2, 3], [4, 5, 6]]), param2=[3, 4], param_dim=2)

The above example would create a set of two models, each where the value of param1 is an array of shape (3,).

Unfortunately, my fix to #1680 did not effectively address the above case, where there are multiple models, while still allowing array-like values for individual parameters in each model.

The other way of making a multiple model set, which was employed by most of the models in astropy.modeling.function_models, and which I (and others) considered a misfeature was that if the first parameter of a model (typically an "amplitude" scaling coefficient) was an array, then the length of the first axis of that array automatically determined the size of the model set. That is, in the case of Gaussian1D something like:

g = Gaussian1D(amplitude=[1, 2], mean=[0.1, 0.2], stddev=[0.1, 0.2])

automatically created a set of two Gaussians. It was impossible to allow a single Gaussian whose parameters happened to be array values.

Handling inputs to model sets

Further complicating matters is how to handle inputs to Model.__call__ for instances representing a multiple model set. For example, given

g = Gaussian1D(amplitude=[1, 2], mean=[0.1, 0.2], stddev=[0.1, 0.2])

there are several use-cases for evaluation. The first, and simplest, is evaluation on a scalar value like:

g(1.2)

This case is straightforward--it simply evaluates each model in the set on x=1.2. But we also want to pass in arrays to models, and there are two different things this could represent:

  1. An array whose length is the same as the number of models N, meaning map each element in this array 1-1 with a model in the set.
  2. Simply evaluate all models in the set on the given array.

That is, why might want to read:

g([1.2, 2.3])

as "simultaneously evaluate the model with amplitude=1, mean=0.1, stddev=0.1 at x=1.2 and the model with amplitude=2, mean=0.2, stddev=0.2 at x=2.3".

OR, we might want to read it as "simultaneously evaluate each of x=[1.2, 2.3] across all models in the set, elementwise".

In the current implementation, the latter interpretation is the behavior for 1-D inputs, but the former interpretation is the behavior for ndim >= 3 inputs. For 2-D inputs the number of columns must match the number of models. The current rules are a reasonable compromise for simplicity, and are internally consistent. But from a user perspective they may seem confusing ("how I do need to shape my arguments for this particular model?"), and do not simultaneously allow both of the above use cases for inputs of any size/shape (scalar, or array).

Proposed solution

For a quick overview of the new proposed implementation you can read it right in this PR in this docstring: https://github.com/astropy/astropy/pull/2634/files#diff-a619f3fef9ad65068ecd1790cebc242aR16
(note, this is probably not the final resting place of that documentation; I am still in the process of updating all the documentation, and as such this PR is not quite ready to be merged)

As this is a multi-pronged problem, the solution is also multi-pronged.

Single models

The first aspect of the solution is that for single models (i.e. not model/parameter sets) everything works as before. For the most basic of one model, with scalar parameters one instantiates it like:

>>> m = MyModel(1, 2, 3)

The model can be evaluated on either a scalar or array input. The output will have the same shape as the input:

>>> m(1)  # these are just sample values for an arbitrary model
2
>>> m([[1, 2], [3, 4]])
array([[2, 4],
       [6, 8], [10, 12]])

The same applies when one or more of the parameters have array-like values. The only general requirement is that the parameter values and the inputs are all mutually compatible according to the standard Numpy array broadcasting rules. The output has the shape that results from broadcasting together all the parameters and inputs. There is at least one caveat to this approach that I will discuss below.

Model sets

Instantiating a Model that represents a set of multiple models with their own parameter values always requires stating this explicitly. O'er in #2149, as well as in other discussions it was suggested that the existing keyword argument for specifying sets, param_dim was confusing. In retrospect it makes sense if you think of a model set as an array of parameter values with an axis specifying which model a given parameter belongs to. In that sense, param_dim is giving the size of that model axis or "dimension".

This implementation preserves that concept, but does go ahead and change the name of the corresponding argument to n_models. For example:

m = MyModel([1, 2], [3, 4], [5, 6], n_models=2)

This has the same semantics as in the previous implementation. The only difference is the name, and the fact that it is required, for any model, to instantiate a model set. It also allows the parameter values to arrays (though the values for a single parameter must have the same shape across all models in the set). For example:

m = MyModel(param1=[[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
            param2=[[10, 11], [12, 13]], param3=[14, 15], n_models=2)

As before, the first dimension represents what I'm calling the "model_set_axis". This is the axis of the array that determines which model a parameter values belongs to. Its length, for all the parameters, is 2. In this case the values of param1 are 2x2 arrays, the values of param2 are 1-D arrays, and the values of param3 are scalars.

Another alternative that was discussed and that looked promising was to use a boolean argument like multiple_models. If True, it would just require that all parameter values have the same length in their first dimension, and that the number of models would be determined from that common length. Unfortunately that doesn't work in the case of models that have default values for all their parameters. The main example of this in the current collection of models is the polynomial models, which have a default of zero for all their coefficients. It should be possible to write:

p = SomePolynomialType(degree=3, n_models=2)

without having to manually specify something like [0, 0, 0] for all the default parameter values. We might be able to get away with a different approach, for example, by allowing specification of the shapes of parameter values without actually specifying the values. But for now, this approach introduces the least disruption to the existing assumptions.

Evaluation

Handling of inputs to model sets is where there is some significant departure from the current implementation. The main differences are that for n_models=N with N > 1

  1. By default, all inputs must be an array the first dimension/axis of which is treated as the model_set_axis, mapping input values to models in the set. For example for n_models=2 the input must be in the form[..., ...] where the ... may be either scalars or arrays so long as they broadcast sanely. One exception is made, for convenience, in the case of a single scalar input, like m(0). In this case all models in the set are evaluated at the same input value.
  2. All outputs from the model are arrays, the first dimension/axis of which still corresponds to the model_set_axis, and is of length N. Therefore the inputs and outputs always have the same length in their first dimension (the subsequent dimensions may be different depending on broadcasting rules).

What, then, of wanting to evaluate a single array value across all models in the set? In order to handle this unambiguously, when evaluating the model it is necessary to supply an additional argument model_set_axis=False. For example:

>>> m = MyModel(param1=[1, 2], param2=[3, 4], param3=[5, 6], n_models=2)
>>> x = np.arange(10)
>>> m(x)
Traceback ...
InputShapeError...
>>> m(x, model_set_axis=False)
array([[10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
       [100, 200, 300, 400, 500, 600, 700, 800, 900, 100]])

Trying to evaluate the 2-model set on a 10 element array would fail, because its length does not match the number of models. But supplying model_set_axis=False means "whatever the shape of this array, none of its dimensions map to a model in the model set". Which in turn implies broadcasting the value across all models. The output still has a length equal to n_models for its first dimension. But the subsequent dimensions are the result of broadcasting the input with the parameter values of each model in the set.

The following tests illustrate several more examples of what happens with parameters and inputs of different shapes: https://github.com/astropy/astropy/pull/2634/files#diff-5a6c3ec736446d944478e4ecb9cd19b4R485 But again the key point here to emphasize is that outputs have length of n_models (in their first dimension, when n_models>1).

Arbitrary model_set_axis

The previous example showed how model_set_axis can be disabled for an input, implying that it does not have an axis mapping inputs to models in a set. Alternatively, one may specify an integer for model_set_axis. By default model_set_axis=0--that is, the first dimension is equal to the number of models. But a different axis may be specified for this purpose. A model may also have a different default model_set_axis for its parameters. I won't go into much detail on this for now, as I don't think this is likely to be used very commonly, as it requires careful consideration on the part of the user. But it may be useful for some performance critical cases in order to avoid having to swap the axes of an array around, resulting in a non-contiguous input array.

If that sounds unclear, don't worry, you're not alone. Some examples of this would help, if I have time to write them...

Caveats

  1. I haven't done much performance testing, but it may be much slower, especially in the case of fitting. Fortunately, once this work is merged with the work presented in Model refactoring stage 1: "evaluate" normalization #2580, that slowdown will be all but eliminated. If the regression turns out to be serious and noticeable I can easily backport the key pieces of Model refactoring stage 1: "evaluate" normalization #2580 to 0.4.x.
  2. There are still some details that are glossed over regarding models with multiple inputs and multiple outputs. In particular where the number of outputs is greater than the number of inputs--I think this currently results in a crash. I don't believe there are any cases of that in the wild yet, but it's good to be aware of. [This issue was at least partially addressed by Modeling/handle more out than in #3250]
  3. The Numpy broadcasting rules don't necessarily apply to how all models are evaluated. Fortunately, they do apply to most. The only model currently built into Astropy that had a problem with this is AffineTransformation2D. That model takes two inputs: an "x" coordinate and a "y" coordinate. Or it can take an array of x coordinates and an array of y coordinates. In then uses np.vstack on these inputs to create a 2 x N array that can be dotted with the transformation matrix. For now I punted on the issue by providing a flag model implementations can set that implies "I don't follow the standard rules; let me handle input validation on my own."

I'd welcome better suggestions here. But maybe it will turn out that this is good enough if there are few enough exceptions.

To do on this PR

This PR is not quite ready to be merged. If nothing else the user documentation for models requires further updating. If there are no serious objections to this general approach, we would like to include this as a fix for #2049 in Astropy 0.4. Minor objections/quibbles can either be addressed before merging if small, or in Astropy 0.4.x. This PR doesn't purport to be an end-all be-all solution for all possible cases, but it gets a good ways closer, so long as nobody opposes the general approach.

Future enhancements

One thing I still want to do with this is that make clearer the distinction between a model and a "model set", and give a little more power to the concept. Although I like using the same class for both cases, it should be clearer when a given Model represents a set of models. I also want to implement an indexing feature, allowing something like:

>>> g = Gaussian1D(amplitude=[1, 2], mean=[0, 1], stddev=[0.1, 0.5])
>>> g[0]
<Gaussian1D(amplitude=1, mean=0, stddev=0.1)>
>>> g[1]
<Gaussian1D(amplitude=2, mean=1, stddev=0.5)>

That is, indexing a model set returns a single model instance representing one of the models in that set, which can be evaluated individually, etc. Updates to the parameters in that model would also update the original model set. This enhancement is fairly easy to implement, and I wanted to have it originally as part of this PR. But it was left out for now due to time constraints, and due to the fact that some of the work I'm doing in parallel with this will make it easier to implement.

embray added 16 commits June 16, 2014 13:41
tests working.  Some tests had to be updated to *explicitly* set
model_set_axis=0.  This is because they used model types that previously
set their param_dim implicitly from the shape of the first
parameter--the type of ambiguous implicit action we're trying to get
away from.

As a result I believe there were some tests that intended to test single
models (i.e. param_dim=1) with array-valued parameters, but didn't
actually do this.

As such, array-valued parameters are not yet working quite properly, but
that will come in the next changeset.

Another annoyance with this so far is that in order to use
model_set_axis to specify the set size it is *necessary* to provide a
value for at least one parameter.  This seems arbitrary and awkward for
models that have default values for all their parameters.  It should be
possible to initialize those without providing any parameter values.

For that reason it might be worth introducing an n_models argument with
the same semantics and param_dim (some people in previous discussions
were okay with the name 'n_models').  I'd also consider model_set_size.
model_set_size could work in tandem with model_set_axis and one, the
other, or both could be specified (if both they would obviously have to
agree).
…new system. I'm not sure how much longer I want to keep this property around in its present form at all, but for now I tried to manimize the changes necessary to any other code or tests. One thing that is slightly different now is that *previously* the last axis of the param_sets array was always the same size as the number of param sets (that is, the last axis was the param set axis). Now that is no longer guaranteed--instead it corresponds to whatever axis was specified by model_set_axis. We could roll that axis to the end, but for now I prefer preserving the contiguity of each parameter array).
…ument (with backwards compatibility support for param_dim). This was necessary in cases where no initial values are provided for any parameters (all parameters have default values), and is also simply clearer in other cases. The model_set_axis argument is still needed in addition to this to meet all use cases. However, when n_models > 1, model_set_axis defaults to zero unless otherwise specified. Setting model_set_axis=False disables it entirely.
… without multiple parameter sets) can broadcast with each other in a logical way).
…f all arrays broadcast with one array then they all mutually broadcast. We actually have to check all shapes against each other to ensure broadcastability. Fortunately this can be done fairly easily on the order of the number of params * the max param dimension. In fixing this I think I significantly simplified the logic in _check_param_broadcast so that it's a good deal more coherent.
…s an exception if any of the shapes are incompatible (which is fine since this is always an error situation anyways) and if the check succeeds it will return the full shape of all the arrays broadcast with each other.
…nations working properly. There are still more cases that need to be tested, and the code is unfortunately very complicated and confusing. I think it can probably be made simpler by breaking it into a few different subroutines for different cases, rather than handling all possible cases in the same loop.
…even when n_models > 1 since the intent is fairly unambiguous and they are always broadcastable.
… prose in the docs still needs to be updated, but all the doc tests are working now.
…us model sets. This results in a *little* bit of code duplication that might be avoidable with a little more thought. But for now this made it a little easier to follow and debug. Added a workaround for models that don't follow the broadcasting assumptions made by format_input; so far the one case we have for that is the AffineTransformation model.
…of this should probably be moved from the module docstring to the Sphinx docs, as it is usage documentation. It would make more sense for the module docstring itself to provide some overview of what's in the module, and not necessarily how it works at a high level.
…in modeling.core for now since they give slightly, unsubstantially different outputs on Python 3, and are likely to be moved to the Sphinx docs anyway.
@embray embray added this to the v0.4.0 milestone Jun 16, 2014
@embray embray changed the title Handling of model sets Handling of array parameters model sets Jun 16, 2014
@embray embray changed the title Handling of array parameters model sets Handling of array parameters and model sets Jun 16, 2014
@mdboom

mdboom commented Jun 16, 2014

Copy link
Copy Markdown
Contributor

This looks good, and I certainly see the need for this to properly vectorize things and make broadcasting work well etc.

I have couple of comments about the API, and I apologize that I haven't followed this too closely until now, so I'm not offended if the response is "too late" or "already discussed". There are some things about the API that make sense after the lengthy explanation of how we got here, but which really seem like they are going to be hard to remember how to use...

What if, rather than these special keywords, n_models and model_set_params, ModelSet really was it's own class, or at least looks that way (even if the implementation is largely the same as what's here). So, if you instantiate a model, the assumption is that it's always n_model == 1. To create a model set, you would instantiate a ModelSet with a list of models, so instead of

MyModel([1, 2], [3, 4], [5, 6], n_models=2)

it would be:

ModelSet([MyModel(1, 3, 5), MyModel(2, 4, 6)])

I understand there's a performance hit there (more objects being created and destroyed), but I'm assuming the number of models in a set and the parameters of a model are never very large. An alternative that avoids this might be:

ModelSet(MyModel, [1, 2], [3, 4], [5, 6])

(Note, ModelSet may not be its own class, but merely a factory function, since the resulting object will be, and should be, more like MyModel than anything else).

When evaluating a model set, I think the default behavior you describe above (that the input dimensions must match the number of models in the set) is correct. However, I think the model_set_params=False mode is sufficiently different that maybe it should grow its own method instead. Maybe m.apply(x). If you added a __getitem__ to get a specific model in the model set (as you proposed near the end), that means apply(m, x) would do the same thing as m(x, model_set_params=False) (albeit slower), so m.apply(x) seems like a good name for it.

@embray

embray commented Jun 17, 2014

Copy link
Copy Markdown
Member Author

Thanks for looking at that and commenting. I don't think your API ideas are retreating any old ground.

The idea of using a separate class for model sets definitely had occurred to me. Another possibility was using the same class but a different constructor (such as a classmethod). I think that if a separate class were used it would be a bit tricky though since the resulting object is still tied to a specific model class, and would have to work transparently as a replacement for Model in several cases. I'm definitely not opposed in principle, but for now I wanted to implement these changes in a way that caused the least disruption to existing system (backwards compatibility is even provided for param_dim, though the formatting of the inputs/outputs breaks backwards compatibility somewhat).

The idea of apply() method, on the other hand, I do like the idea of and I'm warming up to it. I should emphasize that there is still a use case for a model_set_axis argument (or something like it; I'm open to a different name) to specify which axis of the input to apply from. I think the case is rare, but possibly necessary to ensure continuity of the input array during evaluation. But the use of False as a special placeholder value (as in model_set_axis=False) to invoke different behavior is definitely a kludge that I'm not entirely happy with.

@cdeil

cdeil commented Jun 17, 2014

Copy link
Copy Markdown
Member

👍 to merging this before 0.4 as a cleanup of the terminology, behaviour and docs for array parameters and model sets ... @embray thanks for putting in so much work into astropy.modeling.

Could you please add one or a few real use cases why model sets are needed at all to the docs? The main (only?) argument for having model sets seems to be performance, but in the absence of a real use case or any timing measurements it's a bit difficult to understand that it's really warranted to complicate the model implementation by making it support model sets.

Simply leaving it to the user to create and use model sets (e.g. as Python lists or by writing their own classes or as an extra class in Astropy as suggested by @mdboom above) is an option that I think hasn't been discussed for astropy.modeling. It's clearly possible, as far as I know model sets don't exist in the most-used modeling / fitting packages by astronomers (Sherpa) or physicists (RooFit).

Looking a bit into the future ... do model sets as implemented here play well with composite models (see #1054) and storing / computations involving the fit covariance matrix (see #1055)? (Again ... is it warranted to have parameter sets at all given that probably only a very small fraction of astropy.modeling users have a need for it?)

@embray Let us know when this is ready for final review (at the moment the docs build on travis-ci fails because some plotting example hasn't been updated).

@astrofrog

Copy link
Copy Markdown
Member

I've read through everything here (except the code itself, which I'll look over now) and I'm +1 on these changes. I personally don't think we need a separate ModelSet class, as I find the API here clear enough.

I don't think we should worry much about performance here - the priority for 0.4 is to get the API cleaned up. Performance can be a 1.0 goal.

@astrofrog

Copy link
Copy Markdown
Member

Regarding the comment by @mdboom about model sets not having too many models, I can think of cases where one might want to fit Gaussians to a spectral cube along the spectral dimension for e.g. 1000x1000 pixels, so 10^6 models would be required in that case. Of course, the fitting itself will be super slow, but just to say that we may want a way to at least represent such a model without requiring the model class be instantiated separately for each pixel.

@perrygreenfield

Copy link
Copy Markdown
Member

Exactly right. Many of our use cases for this involve large numbers of sets. Our most immediate need, for example, is fitting a nonlinear polynomial model to each pixel of a IR detector, so we are talking millions of sets. So the solution proposed by Mike is not going to work very well in that case. It is true that currently nonlinear fitters don't take advantage of that for performance, but I do believe linear ones do take great advantage of it for performance.

@mhvk

mhvk commented Jun 17, 2014

Copy link
Copy Markdown
Contributor

@embray - only read your description so far, but really like it. As commented by @cdeil, use cases would be very useful, and I think the comments already provide at least one, for a nicely astronomical one of getting positions and brightnesses of stars:

  • fit multiple gaussians to stars in one image -> use arrays for the parameters
  • fit a single gaussian to one star in multiple images -> use model set
  • fit multiple guassians to stars in multiple images -> arrays and model set

I think this is very powerful, especially as one can, say, easily fix the FWHM to be the same for all stars by not making that parameter an array. Ideally, one would also be able to fix parameters between images, and this suggests that you should follow numpy broadcasting rules even more stringently (if you do not do so already, I'm not sure), i.e., not to insist that the model_axis has length n_models. Similarly, it might be an idea to simply prepend an axis with dimension 1 for inputs for which the first dimension is not equal to n_models, and let broadcasting rules take over from then on.

p.s. Note that @perrygreenfield's example of fitting non-linearity curves to a whole image suggests that n_models could also be an array. I don't see an obvious reason to disallow that... Indeed, a similar example would be fitting the flux of a single object in a time series of spectral cubes...

@embray

embray commented Jun 17, 2014

Copy link
Copy Markdown
Member Author

Regarding model sets, it seems their use for fitting a large number of models has already been explained. One point I will make, however, is that some of the refactoring I'm doing on the fitting implementation may make it unnecessary later. It may be possible to say, for example, "Here's a 100x1000x1000 image cube--return me 10^6 polynomials, one for each pixel, fitted to the time axis" and simply get back 10x6 polynomial models, though perhaps sharing some underlying memory buffer that was used in the actual fitting.

At the same time, the overhead of that many Python objects might be overkill on its own.

embray added 2 commits June 20, 2014 16:53
…f doctests that are tricky to get passing on all platforms due to floating point formatting and slight numerical differences. Disabling these doesn't impact the test coverage, and I think the cases they cover are handled in unit tests as well.
@embray

embray commented Jun 23, 2014

Copy link
Copy Markdown
Member Author

The tests for this are failing now because some of the tests are requiring scipy, but not being marked as such, which should be fixed.

However the bigger question I have no is, why isn't scipy available? It used to work just fine. Does this have something to do with the switch to using miniconda?

@astrofrog

Copy link
Copy Markdown
Member

@embray - scipy is only installed on the couple of builds that have OPTIONAL_DEPS=true. We deliberately test in most cases without scipy to make sure nothing breaks (since scipy is an optional dependency). This was the same before conda.

@astrofrog

Copy link
Copy Markdown
Member

Looks like the remaining failure is just a minor precision issue.

@embray

embray commented Jun 24, 2014

Copy link
Copy Markdown
Member Author

Ugh. Maybe I'll just try to get #2087 finished and merged first :)

@embray

embray commented Jun 25, 2014

Copy link
Copy Markdown
Member Author

Hah. One "random" failure. Restarting that build...

@astrofrog

Copy link
Copy Markdown
Member

The tests pass! :)

@embray

embray commented Jun 25, 2014

Copy link
Copy Markdown
Member Author

Should still say something in the changelog about this....

@wkerzendorf

Copy link
Copy Markdown
Member

@embray I only now had a chance to have a more thorough look. This is great!!! I really like the updated and cleaned up terminology. I won't use the ModelSet directly, but I think this is a big leap in the right direction!

@embray

embray commented Jun 26, 2014

Copy link
Copy Markdown
Member Author

@wkerzendorf Thanks; I think there's still room for improvement but I agree it's a step in the right direction for clarifying usage.

embray added a commit that referenced this pull request Jun 26, 2014
Handling of array parameters and model sets
@embray embray merged commit 0839230 into astropy:master Jun 26, 2014
@embray embray deleted the modeling/model-sets branch June 26, 2014 16:19
@mhvk

mhvk commented Jun 26, 2014

Copy link
Copy Markdown
Contributor

Great! For tracking of 0.4-issues: Does this resolve #2049 and #2149?

embray added a commit to embray/astropy that referenced this pull request Dec 19, 2014
to the original report of this issue, explaining how I solved the
problem:

As I suspected, this had to do with a concession I had to make way back in this PR:

astropy#2634

This concession is mentioned in the section of the PR text "Caveats" under item
3.  The rules I came up with for broadcasting arrays work for the vast majority
of models, but there are some (particularly ones like AffineTransformation2D
that involve multiple coordinates) where they don't quite work.

As a temporary workaround I added a `standard_broadcasting = False` flag to
these models--this allows them to opt out of the standard assumptions (and, if
they wish, implement their own prescriptions by overriding prepare_inputs and
prepare_outputs).

The trick here was to make sure that a compound model containing
AffineTransformation2D (or others like it) propagate the
 `standard_broadcasting = False` flag.  If one model in a compound model has
this limitation, then it must, for now, be applied to the entire model.

In the near future I hope to do something a little less brute-force here.  I
envision models being able to provide some sort of broadcast_hints data
structure, which I think can be fairly simple.  Most models wouldn't need to
use this (the defaults are fine).  But this could explain to the code exactly
what inputs and parameters are used with each other, and how they should fit
with each other.  (Obviously reimplementing prepare_inputs/outputs could do
this too, but I think the realistic cases are still limited enough that this
can be done in a declarative manner and avoid having to reimplement lots of
code over and over again).

In the meantime, the fix I will add shortly should be good, I think, for most
cases we care about?
embray added a commit to embray/astropy that referenced this pull request Dec 23, 2014
to the original report of this issue, explaining how I solved the
problem:

As I suspected, this had to do with a concession I had to make way back in this PR:

astropy#2634

This concession is mentioned in the section of the PR text "Caveats" under item
3.  The rules I came up with for broadcasting arrays work for the vast majority
of models, but there are some (particularly ones like AffineTransformation2D
that involve multiple coordinates) where they don't quite work.

As a temporary workaround I added a `standard_broadcasting = False` flag to
these models--this allows them to opt out of the standard assumptions (and, if
they wish, implement their own prescriptions by overriding prepare_inputs and
prepare_outputs).

The trick here was to make sure that a compound model containing
AffineTransformation2D (or others like it) propagate the
 `standard_broadcasting = False` flag.  If one model in a compound model has
this limitation, then it must, for now, be applied to the entire model.

In the near future I hope to do something a little less brute-force here.  I
envision models being able to provide some sort of broadcast_hints data
structure, which I think can be fairly simple.  Most models wouldn't need to
use this (the defaults are fine).  But this could explain to the code exactly
what inputs and parameters are used with each other, and how they should fit
with each other.  (Obviously reimplementing prepare_inputs/outputs could do
this too, but I think the realistic cases are still limited enough that this
can be done in a declarative manner and avoid having to reimplement lots of
code over and over again).

In the meantime, the fix I will add shortly should be good, I think, for most
cases we care about?
embray added a commit to embray/astropy that referenced this pull request Dec 30, 2014
to the original report of this issue, explaining how I solved the
problem:

As I suspected, this had to do with a concession I had to make way back in this PR:

astropy#2634

This concession is mentioned in the section of the PR text "Caveats" under item
3.  The rules I came up with for broadcasting arrays work for the vast majority
of models, but there are some (particularly ones like AffineTransformation2D
that involve multiple coordinates) where they don't quite work.

As a temporary workaround I added a `standard_broadcasting = False` flag to
these models--this allows them to opt out of the standard assumptions (and, if
they wish, implement their own prescriptions by overriding prepare_inputs and
prepare_outputs).

The trick here was to make sure that a compound model containing
AffineTransformation2D (or others like it) propagate the
 `standard_broadcasting = False` flag.  If one model in a compound model has
this limitation, then it must, for now, be applied to the entire model.

In the near future I hope to do something a little less brute-force here.  I
envision models being able to provide some sort of broadcast_hints data
structure, which I think can be fairly simple.  Most models wouldn't need to
use this (the defaults are fine).  But this could explain to the code exactly
what inputs and parameters are used with each other, and how they should fit
with each other.  (Obviously reimplementing prepare_inputs/outputs could do
this too, but I think the realistic cases are still limited enough that this
can be done in a declarative manner and avoid having to reimplement lots of
code over and over again).

In the meantime, the fix I will add shortly should be good, I think, for most
cases we care about?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants