Separate optimization from statistic by nden · Pull Request #1914 · astropy/astropy

nden · 2013-12-18T17:04:52Z

This PR separates optimization from statistic and allows creating fitters by combining optimizers and statistic functions.

Because of the many changes It's probably best to review this by looking at the entire files (fitting.py, optimizers.py, statistic.py) and not the diffs in the PR.

I am proposing that the name of the fitter is constructed from the names of the optimizer and the statistic, so some existing fitters have been renamed:

SLSQPFitter --> SLSQPLSQFitter
NonLinearLSQFitter --> LevMarLSQFitter

Statistics are implemented as functions (but can be any callable) and are stored currently in ~astropy.modeling.statistic.py. Currently there are two leastsquare and (following @cdeil's example in a different PR) cash. The input to the statistic functions is func(measured_values, evaluated_values, weights).

Optimization methods are stored in ~astropy.modeling.optimizers.py. They are implemented as classes in order to store additional information about the optimization properties but can be any callable.

Dealing with model's constraints has been moved to optimizers. Some methods of the old Fitter class were made functions.

I still need to add tests for the new fitter and possibly write a custom_fitter. Other than that I have pretty much completed what I had in mind and would appreciate comments on overall design, statistic implementation, names and any other improvements.

cdeil · 2013-12-18T17:22:32Z

Cool!

I can have a look in a few days, just one comment about this:

The input to the statistic functions is func(measured_values, evaluated_values, weights)

Is it possible to have these inputs to the fit statistic more flexible?

One of the fit statistics I'd like to implement in astropy that is common in X-ray and gamma-ray astronomy is wstat:
http://heasarc.nasa.gov/xanadu/xspec/manual/XSappendixStatistics.html

It doesn't take "weights" as input, like chi^2 statistics do, but instead a few other quantities.

Here's an example where @registerrier implemented wstat as a user-defined statistic in Sherpa:
https://gist.github.com/cdeil/8026184

It would be nice if the astropy.modeling API for fit statistics were more general so that such hacks are not needed to implement more complex fit statistics like wstat.

If this is already the case, apologies.
I'll take a closer look at the API that you propose here and will try implement wstat as an example in a few days.

Thanks for doing this!

nden · 2013-12-18T18:11:44Z

The only hard requirement is that statistic is callable.
However, if it is in the above form, fitters can use the base class Fitter.errorfunc. Otherwise they have to implement their own errorfunc to match the interface of the statistic.

This can probably be improved but I am putting it out for discussion. It'd be great to have more examples, looking forward to the wstat example.

nden · 2013-12-30T22:47:25Z

The current Travis failure is an example in whatsnew/0.3.rst, failing due to a name change. The proposed name change, if accepted, is going to be in 0.4 (NonLinearLSQFitter -- > LevMarLSQFitter) while the failing example is for 0.3.

Is there a way to skip a test based on the version of astropy?
Or perhaps this should be skipped entirely.

cdeil · 2014-02-19T13:15:00Z

@nden I don't know how to get rid of the travis-ci test failure you mentioned in the what's new example. Maybe @astrofrog knows or you can ask on the astropy-dev list ... this will probably be a common problem as astropy ages...

nden · 2014-02-19T13:17:27Z

@cdeil I just pushed an update which I hope will make it easier to add a wstat statistic. Can you confirm this? I need to work some momre on this but wanted to give you an idea.

cdeil · 2014-02-19T13:17:37Z

I'll try this out now ... maybe @astrofrog and / or @embray, you can have a look, too?
This is a major improvement that has been lying around for two months and it should go into master soon so that it gets some real-world testing before the 0.4 release.

cdeil · 2014-02-19T13:19:20Z

@nden Can you please add an entry to the CHANGES document?
(I think that's what we should do for modeling for 0.4 ... note major changes, but not start adding deprecation warnings)

cdeil · 2014-02-19T13:21:37Z

Add chi_line to __all__?

nden · 2014-02-19T13:26:54Z

It's not quite ready for review. I just want to make sure it's flexible enough to add other statistic functions and optimizers.

I don't know if there's much sense in working on new features until #2049 is resolved since everything in modeling depends on how parameters are handled but @embray is working on it.

cdeil · 2014-02-19T13:31:00Z

The fit statistics functions (leastsquare, cash and chi_line) need docstrings.

I have re-implemented all build-in fit statistics from Sherpa in gammapy.stats.fit_statistics, see e.g. gammapy.stats.fit_statistics.cash for an example.

Some points for discussion about the fit statistics functions:

Should fit statistics return the sum over bins or the per-bin fit statistic? (Sherpa returns both I think)
Do we need the chi_line fit statistic in addition to the leastsquare fit statistic? (see the various chi_square fit statistics in Sherpa / gammapy)
Can we avoid having special fit statistics for 1-dim, 2-dim, ... data? (If yes, there should be no fit statistic with name *_line)

cdeil · 2014-02-19T13:33:56Z

@nden I wasn't aware that #2049 is the bottleneck at the moment ... in this case I'll continue looking at this later.

cdeil · 2014-02-19T13:39:11Z

Just one more comment ... I suggest we rename Fitter.errorfunc to Fitter.fit_statistic or Fitter.objective_function, which are more general terms ... I think errorfunc is least-squares-specific terminology and I find it confusing for the general Fitter API.

nden · 2014-02-19T13:39:24Z

All good questions.
I would prefer not to have special statistics but it's not clear to me that chi_line can be avoided and it's a useful case. I hope someone can prove me wrong.

nden · 2014-02-19T13:51:39Z

I am in favor of objective_function since this is what's generally used. fit_statistic is easily confused with the statistic functions used within the objective function.

cdeil · 2014-02-19T14:02:13Z

I'm not really clear on what chi_line does.
I presume it's one method to take errors on x into account when fitting a 1-dim model to (x, y) data points.

Some reference I'm aware of:
http://adsabs.harvard.edu/abs/2010arXiv1008.4686H
http://root.cern.ch/root/html/TGraph.html#TGraph:Chisquare

Please add a reference and a test case with a few points for this fit statistic.

Also I would suggest to rename it ... to me "line" sounds like straight line, whereas I think this can handle non-linear functions f(x)?

It's a specific fitter which fits a straight line with uncertainties in x and y. I'm not sure how to generalize this (except may be in the case of a plane, hence z_sigma in the code).
I'll check the references you mentioned.

nden · 2014-02-25T18:31:10Z

@cdeil After mulling over this for a while, I decided it's best to limit this PR to only separating statistics from optimizers and add additional statistics in separate PRs.
Outside this PR I tested that it works with other statistic types so I think it's ready for others to try to add their favourite statistic. I am marking it as ready for final review. There are name changes in this PR so I will add entries in CHANGES.rst before merging after everyone had a chance to change the names again.

nden · 2014-03-01T19:32:41Z

This got stuck for some reason. There are so many changes in upstream/master that even rebasing or manually merging upstream locally does not produce diffs that are easy to review. So I prepared a new branch based on the current trunk. I am wondering if I replace the current branch with the new branch with "git branch -M ..." would Travis pick up the new branch? Or is it better to make a separate PR?

cdeil · 2014-03-02T15:30:36Z

This is not an astropy doc rule, but I would suggest to use ~ only for astropy-internal references
and to leave it off for astropy-external references like the links to the numpy and scipy docs here.

IMO this makes it simpler to know what's what when reading the docs and means less surprises when clicking links.

In this case, the text does say Numpy's lstsq function, so maybe it's ok?

Sure, it's OK! This is a very minor point and probably just my personal preference when reading the astropy html docs. If @nden prefers how it's written now she's very welcome to ignore this comment.

taldcroft · 2014-03-03T18:02:55Z

We should not use import * in examples.

nden · 2014-03-03T19:58:10Z

Ah, I see. The goal was to make it possible to write a statistic without having to modify the objective function, defined in the fitter.

Adding fitter.statistic and fitter.optimizer properties will allow changing them after a fitter is initialized, for example:

fitter = fitting.SimplexLSQFitter()
fitter.statistic = cash

This is easy to add.

It may be good also to add a function (custom_fitter) which takes an optimizer and a statistic and returns a fitter. This will take some more work.

nden · 2014-03-03T20:04:42Z

@taldcroft Another option is to pass the statistic to fitter.__call__ . I prefer the properties though. Do you see any reasons this would be better than having properties?

astrofrog · 2014-04-30T16:27:12Z

@nden @taldcroft @cdeil - just to check, what is the current status of this? I think we had planned to re-apply the changes rather than trying to rebase? If so, would it make sense to do this this week so we can merge it (or have it ready for review) before the feature freeze?

nden · 2014-04-30T16:32:19Z

My understanding is that this, as well as #2149 and #1985 should be closed as they are part of the refactoring of modeling that @embray is doing.

embray · 2014-04-30T20:59:04Z

I would like to keep all of these open as reference--all of them address issues that should be tracked regardless of whether any attached implementations are used as-is.

embray · 2014-05-01T16:10:42Z

Actually, were there still any major lingering issues with this? I know there were some minor niggles but they could be addressed later. In the interest of getting some of this functionality sooner rather than later I can rebase this to merge into the current master if people thing that would be better.

cdeil · 2014-05-01T16:18:47Z

I've completely lost track with the many open astropy.modeling issues and pull requests.
Separating optimisation from fit statistic is in my opinion a big step in the right direction.
So 👍 to merge some of them into master now and then re-evaluate if everything works well or some things need to be adapted.

nden · 2014-05-01T17:30:25Z

Something that needs to be added here is a simple way for a user to change the statistic function (See Tom's comments above). I did some work in this direction but it wasn't completed.

embray · 2014-05-01T22:19:33Z

Okay, I've rebased this on master in a local branch of my own and ensured all the tests are passing on all supported Python versions. I think I'd rather go ahead and merge this as-is, and then build any additional enhancements on top of what's in there.

The only thing this is lacking is a changelog entry, and perhaps also warrants a mention in the 0.4 what's new page.

embray · 2014-05-01T22:20:25Z

Unless you don't want to advertise it too broadly if the feature is considered "incomplete". But there are still API changes that deserve mention.

astrofrog · 2014-05-02T11:00:37Z

I think we can work on the 'what's new' after feature freeze, so I'd just add the CHANGES.rst entry for now, and we can worry about the 'what's new' after.

embray · 2014-05-02T15:15:57Z

Okay...I just worry about things slipping through the cracks with that. I wasn't really following if people were writing that or not as is.

astrofrog · 2014-05-02T15:22:13Z

@embray - I just created #2416. Note that if you want to write it now, you are very welcome to! But if there are enough other things to worry about today for the feature freeze, this can be delayed.

embray · 2014-05-02T15:29:07Z

Okay thanks, sounds good. Will have to go through and find out what else needs to be in there. Need to ping other devs too.

astrofrog · 2014-05-02T15:29:55Z

Will ping some devs now on the gitter channel (since a few are hanging out there)

eteq · 2014-05-26T23:26:39Z

@nden @astrofrog @embray - what's the story with this? Can it be rebased and merged for 0.4? Or do we need to push it off?

astrofrog · 2014-06-10T21:19:08Z

Closed by #2441

cdeil reviewed Feb 19, 2014
View reviewed changes

Comment thread astropy/modeling/statistic.py Outdated

cdeil Feb 19, 2014

Copy link
Copy Markdown

Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add chi_line to __all__?

cdeil reviewed Feb 19, 2014
View reviewed changes

This was referenced Feb 19, 2014

allow a modeling parameter to be an array #2049

Closed

WIP: Improve modeling docs how to create user-defined models and fitters #1878

Closed

nden added the ready-for-final-review label Feb 25, 2014

nden added 5 commits February 27, 2014 22:20

made a new PR instead of rebasing

4b60e3e

fix names

806f30f

fix typo

5b39999

PEP 8 and docs changes

3cdc4d2

add example of a user-defined statistic function

7ca89eb

cdeil reviewed Mar 2, 2014
View reviewed changes

taldcroft reviewed Mar 3, 2014
View reviewed changes

Comment thread docs/modeling/new.rst

taldcroft Mar 3, 2014

Copy link
Copy Markdown

Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not use import * in examples.

nden mentioned this pull request Mar 25, 2014

Fix/clarify covariance matrix in NonLinearLSQFitter #2181

Merged

nden mentioned this pull request Apr 7, 2014

Move _model_to_fit_params out of ParametricModel #2264

Merged

nden mentioned this pull request May 1, 2014

Add way to set accuracy in SLSQPFitter #2397

Closed

astrofrog mentioned this pull request May 2, 2014

Write 'What's new' entry for 0.4 #2416

Closed

2 tasks

embray mentioned this pull request May 5, 2014

Separate optimization from statistic redux #2441

Merged

astrofrog closed this Jun 10, 2014

embray removed this from the v0.4.0 milestone Jun 10, 2014

embray removed the Ready-for-final-review label Jan 13, 2015

nden deleted the separate-optimization-from-statistic branch April 30, 2018 12:17

nden restored the separate-optimization-from-statistic branch April 30, 2018 12:17

nden deleted the separate-optimization-from-statistic branch January 22, 2026 10:55

Uh oh!

Uh oh!

Conversation

nden commented Dec 18, 2013

Uh oh!

cdeil commented Dec 18, 2013

Uh oh!

nden commented Dec 18, 2013

Uh oh!

nden commented Dec 30, 2013

Uh oh!

cdeil commented Feb 19, 2014

Uh oh!

nden commented Feb 19, 2014

Uh oh!

cdeil commented Feb 19, 2014

Uh oh!

cdeil commented Feb 19, 2014

Uh oh!

cdeil Feb 19, 2014

Choose a reason for hiding this comment

Uh oh!

nden commented Feb 19, 2014

Uh oh!

cdeil commented Feb 19, 2014

Uh oh!

cdeil commented Feb 19, 2014

Uh oh!

cdeil commented Feb 19, 2014

Uh oh!

nden commented Feb 19, 2014

Uh oh!

nden commented Feb 19, 2014

Uh oh!

cdeil Feb 19, 2014

Choose a reason for hiding this comment

Uh oh!

nden Feb 19, 2014

Choose a reason for hiding this comment

Uh oh!

nden commented Feb 25, 2014

Uh oh!

nden commented Mar 1, 2014

Uh oh!

cdeil Mar 2, 2014

Choose a reason for hiding this comment

Uh oh!

astrofrog Mar 2, 2014

Choose a reason for hiding this comment

Uh oh!

cdeil Mar 2, 2014

Choose a reason for hiding this comment

Uh oh!

taldcroft Mar 3, 2014

Choose a reason for hiding this comment

Uh oh!

nden commented Mar 3, 2014

Uh oh!

nden commented Mar 3, 2014

Uh oh!

astrofrog commented Apr 30, 2014

Uh oh!

nden commented Apr 30, 2014

Uh oh!

embray commented Apr 30, 2014

Uh oh!

embray commented May 1, 2014

Uh oh!

cdeil commented May 1, 2014

Uh oh!

nden commented May 1, 2014

Uh oh!

embray commented May 1, 2014

Uh oh!

embray commented May 1, 2014

Uh oh!

astrofrog commented May 2, 2014

Uh oh!

embray commented May 2, 2014

Uh oh!