Skip to content

ENH: Adding Transit Periodogram to astropy.stats#7391

Merged
bsipocz merged 13 commits intoastropy:masterfrom
dfm:transit-periodogram
Aug 13, 2018
Merged

ENH: Adding Transit Periodogram to astropy.stats#7391
bsipocz merged 13 commits intoastropy:masterfrom
dfm:transit-periodogram

Conversation

@dfm
Copy link
Contributor

@dfm dfm commented Apr 18, 2018

Hi all,

This pull request adds a new feature that I've been working on for a while now with contributions from @mirca and @joeldhartman. This feature is another popular astronomy-specific periodogram like Lomb-Scargle, but this one is used to find transiting exoplanets and eclipsing binary systems by using a top-hat basis instead of sines. For historical reasons, this algorithm is often referred to as "box least squares", but I think that the more descriptive "transit periodogram" name is better.

This algorithm was proposed by Kovács et al. (2002) and the method has since become the standard method of detecting transiting planets. In many cases, this is achieved by wrapping the Fortran code released by those authors (many people use f2py bindings that I wrote almost 5 years ago). With K2 data continuing to roll in and the launch of TESS (hopefully today! 🤞) it seems like it would be timely to have an implementation of this algorithm within AstroPy.

With this in mind, my collaborators and I have written an efficient implementation of this algorithm in C with Cython bindings that expose an interface that will be familiar for anyone who uses the LombScargle class in astropy.stats.

A build of the documentation for this pull request can be found here. I have also written up a tutorial that uses this code to detect planets using data from NASA's K2 Mission that can be found here. I expect that something like this would eventually be submitted to astropy-tutorials, but I thought that it might be useful to have it here for now.

I have discussed this pull request with several AstroPy maintainers (e.g. @eteq and @adrn) and they endorsed the idea of submitting this to core, but if the community would prefer to keep this as a separate package, I'm certainly open to that idea. Either way, I would definitely appreciate any feedback.

Thanks!

@astropy-bot
Copy link

astropy-bot bot commented Apr 18, 2018

Hi there @dfm 👋 - thanks for the pull request! I'm just a friendly 🤖 that checks for issues related to the changelog and making sure that this pull request is milestoned and labeled correctly. This is mainly intended for the maintainers, so if you are not a maintainer you can ignore this, and a maintainer will let you know if any action is required on your part 😃.

Everything looks good from my point of view! 👍

If there are any issues with this message, please report them here.

@pllim pllim added this to the v3.1 milestone Apr 18, 2018
@pllim
Copy link
Member

pllim commented Apr 18, 2018

Probably need a rebase to get rid of astropy-helpers diff. Need a change log. Might need some squashing too.

@pllim
Copy link
Member

pllim commented Apr 18, 2018

p.s. Usually tests are in their own tests sub-directory, separated from code.

@dfm
Copy link
Contributor Author

dfm commented Apr 18, 2018

@pllim: thanks for the suggestions! I hear you on the squashing :-) I figured that I would squash after getting feedback, but I'm happy to do it now if you'd prefer. I've moved the tests, added a changelog entry, and (hopefully) fixed the compilation bug with old C compilers.

I'm not actually sure how to fix the astropy helpers issue. I tried rebasing and it says that everything is up to date. Any tips? Thanks!

@pllim
Copy link
Member

pllim commented Apr 18, 2018

@dfm did you do a git submodule update?

@pllim
Copy link
Member

pllim commented Apr 18, 2018

Also saw this error in 32-bit build:

astropy/stats/transit_periodogram/transit_periodogram.c: In function ‘run_transit_periodogram’:
astropy/stats/transit_periodogram/transit_periodogram.c:169:30: error: ‘INFINITY’ undeclared (first use in this function)
         best_objective[p] = -INFINITY;
                              ^
astropy/stats/transit_periodogram/transit_periodogram.c:169:30: note: each undeclared identifier is reported only once for each function it appears in
error: command 'gcc' failed with exit status 1

@dfm
Copy link
Contributor Author

dfm commented Apr 18, 2018

I hadn't done that, but I did previously rebase. When I do a git submodule update now, it doesn't seem to do anything... submodules have always confused me :-\ sorry!

@pllim
Copy link
Member

pllim commented Apr 18, 2018

I suspect whatever you rebased against was also outdated. Try this...

If you don't have a remote pointing to astropy/astropy already, do this (if SSH) or otherwise change the remote URL to use HTTPS:

git remote add upstream git@github.com:astropy/astropy.git 

Then

git fetch upstream master
git rebase upstream/master
git submodule update
git status

I did something similar with your branch here and I get something like:

$ git status
On branch pr7391
Your branch and 'dfm/transit-periodogram' have diverged,
and have 59 and 51 different commits each, respectively.

Theoretically I could force push to your PR here, but it would mess up the author listing, so I will refrain.

@dfm
Copy link
Contributor Author

dfm commented Apr 18, 2018

I tried running those commands and still no luck - very strange! There's a doctest failure on the 32bit build that I'm also going to need to track down, but I'm done for the day. I'll check back tomorrow!

Thanks again for your help.

@crawfordsm
Copy link
Member

Thanks @dfm for the PR! I think it works well with some of the other functionality in the stats package and also would be something useful for the community.

I will mention that this is a large and complex PR so it might take awhile to provide a full review of it, but looking forward to working with you on this.

@bsipocz
Copy link
Member

bsipocz commented Apr 19, 2018

@dfm - This is an awesome addition, and I'm sure a large number of people will appreciate to have this readily available in astropy.

However, I would argue that keeping the name BLS is preferable rather than a more generic term. It's a widely known name for it, and makes it unambiguous which algorithm is implemented leaving room for adding other ones later (though probably not to astropy core but a separate package).

@crawfordsm - I'm happy to help out in the review on the python side, though that it may indeed take some time.

@dfm
Copy link
Contributor Author

dfm commented Apr 19, 2018

@crawfordsm @bsipocz Thanks! I'm looking forward to working with y'all to get this into good shape.

Re: naming - I've always disliked the name BLS because I think it's a bit misleading, but I'm willing to defer and you're probably right that sticking with the historical name is the right way to go... it's also fewer characters :-)

@bsipocz
Copy link
Member

bsipocz commented Apr 19, 2018

@dfm @pllim - getting rid of the helpers update should be easy. You do an interactive rebase, and remove the commit "Rebase ab0b750" from the list in the editor as that seem to be the only one touching the helpers.

@dfm
Copy link
Contributor Author

dfm commented Apr 19, 2018

@bsipocz: thanks! It looks like I fixed the helpers bit.

@mhvk
Copy link
Contributor

mhvk commented Apr 22, 2018

This looks great in itself, but should it really be in astropy proper? So far, we've not included anything that was this specific to a sub-field of astronomy. It would seem to make much more sense as an affiliated package, since it is logically combined with also fitting transit lightcurves, including limb darkening, etc.

@crawfordsm
Copy link
Member

@mhvk With the support expressed by multiple maintainers, that it is based on highly cited work, and that exoplanets is current a major field of astronomy, I would be inclined to include it. It also seems like it might be applicable to other fields.

@pllim
Copy link
Member

pllim commented Apr 24, 2018

Just for consideration: If this has any chance to be taken out and put in another affiliated package in the future, we should think carefully. Does benefit outweigh the risk? Will this stay forever? Should this find a home instead in an exoplanet specific library? Lesson learned from conesearch, which is now in astroquery. 😉

That said, I have no personal objection.

@bsipocz
Copy link
Member

bsipocz commented Apr 24, 2018

I feel that while somewhat specialized, this belongs to stats more than in an exoplanet library. While my experiences getting rather obsolete (or just the field is moving too fast), BLS was typically run along with LS and other stats, sharing the same technical issues, etc while it shares less with more detailed, individual transit modeling.

Currently stats is a bit of a random pile of different algorithms anyway, and we could have already argued for a few of those whether they belong here or not (not to mention cosmology; I know why we keep it in core but one could use the same argument that it's only useful for a subset of astronomers and yet not infrastructure, so it should belong to a separate package).

Maybe the future refactoring would be to factor out parts of stats into a stat package once it grows disproportionally big?

@mhvk
Copy link
Contributor

mhvk commented Apr 24, 2018

@crawfordsm, @bsipocz - there are many major fields in astronomy, and so far astropy has only included things that benefit more than a single field. I'd certainly argue this is the case for cosmology, which really is more like coordinates and time, providing standard tools to map redshift to distance, time, etc. The analogy that could be drawn would be with a cosmology-focussed PR that, e.g., adds statistical analysis tools for CMB fluctuations or growth of structure, etc. That would serve at least as many people as this PR, but would be, in my opinion, similarly too narrow for inclusion in the core.

p.s. I do agree that stats, like convolution is an odd module.

@mhvk
Copy link
Contributor

mhvk commented Apr 24, 2018

I think at this stage it might be best to just ask the coordination committee to pipe in where exactly the line should be drawn between sufficiently general and too specialized, so cc @taldcroft, @kelle, @astrofrog, @eteq.

p.s. Just to be absolutely sure, this is not in the least a comment on the code/functionality: it looks great. I just happen to think it is an excellent start for an affiliated package for exoplanet transit analysis.

@larrybradley
Copy link
Member

I'm inclined to include this in astropy.stats, especially since it's more applicable than just exoplanets (e.g. eclipsing binary systems).

@pllim pllim requested a review from bsipocz May 1, 2018 13:51
@bmorris3
Copy link
Contributor

bmorris3 commented May 1, 2018

Hi @dfm! This PR has been brought up at #pyastro18 as something I could help review.

To start – is there a reason you call it TransitPeriodogram rather than BoxLeastSquaresPeriodogram (for parity with LombScarglePeriodogram) or BLSPeriodogram? Since this is getting dropped into the stats module, I think it would be appropriate to name it for the technique rather than the application. I understand if you have reservations calling it BLS because your implementation is not identical to the original Kovacs implementation, but the spirit of "box-least squares" is preserved in your implementation, so I don't see a reason not to refer to it by name. Especially because I think your method will be more discoverable if it's referred to by its algorithmic name. I would search for "BLS".

...and if you wanted to name it for the application, calling it EclipsePeriodogram might be the more generic name (let's not forget our EB friends!).

@StuartLittlefair
Copy link
Contributor

StuartLittlefair commented May 1, 2018

Great to see an efficient implementation of this algorithm in Cython!

Just to chime in on the issue of where this sits, the algorithm is widely used outside of exoplanet science. To me that suggests a place in astropy.stats, along the lines of the Lomb Scargle Periodogram.

If there was ever a case to move it to an affiliated package, it would probably be to a generalised time-series analysis package such as stingray. But then the LS periodogram was not put there, so presumably this should not be for similar reasons?

@dfm
Copy link
Contributor Author

dfm commented May 1, 2018

Hi all, Sorry for taking so long to weigh in - I've been traveling (of course)!

I'm totally down with changing the name. BLS or BLSPeriodogram are probably my preferred choices, but I'm easy. As for whether or not to include this in core, I don't think that I can add much beyond what other people have already said. Just to re-iterate: I think that this is a core piece of technology for pipelines in several fields and a well-established algorithm. Something like transit fitting is much less well established and that would definitely make more sense as an affiliated package, but I don't think that this would go there. All this being said, I'm happy to defer if the community thinks that this isn't a good fit!

@pllim
Copy link
Member

pllim commented Aug 6, 2018

Still need to get rid of astropy_helpers changes. FYI

@dfm
Copy link
Contributor Author

dfm commented Aug 6, 2018

@pllim not "still"... "again" :-) I'm bad a computers sometimes. Thanks!

@bsipocz
Copy link
Member

bsipocz commented Aug 6, 2018

@dfm - I'm sorry, I missed it totally today that there were still helpers issues.

@dfm
Copy link
Contributor Author

dfm commented Aug 7, 2018

@bsipocz - I actually managed to mess this up after my rebase... I forgot to run git submodule update before git commit -a. Totally my bad!

@bsipocz
Copy link
Member

bsipocz commented Aug 7, 2018

@dfm - Ohh, the trick is not to use -a ever ;)

@gully
Copy link

gully commented Aug 7, 2018

I experimented with this tutorial today. Looks good to me, everything worked well with the tutorial. I found a teeny-tiny bit of missing documentation:

  • depth_half appears in the dictionary of results of compute_stats() but does not have an affiliated entry in the documentation
  • ⬆️ same with depth_phased

Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very much in favour of getting this in now to get as much user feedback as possible rather than leave it to last minute before feature freeze.

@larrybradley
Copy link
Member

I'm not going to be able to review this soon. I'll leave it to @adrn and @bsipocz.

@pllim
Copy link
Member

pllim commented Aug 10, 2018

Looks like it still needs a change log (if there was one before, it is gone now). And is the coverage decrease expected?

@bsipocz
Copy link
Member

bsipocz commented Aug 10, 2018

@pllim - coveralls is useless with the cython coverage, it's quite unbeliavable all but one file have >90% coverage, but that one is 30%. I'll try to have a look at it (but can't give an ETA, so anyone is more than welcome to have a double check).

@crawfordsm
Copy link
Member

Giving it a final review, and everything does look great. It still needs a changelog entry and could use the mention of the license # Licensed under a 3-clause BSD style license - see LICENSE.rst at the top of the python files. Happy to merge this once those are added.

The low coverage tests does seem to be with the cython code though the file that it mainly wraps around (bls.c) reports high coverage.

@dfm
Copy link
Contributor Author

dfm commented Aug 11, 2018

Hey folks, thanks for the feedback! I've added a changelog entry and the license comments to the headers. I'm also confused about the coverage issues - the file that's bringing down the coverage fraction is auto-generated by Cython. Is there something that I should do to mark that as something that should be skipped? Thanks again!

@bsipocz
Copy link
Member

bsipocz commented Aug 13, 2018

I'm go ahead and merge this now. Please try to get users testing it, so any fix needed can be done before it goes into the actual release.

@bsipocz bsipocz merged commit 61c1c58 into astropy:master Aug 13, 2018
@bsipocz
Copy link
Member

bsipocz commented Aug 13, 2018

Thank you @dfm!

@dfm
Copy link
Contributor Author

dfm commented Aug 13, 2018

Thanks @bsipocz and everyone else who gave feedback. It has been great experience contributing to astropy for the first time and I really appreciate how friendly and supportive you have all been!

Also: huge thanks to @mirca who worked with me to develop the initial prototype for this pull request!

@bsipocz
Copy link
Member

bsipocz commented Aug 13, 2018

@mirca - Retrospect I've noticed that the rebasing and squashing in this PR meant that your commits has been squashed into other ones and thus your author credit has disappeared from the history. I'm truly, terribly sorry about this.
Unfortunately we can't easily fix this in master as we have a strict no-history-rewrite policy there. Could you please @mirca let us know how to proceed, whether you would like us to explore options further?

@mirca
Copy link
Member

mirca commented Aug 13, 2018

@bsipocz that's really kind of you, thanks! But absolutely no worries about that. @dfm already gave me way more credit than I definitely deserved :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.