Skip to content

Integrate periodogram algorithms better with TimeSeries object#8599

Merged
eteq merged 18 commits intoastropy:masterfrom
astrofrog:adapt-periodogram-api
Apr 23, 2019
Merged

Integrate periodogram algorithms better with TimeSeries object#8599
eteq merged 18 commits intoastropy:masterfrom
astrofrog:adapt-periodogram-api

Conversation

@astrofrog
Copy link
Member

@astrofrog astrofrog commented Apr 19, 2019

In #8591, I moved the periodogram classes from astropy.stats to astropy.timeseries. In this PR, I want to make sure that users can seamlessly use the periodogram classes given a time series object.

This PR actually does two main things:

  • It adapts the BLS and Lomb-Scargle algorithms so that they can optionally work with absolute rather than relative times. This means that parameters such as transit_time return absolute Time objects, and methods such as e.g. model can be given absolute times to evaluate the models on. I have fully preserved back-compatibility with just using relative times (with or without units). I have added an extensive test for both periodogram algorithms to make sure that they behave correctly when given absolute times.

  • It then adds a periodogram method on the TimeSeries class, which is a convenience to initialize either BoxLeastSquares or LombScargle with the correct arguments. Then instead of doing:

from astropy.stats import BoxLeastSquares
keep = ~np.isnan(ts['sap_flux'])
periodogram = BoxLeastSquares(ts.time.jd[keep] * u.day, ts['sap_flux'][keep])
periodogram.autopower(0.2 * u.day)

users can just do:

periodogram = ts.periodogram(algorithm='bls', algorithm='sap_flux')
results = periodogram.autopower(0.2 * u.day)

I considered an alternative approach which would be to make it so that e.g. BoxLeastSquares can be initialized with a time series, but I decided not to implement this for the following reasons:

  • It doesn't solve the issue that one has to still specify the column names to use for the flux, and there isn't much benefit to writing:
BoxLeastSquares(ts, y='flux', dy='flux_error')

compared with:

BoxLeastSquares(ts.time, y=ts['flux'], dy=ts['flux_error'])
  • If we do this, we essentially are overloading the initializer in a way that it accepts very flexible input (strings or numbers, times, quantities, time series, etc.) and that's not ideal. We could instead have a class method but then that's more to type.

The advantage of just being able to do:

ts.periodogram(algorithm='bls', algorithm='sap_flux')

is that it avoids having to import the periodogram classes and initialize them manually, and provides a nice user-friendly API. This also transparently handles masking out e.g. NaN values before initializing the periodogram classes.

For the binned time series, I'm doing the same as for the sampled one, using the times at the center of the bins (this is mentioned in the docstring)

At the moment there is a little bit of code duplication between binned and sampled, and between BLS and Lomb-Scargle. My priority here is to make sure that things work as we want, and we can always find ways to reduce the duplication in future.

It might be worth writing a top-level docs page specifically about periodograms that then links to the two (or more in future) algorithms we have. Again, in the interest of getting this in to the feature freeze, I'm happy to try and improve the docs in the RC phase, since it would just be docs.

@dfm @mirca @joeldhartman @jakevdp - as the authors for the periodogram classes, I just wanted to give you a heads-up about the changes related to accepting absolute times.

@astrofrog astrofrog added this to the v3.2 milestone Apr 19, 2019
@astrofrog astrofrog requested review from bsipocz, eteq and pllim April 19, 2019 12:16
@astrofrog astrofrog force-pushed the adapt-periodogram-api branch 2 times, most recently from 52c3023 to 12c6d79 Compare April 19, 2019 14:08
@codecov
Copy link

codecov bot commented Apr 19, 2019

Codecov Report

Merging #8599 into master will increase coverage by 0.07%.
The diff coverage is 95.68%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8599      +/-   ##
==========================================
+ Coverage   86.86%   86.94%   +0.07%     
==========================================
  Files         396      400       +4     
  Lines       58937    59324     +387     
  Branches     1100     1100              
==========================================
+ Hits        51198    51581     +383     
- Misses       7098     7102       +4     
  Partials      641      641
Impacted Files Coverage Δ
astropy/timeseries/periodograms/__init__.py 100% <100%> (ø) ⬆️
astropy/timeseries/periodograms/bls/core.py 92.06% <100%> (+1.07%) ⬆️
...stropy/timeseries/periodograms/lombscargle/core.py 96.17% <100%> (+0.59%) ⬆️
astropy/stats/bls/__init__.py 66.66% <33.33%> (-11.12%) ⬇️
astropy/timeseries/periodograms/base.py 88% <88%> (ø)
astropy/units/function/core.py 97.6% <0%> (-1.56%) ⬇️
astropy/units/function/logarithmic.py 98.48% <0%> (-1.52%) ⬇️
astropy/io/registry.py 95.49% <0%> (-1.51%) ⬇️
astropy/utils/codegen.py 90.9% <0%> (-1.28%) ⬇️
.../coordinates/builtin_frames/ecliptic_transforms.py 83.15% <0%> (-1.01%) ⬇️
... and 39 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6ded61e...1900e54. Read the comment docs.

@astropy astropy deleted a comment from codecov bot Apr 19, 2019
@eteq
Copy link
Member

eteq commented Apr 19, 2019

@astrofrog - Quick comment prior to a full review: I am not a fan of the string keywords for arguments. What about instead having it be:

periodogram = ts.periodogram(algorithm=BoxLeastSquares, column='sap_flux')

Or even:

periodogram = BoxLeastSquares.from_times_series(ts, column='sap_flux')

?

I could see adding the "string" keys later on, but I fear locking ourselves into a string based system

@astrofrog
Copy link
Member Author

@eteq - I've gone with your approach of:

periodogram = BoxLeastSquares.from_timeseries(ts, column='sap_flux')

Copy link
Member

@eteq eteq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I'm happy with this now. A few relatively easy-to-implement changes to suggest though (hopefully mainly find-and-replace).

@bsipocz
Copy link
Member

bsipocz commented Apr 19, 2019

I think I'm still slightly in favour of periodogram = ts.periodogram(), as it's slightly better API for the use case I have in mind: a multi object TimeSeries instance that needs to get the periodogram run for all the objects in it.
But then we need to make sure the periodogram classes can handle that, so it's not something for this release.

@astrofrog
Copy link
Member Author

@bsipocz - I personally also quite like the periodogram method, but let's discuss it again once 3.2 is out.

@bsipocz
Copy link
Member

bsipocz commented Apr 19, 2019

Please rebase now that I merged #8591.

@astrofrog astrofrog force-pushed the adapt-periodogram-api branch from 98dbdad to 53ed2aa Compare April 19, 2019 21:59
@eteq
Copy link
Member

eteq commented Apr 19, 2019

I am not as much a fan of ts.periodogram... but we can discuss that later 😉

Copy link
Member

@bsipocz bsipocz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still like to take it to a spin, and check the html rendering, but here are some minor comments based on the github diff



@pytest.mark.parametrize('timedelta', [False, True])
def test_absolute_times(data, timedelta):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this test function is so similar to the one in BLS, I wonder whether it makes any sense to have only one parametrized for the different periodograms?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some code in common but a lot of differences too in particular with respect to which methods are tested, and how the results are stored, so I'd rather keep it as is.

Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc build failed:

/home/circleci/project/docs/whatsnew/3.2.rst:41:Exception occurred in plotting 3-2-1
 from /home/circleci/project/docs/whatsnew/3.2.rst:
Traceback (most recent call last):
  File "/home/circleci/project/venv/lib/python3.6/site-packages/matplotlib/sphinxext/plot_directive.py", line 499, in run_code
    exec(code, ns)
  File "<string>", line 19, in <module>
AttributeError: 'TimeSeries' object has no attribute 'periodogram'

I would have liked to see if the rendered doc look as expected, given some changes in the plot directives and so on.

Also, note for whoever going to merge this, please make sure the remote data test (which is allowed to fail) actually passes for the remote data tests added here.

@astrofrog
Copy link
Member Author

@bsipocz @pllim - this should now be ready!

Copy link
Member

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as a non-domain expert. I see the following weirdness in the rendered doc on CircleCI but probably out of scope here because they also occur elsewhere.

https://29818-2081289-gh.circle-artifacts.com/0/home/circleci/project/docs/_build/html/timeseries/index.html#reference-api -- The table listing looks weird; i.e., table looks squashed horizontally.

https://29818-2081289-gh.circle-artifacts.com/0/home/circleci/project/docs/_build/html/timeseries/lombscargle.html -- There are () right before the plots, not sure why. Same with the plot in https://29818-2081289-gh.circle-artifacts.com/0/home/circleci/project/docs/_build/html/whatsnew/3.2.html#whatsnew-3-2-timeseries

@astrofrog
Copy link
Member Author

Yeah I'm not sure what's going on with the Sphinx issues, but we can try and debug over the next couple of weeks. I wonder if it's related to Sphinx 2.0.

@eteq eteq merged commit 798ebb4 into astropy:master Apr 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants