Description
The goal is to calculate the standard deviation of an array or chunked array.
I would prefer the two-pass algorithm [1] as a balance between numerical stability and performance. NumPy uses this method to calculate variance [2].
Welford's online algorithm [3] is more stable, but also more computationally expensive.
[1] https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Two-pass_algorithm
[2] https://github.com/numpy/numpy/blob/92ebe1e9a6aeb47a881a1226b08218175776f9ea/numpy/core/_methods.py#L176
[3] https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm
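For comparison, here is a minimal pure-Python sketch of the two approaches mentioned above. It is illustrative only and is not the proposed Arrow C++ implementation. The function names are hypothetical, and the `ddof` parameter is an assumption borrowed from NumPy's convention (`ddof=0` gives the population variance, `ddof=1` the sample variance).

```python
import math


def variance_two_pass(values, ddof=0):
    """Two-pass variance: pass 1 computes the mean, pass 2 sums squared deviations."""
    n = len(values)
    if n - ddof <= 0:
        raise ValueError("need more than `ddof` values")
    mean = sum(values) / n                          # first pass over the data
    sq_dev = sum((x - mean) ** 2 for x in values)   # second pass over the data
    return sq_dev / (n - ddof)


def variance_welford(values, ddof=0):
    """Welford's online variance: one pass, updating the mean and M2 per element."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n            # one division per element
        m2 += delta * (x - mean)
    if n - ddof <= 0:
        raise ValueError("need more than `ddof` values")
    return m2 / (n - ddof)


def stddev(values, ddof=0, method=variance_two_pass):
    """Standard deviation as the square root of the chosen variance estimate."""
    return math.sqrt(method(values, ddof))
```

For example, `stddev([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])` computes the population standard deviation of a data set whose mean is 5 and whose variance is 4. The two-pass version reads the data twice but does only a subtraction, a multiplication, and an addition per element in its second pass. The Welford version reads the data once but performs a division for every element, which is one source of its extra computational cost.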
Reporter: Yibo Cai / @cyb70289
Assignee: Yibo Cai / @cyb70289
Related issues:
- [C++] Incremental Variance, Standard Deviation aggregators (is duplicated by)
Note: This issue was originally created as ARROW-10070. Please see the migration documentation for further details.