[C++][Compute] Implement stdev aggregate kernel #26089

@asfimport

Description

Implement an aggregate kernel that calculates the standard deviation of an array or chunked array.

I would prefer the two-pass algorithm [1] as a balance between numerical stability and performance. NumPy uses this method to calculate variance [2].
Welford's online algorithm [3] is more stable, but also computationally more expensive.
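
To make the trade-off concrete, here is a minimal sketch of both approaches on a plain `std::vector<double>`. The function names are hypothetical and are not part of the Arrow compute API. The sketch assumes population variance (divisor `n`, matching NumPy's default `ddof=0`) and returns NaN for empty input; null handling and chunked input are left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Two-pass population variance (hypothetical helper, not an Arrow API).
// Pass 1 computes the mean; pass 2 sums squared deviations from it.
double TwoPassVariance(const std::vector<double>& values) {
  if (values.empty()) return std::nan("");  // assumption: NaN for empty input
  const double n = static_cast<double>(values.size());
  double sum = 0.0;
  for (double v : values) sum += v;
  const double mean = sum / n;
  double m2 = 0.0;
  for (double v : values) {
    const double d = v - mean;
    m2 += d * d;
  }
  return m2 / n;
}

// Welford's online population variance (hypothetical helper).
// Single pass, but it performs one division per element.
double WelfordVariance(const std::vector<double>& values) {
  if (values.empty()) return std::nan("");  // assumption: NaN for empty input
  double mean = 0.0;
  double m2 = 0.0;
  std::size_t count = 0;
  for (double v : values) {
    ++count;
    const double delta = v - mean;
    mean += delta / static_cast<double>(count);
    m2 += delta * (v - mean);  // uses the updated mean
  }
  return m2 / static_cast<double>(count);
}

// Standard deviation is the square root of the variance.
double TwoPassStddev(const std::vector<double>& values) {
  return std::sqrt(TwoPassVariance(values));
}
```

The two-pass loops contain only additions and multiplications, which is where its speed advantage over the per-element division in Welford's update comes from; the cost is reading the data twice.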

[1] https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Two-pass_algorithm
[2] https://github.com/numpy/numpy/blob/92ebe1e9a6aeb47a881a1226b08218175776f9ea/numpy/core/_methods.py#L176
[3] https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm

Reporter: Yibo Cai / @cyb70289
Assignee: Yibo Cai / @cyb70289

Note: This issue was originally created as ARROW-10070. Please see the migration documentation for further details.
