feat: moarstats add Atkinson Index with configurable inequality aversion parameter, Normalized Entropy & Bimodal Coefficient
#3243
…ormalized Entropy
Pull request overview
This PR adds three new advanced statistics to the moarstats command: Atkinson Index with a configurable inequality aversion parameter, Normalized Entropy, and Bimodal Coefficient. These statistics provide deeper insights into data distribution characteristics, with the Atkinson Index offering a more general inequality measure than the Gini coefficient through its configurable epsilon parameter.
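As a rough illustration of how the epsilon parameter shapes the Atkinson Index, the sketch below uses the textbook definition: one minus the ratio of a generalized mean of order 1 - epsilon to the arithmetic mean, with the geometric mean used when epsilon is 1. The function name `atkinson_index`, the `Option` return type, and the rejection of non-positive values are assumptions made for this example; they are not taken from the PR's code in src/cmd/moarstats.rs.

```rust
/// Textbook Atkinson index (hypothetical helper, not the PR's function).
/// Assumes strictly positive, finite values and a finite epsilon >= 0.
/// Returns None when those assumptions do not hold.
fn atkinson_index(values: &[f64], epsilon: f64) -> Option<f64> {
    if values.is_empty() || !epsilon.is_finite() || epsilon < 0.0 {
        return None;
    }
    if values.iter().any(|&v| !v.is_finite() || v <= 0.0) {
        return None;
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    // "Equally distributed equivalent" value: a generalized mean of order 1 - epsilon.
    let ede = if (epsilon - 1.0).abs() < 1e-12 {
        // epsilon == 1 is the limiting case: the geometric mean.
        (values.iter().map(|v| v.ln()).sum::<f64>() / n).exp()
    } else {
        let p = 1.0 - epsilon;
        (values.iter().map(|v| v.powf(p)).sum::<f64>() / n).powf(1.0 / p)
    };
    Some(1.0 - ede / mean)
}

fn main() {
    let incomes = [10.0, 20.0, 30.0, 40.0];
    // Larger epsilon puts more weight on the low end of the distribution.
    for eps in [0.0, 0.5, 1.0, 2.0] {
        println!("epsilon = {eps}: {:?}", atkinson_index(&incomes, eps));
    }
}
```

With epsilon at 0 the index is 0 for any input, and it grows as epsilon increases, which is why a configurable epsilon makes the measure more general than a single fixed inequality statistic.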
Key changes:
- Adds Atkinson Index computation with configurable epsilon parameter (default 1.0) via --epsilon flag
- Adds Bimodality Coefficient to detect whether distributions have single or multiple modes
- Adds Normalized Entropy as a [0,1]-scaled version of Shannon Entropy
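The other two statistics can be sketched in a similar way. The example below assumes Normalized Entropy is Shannon entropy (base 2) divided by log2 of the number of distinct values, and uses the population form of the Bimodality Coefficient, (skewness² + 1) / kurtosis, with non-excess kurtosis. Both function names and the handling of degenerate inputs are hypothetical, and the PR may use a different form, for example a sample-size-corrected one.

```rust
/// Shannon entropy scaled to [0, 1], computed from per-value frequency counts.
/// Hypothetical helper: returns None for an empty table and 0.0 when there is
/// only one distinct value (an assumed convention for this example).
fn normalized_entropy(counts: &[u64]) -> Option<f64> {
    let total: u64 = counts.iter().sum();
    if total == 0 {
        return None;
    }
    let distinct = counts.iter().filter(|&&c| c > 0).count();
    if distinct <= 1 {
        return Some(0.0);
    }
    let total = total as f64;
    let entropy: f64 = counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / total;
            -p * p.log2()
        })
        .sum();
    // log2(distinct) is the maximum possible entropy for this many categories.
    Some(entropy / (distinct as f64).log2())
}

/// Population bimodality coefficient: (skewness^2 + 1) / kurtosis.
/// Hypothetical helper: returns None for fewer than two values or zero variance.
fn bimodality_coefficient(values: &[f64]) -> Option<f64> {
    if values.len() < 2 {
        return None;
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    // Central moments of order 2, 3, and 4.
    let moment = |k: i32| values.iter().map(|v| (v - mean).powi(k)).sum::<f64>() / n;
    let (m2, m3, m4) = (moment(2), moment(3), moment(4));
    if m2 <= 0.0 {
        return None;
    }
    let skewness = m3 / m2.powf(1.5);
    let kurtosis = m4 / (m2 * m2);
    Some((skewness * skewness + 1.0) / kurtosis)
}

fn main() {
    println!("{:?}", normalized_entropy(&[3, 1]));
    println!("{:?}", bimodality_coefficient(&[1.0, 2.0, 3.0, 4.0, 5.0]));
}
```

Under this population form, values above roughly 5/9 (about 0.555, the value for a uniform distribution) are commonly read as a hint of bimodality, while a normal distribution gives 1/3.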
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| src/cmd/moarstats.rs | Core implementation: adds computation functions for the three new statistics, refactors struct/function names from "KurtosisGini" to "KGA" (Kurtosis-Gini-Atkinson), adds epsilon parameter validation, and integrates new statistics into the processing pipeline |
| tests/test_moarstats.rs | Updates test suite with new test moarstats_advanced_atkinson_epsilon to verify Atkinson Index with custom epsilon value, and updates expected output to include new statistics columns |
| docs/STATS_DEFINITIONS.md | Documents the three new statistics with formulas, value ranges, and Wikipedia references |
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>