-
Notifications
You must be signed in to change notification settings - Fork 99
feat: frequency add --rank-strategy
#3075
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- rename `--rank-ties-strategy` to `--rank-strategy` with `-r` shortcut - optimize rank evaluation so its not inside a hot loop - remove Default impl as we set the default in docopt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements comprehensive ranking strategies for frequency tables, expanding beyond the previous single "min" strategy to include all standard ranking methods (min, max, dense, ordinal, and average) as defined in the Wikipedia ranking article. The implementation follows R's naming conventions for these strategies.
Key Changes:
- Added
--rank-strategyflag with five options: min, max, dense, ordinal, and average - Changed rank data type from
u32tof64to support fractional ranks (average strategy) - Updated rank formatting logic to display integers when possible, decimals when needed
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/cmd/frequency.rs |
Core implementation of RankStrategy enum, ranking algorithms for each strategy, and rank formatting logic |
src/cmd/schema.rs |
Added default RankStrategy::Min to schema command integration |
tests/test_frequency.rs |
Comprehensive test suite covering all ranking strategies, edge cases, and JSON output validation |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Implementing all the strategies listed here
https://en.wikipedia.org/wiki/Ranking
Before, we were only doing Standard Competition Ranking/("min" ranking in R).
Use R naming convention for the different strategies.