@fsschneider fsschneider commented Jul 31, 2024

This PR addresses three main points:

  1. Adds functionality to compute the geometric-mean speedup between two submissions. For each base workload, it computes the speedup (runtime_1/runtime_2) and then takes the geometric mean of these speedups across all base workloads. If a submission does not reach the target on a workload, its time is replaced with max_runtime_budget + 1 second (as if it had hit the target just after the budget ran out).
  2. Fixes the scoring code to ignore a submission's runtimes on workload variants if it did not reach the target on the corresponding base workload. This implements the rule "To determine the fastest submission on a held-out workload, we only consider submissions that reached the target on the corresponding fixed workload. This protects us against extremely fast submissions that only work on a specific held-out workload and are useless as general algorithms." in point 4 of https://github.com/mlcommons/algorithmic-efficiency/blob/main/DOCUMENTATION.md#using-held-out-workloads-in-scoring. The fix did not change the results for the given logs.
  3. Sets max_tau to 4.0, following our rules: https://github.com/mlcommons/algorithmic-efficiency/blob/main/DOCUMENTATION.md#integrating-performance-profiles-for-the-benchmark-score
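
To illustrate point 2, here is a minimal sketch of the masking idea, not the code from this PR. The function name `mask_variant_runtimes`, the `variant_to_base` mapping, the workload names, and the use of `math.inf` to mark "target not reached" are all assumptions made for the example.

```python
import math


def mask_variant_runtimes(runtimes, variant_to_base):
    """Return a copy of `runtimes` where a variant's time is ignored
    (set to inf) if the submission did not reach the target on the
    corresponding base workload.

    runtimes: dict workload name -> time to target in seconds,
              with math.inf (or a missing key) meaning "target not reached".
    variant_to_base: dict variant name -> base workload name (hypothetical).
    """
    masked = dict(runtimes)
    for variant, base in variant_to_base.items():
        base_time = masked.get(base, math.inf)
        if variant in masked and not math.isfinite(base_time):
            # Base workload failed, so the variant result does not count.
            masked[variant] = math.inf
    return masked


# Hypothetical workload names and times, for illustration only.
example_runtimes = {
    "base_a": math.inf,    # target not reached on the base workload
    "variant_a1": 10.0,    # fast on the variant, but must be ignored
    "base_b": 5.0,
    "variant_b1": 7.0,
}
example_mapping = {"variant_a1": "base_a", "variant_b1": "base_b"}
masked_runtimes = mask_variant_runtimes(example_runtimes, example_mapping)
```

In this sketch the variant of `base_a` is masked out, while the variant of `base_b` keeps its measured time because its base workload reached the target.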

Geometric means across individual workload speedups between two algorithms.
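
The speedup computation described in point 1 can be sketched as follows. This is an illustrative version under stated assumptions, not the implementation merged here: the function name `geometric_mean_speedup`, the dict-based inputs, and the use of `None` for "target not reached" are hypothetical.

```python
import math


def geometric_mean_speedup(runtimes_1, runtimes_2, max_runtime_budgets):
    """Geometric mean of per-workload speedups runtime_1 / runtime_2.

    runtimes_1, runtimes_2: dict base workload -> time to target in seconds,
        with None (or a missing key) meaning the target was not reached.
    max_runtime_budgets: dict base workload -> runtime budget in seconds.
    """
    log_sum = 0.0
    for workload, budget in max_runtime_budgets.items():
        t1 = runtimes_1.get(workload)
        t2 = runtimes_2.get(workload)
        # A failed run is treated as if it hit the target one second
        # after the budget ran out.
        t1 = budget + 1.0 if t1 is None else t1
        t2 = budget + 1.0 if t2 is None else t2
        log_sum += math.log(t1 / t2)
    # Averaging in log space is equivalent to the n-th root of the product.
    return math.exp(log_sum / len(max_runtime_budgets))


# Hypothetical numbers, for illustration only.
budgets = {"base_a": 100.0, "base_b": 200.0}
submission_1 = {"base_a": 50.0, "base_b": None}   # fails on base_b
submission_2 = {"base_a": 25.0, "base_b": 201.0}
speedup = geometric_mean_speedup(submission_1, submission_2, budgets)
```

Here the per-workload speedups are 50/25 = 2 and 201/201 = 1, so the geometric mean is sqrt(2), roughly 1.41.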
@fsschneider fsschneider requested a review from a team as a code owner July 31, 2024 14:32
github-actions bot commented Jul 31, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@priyakasimbeg priyakasimbeg merged commit c465e25 into mlcommons:dev Jul 31, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jul 31, 2024
@fsschneider fsschneider deleted the scoring_QoL branch December 20, 2024 11:22