Description
In `scoring/score_submissions.py`, the function `get_summary_df` is responsible for gathering evaluation statistics from submission logs into a DataFrame. When scoring a submission, this function is invoked on every workload, and the resulting concatenated DataFrames are saved as CSV files (`<submission>_summary.csv`).
The current implementation computes the time needed to reach the validation target as follows:
algorithmic-efficiency/scoring/score_submissions.py, lines 91 to 94 at a23b5ea:

```python
summary_df['time to target on val (s)'] = summary_df.apply(
    lambda x: x['time to best eval on val (s)']
    if x['val target reached'] else np.inf,
    axis=1)
```
This results in a `time to target on val (s)` equal to `time to best eval on val (s)` whenever a submission reaches the target. However, the time to reach the validation target is usually lower than the time of the best eval score.
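To see why the two quantities differ, here is a minimal, self-contained sketch with a made-up eval log. The column names `accumulated_submission_time` and `validation/accuracy` and the target value are illustrative assumptions, not taken from the actual logs:

```python
import numpy as np
import pandas as pd

# Toy eval log for a single workload (all values are made up for illustration).
evals = pd.DataFrame({
    'accumulated_submission_time': [100.0, 200.0, 300.0, 400.0],
    'validation/accuracy': [0.60, 0.76, 0.78, 0.80],
})
val_target = 0.75  # hypothetical validation target

# Time of the best eval on val: the last eval here (400 s, accuracy 0.80).
best_idx = evals['validation/accuracy'].idxmax()
time_to_best_eval = evals.loc[best_idx, 'accumulated_submission_time']

# Time to target on val: the first eval that reaches the target (200 s).
reached = evals[evals['validation/accuracy'] >= val_target]
time_to_target = (reached['accumulated_submission_time'].min()
                  if not reached.empty else np.inf)

print(time_to_best_eval, time_to_target)  # 400.0 200.0
```

In this toy log the submission keeps improving after first hitting the target, so the time of the best eval (400 s) overstates the time to target (200 s).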
Performance profiles are not affected
Fortunately, this bug affects neither the performance profiles nor the final scores. Although the concatenated DataFrames are used to compute the performance profiles, the existing `time to target on val (s)` column is ignored there and the time to the validation target is recomputed correctly.
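For reference, a correct per-workload computation could look roughly like the sketch below. The helper name, its signature, and the column names are hypothetical; this is only meant to outline the idea and is not necessarily how the scoring code or the fix in #792 implements it.

```python
import numpy as np
import pandas as pd


def time_to_val_target(eval_df: pd.DataFrame,
                       metric_col: str,
                       target: float,
                       higher_is_better: bool = True) -> float:
  """First accumulated submission time at which the validation target is met.

  Hypothetical helper for illustration only; returns np.inf if the target
  is never reached.
  """
  hit = (eval_df[metric_col] >= target if higher_is_better
         else eval_df[metric_col] <= target)
  times = eval_df.loc[hit, 'accumulated_submission_time']
  return float(times.min()) if not times.empty else np.inf
```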
Source or Possible Fix
I have implemented a fix in #792. The final scores and the performance profiles are unaffected after the fix; however, `<submission>_summary.csv` changes drastically. Here is an example for the prize qualification baseline algorithm (first study) on two workloads:
