Scoring logging bug 🐛 - incorrect computation of time to target on val (s) in get_summary_df #791

@Niccolo-Ajroldi

Description

In scoring/score_submissions.py, the function get_summary_df gathers evaluation statistics from submission logs into a DataFrame. When scoring a submission, this function is invoked on every workload, and the resulting DataFrames are concatenated and saved as a CSV file (<submission>_summary.csv).

The current implementation computes the time needed to reach the validation target as follows:

summary_df['time to target on val (s)'] = summary_df.apply(
    lambda x: x['time to best eval on val (s)']
    if x['val target reached'] else np.inf,
    axis=1)

As a result, whenever a submission reaches the target, time to target on val (s) is set equal to time to best eval on val (s). However, a submission typically reaches the validation target before it reaches its best eval score, so the true time to target is usually lower than the time to best eval.
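To make the difference concrete, here is a minimal sketch on a toy evaluation log. The column names ('accumulated_submission_time', 'validation/loss'), the helper functions, and the numbers are assumptions chosen for illustration; they are not the code in get_summary_df nor the fix in #792. The idea is that the time to target should be the time of the first evaluation that meets the target, not the time of the best evaluation.

```python
import numpy as np
import pandas as pd


def time_to_target(eval_df, metric_col, time_col, target, minimize=True):
    """Time of the first evaluation that reaches the target, or inf if none does."""
    if minimize:
        reached = eval_df[metric_col] <= target
    else:
        reached = eval_df[metric_col] >= target
    if not reached.any():
        return np.inf
    # idxmax on a boolean Series returns the label of the first True entry.
    return float(eval_df.loc[reached.idxmax(), time_col])


def time_to_best(eval_df, metric_col, time_col, minimize=True):
    """Time of the evaluation with the best metric value."""
    metric = eval_df[metric_col]
    best_idx = metric.idxmin() if minimize else metric.idxmax()
    return float(eval_df.loc[best_idx, time_col])


# Hypothetical evaluation log for a single run (values are made up).
evals = pd.DataFrame({
    'accumulated_submission_time': [100.0, 200.0, 300.0, 400.0],
    'validation/loss': [0.9, 0.5, 0.4, 0.45],
})
target = 0.5  # hypothetical validation target (lower is better)

t_target = time_to_target(
    evals, 'validation/loss', 'accumulated_submission_time', target)
t_best = time_to_best(
    evals, 'validation/loss', 'accumulated_submission_time')
t_missed = time_to_target(
    evals, 'validation/loss', 'accumulated_submission_time', 0.1)
```

In this toy log the target is first met at the second evaluation, while the best score only occurs at the third, so the buggy assignment would report the later time for both columns.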

Performance profiles are not affected

Fortunately, this bug affects neither the performance profiles nor the final scores. Although the concatenated DataFrames are used to compute the performance profiles, we ignore the existing time to target on val (s) column there and instead perform a correct computation of the time to the eval target.

Source or Possible Fix

I have implemented a fix in #792. The final scores and the performance profiles are unaffected by the fix. However, the contents of <submission>_summary.csv change drastically. Here is an example on two workloads for the prize qualification baseline algorithm (first study):

image
