dalphakr/dalpha-MLE-Bench

Dalpha MLE Bench Artifact Repo

Workspace Artifact

Agent runs are non-deterministic. The published code is from one representative run. The specific model and hyperparameters may differ across runs, but the overall approach is consistent. Within each run, the agent may either version files (e.g., train.py → train_v2.py) or modify them in place, so versioned files do not necessarily represent the complete iteration history. Only Python source files are included; model weights and submission files are not published.
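To browse the versioned files in a workspace, a small helper can group them by base name. This is a minimal sketch, not part of the repo: the helper `group_versions` is hypothetical, the `_vN` suffix pattern is inferred from the `train.py` → `train_v2.py` example above, and an unsuffixed file is assumed to be version 1. Files modified in place leave no trace here, so the grouping still does not recover the full iteration history.

```python
import re
from collections import defaultdict

# Assumed naming convention: "<base>.py" or "<base>_v<N>.py".
_VERSION_RE = re.compile(r"^(?P<base>.+?)(?:_v(?P<ver>\d+))?\.py$")


def group_versions(filenames):
    """Group Python files by base name, ordered by version number."""
    groups = defaultdict(list)
    for name in filenames:
        match = _VERSION_RE.match(name)
        if match is None:
            continue  # not a Python source file
        # Assumption: a file without a suffix is treated as version 1.
        version = int(match.group("ver")) if match.group("ver") else 1
        groups[match.group("base")].append((version, name))
    return {base: [n for _, n in sorted(items)] for base, items in groups.items()}


files = ["train_v10.py", "train.py", "infer.py", "train_v2.py", "notes.md"]
grouped = group_versions(files)
print(grouped)
```

Versions are compared numerically, so `train_v10.py` sorts after `train_v2.py` rather than before it.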

Five competitions from our runs, including AI4Code:

  • exp-examples/AI4Code
  • exp-examples/imet-2020-fgvc7
  • exp-examples/billion-word-imputation
  • exp-examples/nfl-player-contact-detection
  • exp-examples/uw-madison-gi-tract-image-segmentation

Submission Validation Tool

Define the following tool for use in the agent:

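from pathlib import Path
from typing import Any

# ctx, SETTING, and log_event are not defined in this snippet; they are
# assumed to be provided by the surrounding agent harness.


def validate_submission(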
    submission_file: str = "submission.csv",
    competition: str | None = None,
) -> dict[str, Any]:
    """Validate a submission CSV file format using MLE-Bench.

    Args:
        submission_file: Name of the submission file in /workspace/ (default: "submission.csv").
                            Examples: "submission.csv", "submission_v2.csv", "ensemble_pred.csv"
        competition: Override competition name (default: uses ctx.competition).

    Returns:
        dict with 'valid' (bool) and 'message' (str).
    """
    submission_path = ctx.workspace_root / submission_file
    if not submission_path.exists():
        return {"valid": False, "message": f"{submission_file} not found in {ctx.workspace_root}"}

    comp = competition or ctx.competition
    local_data_path = SETTING.local_data_path
    if not local_data_path:
        return {"valid": False, "message": "SETTING.local_data_path is empty"}

    workspace_mount_path = "/workspace/submission"
    if ctx.env_type == "conda":
        submission_csv_path = f".volumes/workspace/submission/{submission_file}"
    else:
        submission_csv_path = f"{workspace_mount_path}/{submission_file}"
    command = f"mlebench grade-sample {submission_csv_path} {comp} --data-dir ./zip_files"

    try:
        ctx.mleb_env.check_output(
            entry=command,
            local_path=local_data_path,
            running_extra_volume={
                str(ctx.workspace_root): {"bind": workspace_mount_path, "mode": "ro"},
                str(Path("~/.kaggle").expanduser().absolute()): "/root/.kaggle",
            },
            disable_cache=True,
            disable_chmod=True,
        )
    except Exception as exc:
        return {"valid": False, "message": f"Submission invalid: {exc}"}

    log_event(ctx, "validate_submission", {"submission_file": submission_file, "valid": True})
    return {"valid": True, "message": "Submission is valid."}
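As a usage sketch, an agent could run the tool over several candidate files and keep the first one that passes. The helper `first_valid_submission` and the stub validator below are hypothetical and not part of this repo. The stub stands in for `validate_submission` so the snippet runs without the MLE-Bench environment. It relies only on the documented return shape, a dict with `valid` and `message`.

```python
from typing import Any, Callable, Iterable, Optional

Validator = Callable[[str], dict[str, Any]]


def first_valid_submission(
    candidates: Iterable[str],
    validate: Validator,
) -> tuple[Optional[str], list[str]]:
    """Return the first candidate that validates, plus messages from failed ones."""
    failures: list[str] = []
    for name in candidates:
        result = validate(name)
        if result.get("valid"):
            return name, failures
        failures.append(f"{name}: {result.get('message', '')}")
    return None, failures


def stub_validate(submission_file: str) -> dict[str, Any]:
    # Hypothetical stand-in for validate_submission: only one file passes.
    if submission_file == "submission_v2.csv":
        return {"valid": True, "message": "Submission is valid."}
    return {"valid": False, "message": f"{submission_file} not found"}


chosen, errors = first_valid_submission(
    ["submission.csv", "submission_v2.csv"], stub_validate
)
print(chosen)
print(errors)
```

Collecting the failure messages rather than discarding them lets the agent see why earlier candidates were rejected before it decides what to fix.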
