RMDC2026 Data

The data created for the Roman Microlensing Data Challenge 2026 (RMDC2026) is intended to be a semi-realistic representation of the microlensing data volume and type expected from the Roman Galactic Bulge Time Domain Survey.

In the simulated data, the inertial frame of reference was defined with the $x$-axis increasing from the binary center of mass towards the less massive lens at t0, the time of closest approach to the center of mass. Viewed from the solar system barycenter, the inertial frame moves at the relative velocity vlens_CoM - vobserver(t0). The inclination of the orbit is a counter-clockwise rotation about the $x$-axis. $\alpha$ is the angle that the source trajectory would make with the $x$-axis if the parallax were 0. Where finite source effects were significant, a linear limb darkening law was applied.
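For readers unfamiliar with the term, here is a minimal sketch of a linear limb darkening profile. It uses the common textbook form I(r)/I(0) = 1 - u(1 - sqrt(1 - r^2)); the coefficient name `u` and the function name are illustrative assumptions, and this document does not state which parameterization or coefficient values the simulation used.

```python
import math


def linear_limb_darkening(r: float, u: float) -> float:
    """Surface brightness at fractional radius r, relative to disc centre.

    r = 0 is the centre of the source disc and r = 1 is the limb.
    u is the linear limb darkening coefficient (hypothetical name; the
    convention used in the simulation is not specified here).
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    # mu is the cosine of the angle between the line of sight and the
    # local surface normal of the source star.
    mu = math.sqrt(1.0 - r * r)
    return 1.0 - u * (1.0 - mu)


# Brightness falls from 1 at the centre to (1 - u) at the limb.
for radius in (0.0, 0.5, 1.0):
    print(radius, linear_limb_darkening(radius, u=0.5))
```

Setting `u = 0` recovers a uniformly bright source disc.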

Nexus mounted data

Challenge data is hosted on the Nexus for easy access at /data/data-challenge/rges/RMDC26_Beginner_Tier.parquet. Refer to this notebook for examples of how to access the challenge data on the Nexus. See the README.md for dataset-specific details.
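As a rough sketch of what reading the mounted file might look like, the snippet below loads the Parquet file at the path given above with pandas. The helper name `load_challenge_table` is hypothetical, the column layout is not assumed, and `pd.read_parquet` needs a Parquet engine such as pyarrow or fastparquet installed.

```python
from pathlib import Path
from typing import Optional

import pandas as pd

# Path quoted in this document for the Nexus-mounted Beginner tier file.
NEXUS_PATH = Path("/data/data-challenge/rges/RMDC26_Beginner_Tier.parquet")


def load_challenge_table(path: Path = NEXUS_PATH) -> Optional[pd.DataFrame]:
    """Return the challenge table, or None if the file is not mounted."""
    path = Path(path)
    if not path.is_file():
        return None
    return pd.read_parquet(path)


table = load_challenge_table()
if table is None:
    print(f"{NEXUS_PATH} not found; is this session running on the Nexus?")
else:
    # Inspect the size and column names before doing any analysis.
    print(table.shape)
    print(table.columns.tolist())
```

Returning `None` when the file is missing lets the same code run off the Nexus without raising.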

Hugging Face

There are two challenge datasets available for download from the Hugging Face Hub: Beginner and Experienced. Each includes challenge data for its tier (challenge.csv). As it is generated, the Experienced dataset will also be updated with labeled training data for machine-learning purposes (train.csv), with roughly an order of magnitude more events than the challenge set (~100,000 events).

Below are instructions for downloading the Beginner dataset. Replace “Beginner” with “Experienced” to download the Experienced dataset, which includes more lens arrangements, higher-order effects, and labeled training data.

Download Instructions

CLI:

hf download RGES-PIT/Beginner --repo-type dataset

Python:

from huggingface_hub import hf_hub_download
import pandas as pd

REPO_ID = "RGES-PIT/Beginner"
FILENAME = "RMDC26_Beginner_Tier_test.parquet"  # "_test" indicates to HF that this is a test set rather than a training set

# The file is Parquet, so read it with read_parquet (requires pyarrow or fastparquet)
dataset = pd.read_parquet(
    hf_hub_download(
        repo_id=REPO_ID,
        filename=FILENAME,
        repo_type="dataset",
    )
)
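To fetch either tier without hard-coding a file name, one option is to download a snapshot of the whole dataset repository. The sketch below assumes only the two repository names given in this document; the helper names `repo_id_for_tier` and `download_tier` are hypothetical, and `snapshot_download` is the `huggingface_hub` function for fetching a full repository.

```python
TIERS = ("Beginner", "Experienced")


def repo_id_for_tier(tier: str) -> str:
    """Build the Hugging Face repository ID for a challenge tier."""
    if tier not in TIERS:
        raise ValueError(f"tier must be one of {TIERS}, got {tier!r}")
    return f"RGES-PIT/{tier}"


def download_tier(tier: str) -> str:
    """Download every file in the tier's dataset repo; return the local folder."""
    # Imported here so the helper above stays usable without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_for_tier(tier), repo_type="dataset")


print(repo_id_for_tier("Experienced"))
```

Calling `download_tier("Experienced")` returns the path of the local snapshot folder, which can then be listed to see which files the repository currently contains.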

Git:

git lfs install

# git clone git@hf.co:datasets/<dataset ID>
git clone git@hf.co:datasets/RGES-PIT/Beginner