This repository presents benchmarks for evaluating Bayesian Optimization (BO) methods in high-dimensional, real-world urban mobility settings — specifically, origin-destination (OD) estimation in traffic simulations.
- Simulation backend: SUMO (Simulation of Urban Mobility)
- Task: Estimate OD demand that minimizes the gap between simulated and ground-truth traffic sensor data
- Scope: Includes multiple network scales and four optimization strategies
Benchmark Networks 🛣️
The benchmark includes five traffic network scenarios of increasing complexity:
- 1ramp: A minimal linear freeway with a single on-/off-ramp — ideal for validating model logic.
- 2corridor: A one-way corridor with multiple ramps and OD overlaps — tests flow propagation handling.
- 3junction: A freeway junction with intersecting flows — tests handling of merging, congestion, and flow splits.
- 4smallRegion: A compact region with multiple corridors and junctions — emphasizes spatial interaction.
- 5fullRegion: A full-scale metropolitan freeway system — challenges scalability and generalization.
Optimization Methods 🎯
This benchmark supports the following OD estimation strategies:
- SPSA: Simultaneous Perturbation Stochastic Approximation
- Vanilla BO: Bayesian Optimization with standard GP
- SAASBO: Sparse Axis-Aligned Subspace Bayesian Optimization
- TuRBO: Trust Region Bayesian Optimization
Data repository
The dataset is also included in this repository: https://github.com/UMN-Choi-Lab/BO4Mob_dataset
You can install and run this project using one of the following environments:
🐳 Docker (Recommended)
This is the easiest and most reproducible way to run the benchmark.-
Pull the Docker image from Docker Hub
docker pull choisumn/bo4mob
-
Run the Docker container
docker run -it --name container_name choisumn/bo4mob
Attach to the running container and go to the
/appdirectory. -
Initialize submodules (if needed)
When you run the docker container, SUMO is automatically installed. If the submodule does not work, run the following code:
git submodule init git submodule update
-
Initialize submodules (if needed)
You can verify the default installation path by running:
which sumo
This should return the path
/opt/sumo-1.12/bin/sumo.
🐧 Linux (Manual setup)
Follow this guide if you want to run outside Docker on Ubuntu or similar distributions.- This guide outlines how to set up the environment and build the project on Ubuntu 22.04 with Python 3.10.
-
System update & Python installation
sudo apt update && sudo apt upgrade -y sudo apt install -y python3 python3-pip -
Install required system libraries
sudo apt install -y \ bash \ curl \ vim \ git \ cmake \ g++ \ build-essential \ libxerces-c-dev \ libfox-1.6-dev \ libgdal-dev \ libproj-dev \ libgl2ps-dev \ libsqlite3-dev \ python3-dev -
Clone the project repository
git clone https://github.com/UMN-Choi-Lab/BO4Mob.git cd bo-urbanmobility-test -
Install Python dependencies
pip install --no-cache-dir -r requirements.txt
-
Install and build SUMO
- Initialize submodules:
git submodule init git submodule update
- Build SUMO:
cd sumo/build cmake -DCMAKE_INSTALL_PREFIX=/opt/sumo-1.12 .. make -j$(nproc) sudo make install
- Set environment variables:
echo 'export SUMO_HOME=/opt/sumo-1.12/share/sumo' >> ~/.bashrc echo 'export PATH=$PATH:/opt/sumo-1.12/bin' >> ~/.bashrc source ~/.bashrc
- (Optional) If python Command Is Not Recognized
- If python is not mapped to python3, create the alias manually:
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 1
- ✅ Done
- You can now run and test your project. To verify SUMO installation:
This should return the path
which sumo
/opt/sumo-1.12/bin/sumo.
🪟 Windows (Manual setup)
Follow these steps to set up and run the benchmark on Windows without using Docker.- Check Your Python Version
- Make sure you have Python 3.10 installed.
- To verify your Python version, run:
python --version
- Download SUMO Version 1.12.0
- Visit https://sumo.dlr.de/releases/1.12.0/
- Download either
sumo-win64-1.12.0.msi - During installation:
- Note the installation path (e.g.,
C:\Program Files (x86)\Eclipse\Sumo\) - Make sure to check the option: "Set SUMO_HOME and adapt PATH and PYTHONPATH"
- Note the installation path (e.g.,
-
Clone the repository
git clone https://github.com/UMN-Choi-Lab/BO4Mob.git
-
Install Python dependencies
-
Navigate to the project folder and install required packages:
pip install -r requirements.txt
- Run a code
- You're now ready to run the code within the cloned repository.
Click to expand directory overview
src/: Source code for running simulations and optimization (includes models, simulation logic, and optimization strategies)output/: Simulation and optimization results, including logs, figures, and route filesnetwork/: SUMO network files (net.xml,taz.xml,od.xml, etc.) for each scenariosensor_data/: Ground-truth sensor measurements used for evaluationconfig/: JSON configuration files defining each experiment setupod_for_single_run/: OD vectors for single-run simulations and input templatesvisualization/: Tools for SUMO GUI-based visualization and analysis notebooksrequirements.txt: List of required Python packagesREADME.md: Main documentation for the repository
This project supports two main execution modes:
Single OD Run — Run a simulation using a manually defined OD vector and evaluate the result by comparing it to ground-truth sensor measurements.
This mode runs a simulation using a manually defined OD vector (x) and compares the result with ground-truth measurements. Useful for baseline evaluation and sanity checks.
--network_name: One of["1ramp", "2corridor", "3junction", "4smallRegion", "5fullRegion"]--date: Integer representing the simulation date inyymmddformat (e.g.,221014for October 14, 2022); one of221008-221021--hour: Time window for simulation inHH-HHformat, where the first value is the start hour and the second is the end hour (e.g.,08-09means from 08:00 to 09:00); one of["06-07", "08-09", "17-18"]--eval_measure: (optional) Link-level measurement used to evaluate the simulation against ground-truth sensor data; choose betweencount(default) for vehicle counts, orspeedfor vehicle speeds (mph)--routes_per_od: (optional) Type of routes to use for the simulation; choose betweensingle(default) for one representative route per OD pair, ormultiplefor multiple precomputed routes per OD pair--od_values: (Only for1ramp) Three integer OD values as direct input, e.g.,--od_values 2092 609 386-
The OD values must be given in the following order: (1) taz_0 → taz_1, (2) taz_0 → taz_49, (3) taz_49 -> taz_1
-
The appropriate range for each OD value depends on spatiotemporal characteristics. For weekday morning peak hours, values up to 2500 per OD are recommended for the 1ramp network.
-
--od_csv: CSV file with aflowcolumn containing OD values (e.g.,od_1ramp.csvinod_for_single_run/)--launch_gui: (optional) If provided, launches SUMO GUI after the simulation is completed
-
Check ground-truth sensor data
Confirm that the following file exists:
sensor_data/{date}/gt_link_data_{network_name}_{date}_{hour}.csvThis file is required to evaluate the simulation by comparing it against real sensor flow data.
-
Provide OD input and run the simulation
Depending on the network, you can either input OD values directly or use a CSV file located in
od_for_single_run/.Use the following command structure:
python src/single_od_run.py \ --network_name ${NETWORK_NAME} \ --date ${DATE} \ --hour ${HOUR} \ [--eval_measure ${EVAL_MEASURE}] \ [--routes_per_od ${ROUTES_PER_OD}] \ (--od_values V1 V2 V3) | (--od_csv ${CSV_FILENAME}) \ [--launch_gui]
Example
-
For
1rampwith direct OD values (3 OD pairs):python src/single_od_run.py --network_name 1ramp --date 221014 --hour 08-09 --eval_measure count --routes_per_od multiple --od_values 2092 609 386 --launch_gui
-
For
1rampwith CSV inputpython src/single_od_run.py --network_name 1ramp --date 221014 --hour 08-09 --eval_measure count --routes_per_od multiple --od_csv od_1ramp.csv --launch_gui
-
For other networks with CSV input (
2corridor,3junction, etc.):python src/single_od_run.py --network_name 2corridor --date 221014 --hour 08-09 --eval_measure count --routes_per_od multiple --od_csv od_2corridor.csv --launch_gui
The script automatically looks for the file in
od_for_single_run/, so you only need to provide the file name. -
-
Check the results
Simulation outputs will be saved to the
output/single_od_run/directory, with a folder name determined by your input:-
If you used
--od_values 2092 609 386, the folder will be:output/single_od_run/221014_08-09_count_multiple_od_2092-609-386_values/ -
If you used
--od_csv od_1ramp.csv, the folder will be:output/single_od_run/221014_08-09_count_multiple_od_1ramp_csv/
This folder will contain:
simulation/: SUMO route, OD, and link flow outputsresult/: Evaluation metrics (e.g., NRMSE)figs/: Visualizations (e.g., link flow comparison, OD bar plots)
-
- This mode does not perform optimization — it simply evaluates a fixed OD input through one simulation.
- For large networks, simulation time and memory usage may increase significantly.
- Either --od_values or --od_csv must be provided. Supplying both or neither will result in an error.
Full Optimization — Estimate OD using an optimization algorithm and evaluate performance against sensor data.
This mode runs an initial search followed by model-based optimization (if specified), then compares simulated link flows with ground-truth sensor data to evaluate performance.
--network_name: One of["1ramp", "2corridor", "3junction", "4smallRegion", "5fullRegion"]--model_name: Optimization model to run, one of["initSearch", "spsa", "vanillabo", "saasbo", "turbo"]--kernel: GP kernel type used in the BO model, one of["matern-1p5", "matern-2p5", "rbf", "none"]--date: Integer representing the simulation date inyymmddformat (e.g.,221014for October 14, 2022); one of221008-221021--hour: Time window for simulation inHH-HHformat, where the first value is the start hour and the second is the end hour (e.g.,08-09means from 08:00 to 09:00); one of["06-07", "08-09", "17-18"]--eval_measure: (optional) Link-level measurement used to evaluate the simulation against ground-truth sensor data; choose betweencount(default) for vehicle counts, orspeedfor vehicle speeds (mph)--routes_per_od: (optional) Type of routes to use for the simulation; choose betweensingle(default) for one representative route per OD pair, ormultiplefor multiple precomputed routes per OD pair--seed: Random seed for reproducibility (must be 1- or 2-digit integer)--cpu_max: Number of CPU cores to use for parallel simulation
-
Check ground-truth sensor data
Make sure the sensor data exists in:
sensor_data/{date}/gt_link_data_{network_name}_{date}_{hour}.csvThis is required for computing the evaluation loss (e.g., NRMSE).
-
Run the optimization
Use the following command structure:
python src/full_optimization.py \ --network_name ${NETWORK_NAME} \ --model_name ${MODEL_NAME} \ --kernel ${MODEL_NAME} \ --date ${DATE} \ --hour ${HOUR} \ [--eval_measure ${EVAL_MEASURE}] \ [--routes_per_od ${ROUTES_PER_OD}] \ --seed ${SEED} \ [--cpu_max ${NUM_CORES}]
Example
- Run full optimization with BO:
python src/full_optimization.py --network_name 1ramp --model_name vanillabo --kernel matern-2p5 --date 221014 --hour 08-09 --eval_measure count --routes_per_od multiple --seed 1 - Run only initial search phase (no model optimization):
python src/full_optimization.py --network_name 1ramp --model_name initSearch --kernel none --date 221014 --hour 08-09 --eval_measure count --routes_per_od multiple --seed 1
If
model_nameis set to a model (e.g.,vanillabo), it will:- First run the initial search phase.
- Then proceed with the selected model optimization.
Alternatively, you can run
--model_name initSearchonly if you want to generate initial data without optimization. - Run full optimization with BO:
-
Check the results
Optimization results will be saved under:
output/full_optimization/network_{network_name}_{model_name}_{kernel}_{date}_{hour}_{eval_measure}_{routes_per_od}_seed-{seed}/Inside you'll find:
simulation/: Route, OD, and link flow files across iterationsresult/: Evaluation metrics (e.g., NRMSE, run time)figs/: Convergence plots, link flow comparisons
- The available GP kernels depend on the optimization model:
vanillaboandturbosupport"matern-1p5","matern-2p5", and"rbf";
saasbosupports only"matern-2p5";
spsaandinitSearchuse"none"(no GP kernel). - If the initial search has already been completed for the same seed/config, only the model optimization will run.
- Some large networks (e.g.,
4smallRegion,5fullRegion) may require significant memory and CPU resources. Make sure your machine meets the requirements. - You can limit CPU usage using the
--cpu_maxargument to avoid system overload.
Visualize SUMO Simulation — Replay simulation results in SUMO GUI
Use this script to visually inspect the simulation results using the SUMO GUI. It works for both full optimization and single OD run experiments.
-
--mode: One of["single_od_run", "full_optimization"] -
--network_name: Name of the network scenario (e.g.,1ramp) -
--date: Simulation date inyymmddformat (e.g.,221014) -
--hour: Time window of the experiment (e.g.,08-09) -
--eval_measure: Link-level measurement used for evaluation; choose betweencount(default) for vehicle counts, orspeedfor vehicle speeds (mph) -
--routes_per_od: Type of routes to use for the simulation; choose betweensingle(default) for one representative route per OD pair, ormultiplefor multiple precomputed routes per OD pair -
--overwrite: If set, overwrites the original route file after sorting by vehicle departure time -
--od_input: (required for single_od_run) Folder identifier, e.g.,od_1ramp_csvorod_2092_609_386_valuesOnly for
full_optimizationmode:--model_name: Optimization algorithm name (e.g.,spsa,vanillabo)--kernel: GP kernel type used in the BO model (e.g.,matern-1p5,matern-2p5,rbf)--seed: Integer seed used for the experiment (e.g.,1)--epoch,--batch: Epoch and batch indices of the optimization iteration to visualize
-
Ensure a simulation has already been run
This script does not run simulations — it only visualizes existing ones.
Make sure your simulation output exists underoutput/.-
For
single_od_run, the folder format is:output/single_od_run/{date}_{hour}_{eval_measure}_{routes_per_od}_{od_input}/ -
For
full_optimization, the folder format is:output/full_optimization/network_{network_name}_{model_name}_{kernel}_{date}_{hour}_{eval_measure}_{routes_per_od}_seed-{seed}/
-
-
Run the visualization script
Use the following command structure:
python visualization/sumo_gui_runner.py \ --mode ${MODE} \ --network_name ${NETWORK_NAME} \ --date ${DATE} \ --hour ${HOUR} \ --od_input ${OD_INPUT} \ [--eval_measure ${EVAL_MEASURE}] \ [--routes_per_od ${ROUTES_PER_OD}] \ [--model_name ${MODEL_NAME}] \ [--kernel ${KERNEL}] \ [--seed ${SEED}] \ [--epoch ${EPOCH}] \ [--batch ${BATCH}] \ [--overwrite]Example
-
Single OD Run:
python visualization/sumo_gui_runner.py --mode single_od_run --network_name 1ramp --date 221014 --hour 08-09 --eval_measure count --routes_per_od multiple --od_input od_1ramp_csv
-
Full Optimization:
python visualization/sumo_gui_runner.py --mode full_optimization --network_name 1ramp --model_name vanillabo --kernel matern-2p5 --date 221014 --hour 08-09 --eval_measure count --routes_per_od multiple --seed 1 --epoch 26 --batch 2
-
- The script expects only one matching folder for the given input arguments. If multiple or no matches are found, it will raise an error.
- The simulation must have been run beforehand so that *_routes.vehroutes.xml exists
Visualize Aggregated Results — Analyze and compare multiple optimization runs
This script allows you to generate convergence plots and visualize the fit to ground truth for multiple optimization results.
--network_name: Name of the network scenario (e.g.,1ramp)--kernel: GP kernel type used in the BO model (e.g.,matern-2p5)--date: Simulation date inyymmddformat (e.g.,221014)--hour: Time window of the experiment (e.g.,08-09)--eval_measure: Evaluation measurements; choose betweencountorspeed.--routes_per_od: Type of routes to use for the simulation; choose betweensingleormultiple.--max_epoch: Maximum epoch number to visualize. Must be an integer. If any result folder does not contain data up to this epoch, an error will occur (e.g., 10, 100)
Use the following command structure:
python visualization/results_visualization.py \
--network_name ${NETWORK_NAME} \
--kernel ${KERNEL} \
--date ${DATE} \
--hour ${HOUR} \
--eval_measure ${EVAL_MEASURE} \
--routes_per_od ${ROUTES_PER_OD} \
--max_epoch ${MAX_EPOCH}Example:
python visualization/results_visualization.py --network_name 1ramp --kernel matern-2p5 --date 221014 --hour 08-09 --eval_measure count --routes_per_od multiple --max_epoch 50- The generated plots are in the
visualization/figuresdirectory. - Executing this code will generate the following figure.
Add New Traffic Network
This repository provides both the network file and the traffic sensor dataset matched to that network. If you would like to apply your own network and sensor data, this guide explains which parts to modify and how to prepare compatible inputs.
We used the San Francisco network from the research dataset here.
From this file, we extracted the San Jose area using netedit.
Each link(edge) in the SUMO .xml file has a unique ID.
-
(1) Download the data
From Caltrans PeMS:- Select “Station 5-Minute” and “District 4”
- Click the files listed under “Available Files” to download
Make a folder named "raw_data" and make folders by month ex) 01, 02, 03, ...
Put the downloaded files in the folders
Run this script to convert 5-minute data into hourly data:
python process.py
-
(2) Check data fields
The dataset should include statistics by network type (e.g., ramp, corridor), date, and hourly time intervals (e.g., total_flow, avg_speed).
After completing Step 3, return to this dataset to filter the sensors you want to use.
The raw sensor data does not include information on which SUMO network link each sensor corresponds to. The goal is to match each sensor ID in the PeMS dataset to the corresponding SUMO network link.
-
(1) For each sensor’s
(lat, lon), find the 10 nearest edges in the SUMO network. -
(2) Among these, identify matches where the following metadata are consistent:
- traffic direction (Dir)
- number of lanes (Lanes)
- highway name (Fwy)
-
(3) Once matched, create a dataset containing
link_id,vehicle_count, andmean_speed.vehicle_countcorresponds to Total Flow, andmean_speedcorresponds to Avg Speed in the PeMS data.
Example:
sensor_data/221014/gt_link_data_1ramp_221014_08-09.csv
If you already have a .xml or .net.xml network file — you can use it directly.
If not, you can generate one from OpenStreetMap (OSM):
-
(1) Export OSM file
Go to OpenStreetMap, zoom into your target area, and Export the map as an.osmfile. -
(2) Convert to SUMO format
Required to download SUMOnetconvert --osm-files export.osm -o my_network.xml
-
(3) You now have a SUMO-compatible network file.
Prepare vehicle count (flow) data that can be matched to your network links.
-
If you already have sensor data that aligns with your SUMO network — you’re good to go.
-
Otherwise, ensure your dataset includes at least: Latitude / Longitude (Optionally) road type, number of lanes, traffic direction
These fields will help accurately map your sensors to the SUMO network edges.
Please make sure your data format follows the same structure as this example file:
- Network example:
network/{my_network}/net.xml - Sensor data example:
sensor_data/{date}/gt_link_data_{my_network}_{date}_{hour}.csv
Add New BO Method
To add a custom BO method, follow the three main steps below:
Choose a unique name (e.g., mybo) for your BO method.
This name will be passed to the --model_name argument when running the benchmark, just like other existing methods such as vanillabo, turbo, or turbo.
To define a new Bayesian Optimization method, you’ll need to implement three components:
-
(1) Create the optimization strategy file
Add a new Python file to the optimizer module:
src/optimizers/mybo.pyThis file must define:
- A Strategy class (e.g., MyBOStrategy) that inherits from BaseStrategy
- An optimize_acqf_and_create_candidate(...) function that generates candidate points
Reference:
vanillabo.py,saasbo.py,turbo.py -
(2) Register GP initialization logic
If your method uses a GP model, you need to define how the GP is initialized by editing:
src/models/gp_models.py- Add a function like
initialize_mybo_model() - Then update the dispatcher function
initialize_model()
Reference:
initialize_vanillabo_model(),initialize_saasbo_model(), andinitialize_turbo_model() - Add a function like
-
(3) Define model-specific hyperparameters
Declare method-specific parameters inside:
src/utils/params.pyInside
get_params(...), add an entry under themodel_specificdictionary:"mybo": lambda: { "bo_batch_size": config["bo_batch_size"], "bo_num_restarts": config["bo_num_restarts"], "bo_raw_samples": config["bo_raw_samples"], ... }
Your method now needs to be registered in the main benchmark pipeline.
-
(1) Register the strategy
Open:
src/optimizers/strategy_registry.pyAdd an import and map your method name to the new strategy class:
from optimizers.mybo import MyBOStrategy strategy_registery = { ... "mybo": MyBOStrategy, } -
(2) Enable command-line interface
Open:
src/full_optimization.pyUpdate the argument parser for
--model_nameto include your new method:parser.add_argument( "--model_name", type=str, default="spsa", choices=["initSearch", "spsa", "vanillabo", "saasbo", "turbo", "mybo"], )This makes your strategy callable via the CLI:
python src/full_optimization.py --model_name mybo ...
Add New GP Kernel
To add a new GP kernel, follow the three main steps below:
Choose a unique name (e.g., mykernel) for new GP kernel.
Open:
src/models/gp_models.py
Open:
src/models/gp_models.py
Add your new kernel in the function set_covar_module**
Reference: See matern-1p5, matern-2p5, rbf in set_covar_module()
Open:
src/full_optimization.py
Update the argument parser for --kernel to include your new kernel:
parser.add_argument(
"--kernel",
type=str,
default="matern-2p5",
choices=["matern-1p5", "matern-2p5", "rbf", "mykernel", "none"])
This makes your kernel callable via the CLI:
python src/full_optimization.py --kernel mykernel ...





