Goodarz Mehr, Azim Eskandarian
Virginia Commonwealth University
SimBEV.mp4
[2026/1/15] SimBEV2X coming soon...
[2026/1/15] SimBEV 3.1 is released, adding support for 3D semantic occupancy ground truth.
[2025/12/12] SimBEV 3.0 is released, with support for new 3D and BEV classes, randomly-generated hazard areas, an interactive visualizer, and more. SimBEV Dataset v2 coming soon...
[2025/8/15] SimBEV 2.0 is released, with support for new 3D and BEV classes, continuous weather shifts, and more.
[2025/4/15] Our implementation of UniTR trained on the SimBEV dataset is released.
[2025/2/9] Our implementation of BEVFusion trained on the SimBEV dataset is released.
[2025/2/6] Initial release of dataset, code, and paper.
SimBEV is a configurable and scalable synthetic driving data generation tool based on the CARLA Simulator. It supports a comprehensive array of sensors and incorporates information from various sources to capture accurate bird's-eye view (BEV) and 3D semantic occupancy ground truth alongside 3D object bounding boxes to enable a variety of perception tasks, including BEV segmentation, 3D semantic occupancy prediction, and 3D object detection. SimBEV is used to create the SimBEV dataset, a large collection of annotated perception data from diverse driving scenarios.
A data sample generated by SimBEV. The left half depicts a 360-degree view of the ego (data collection) vehicle's surroundings produced by different camera types (from top to bottom: RGB, semantic segmentation, instance segmentation, depth, and optical flow cameras). On the right half, views of lidar, semantic lidar, radar, and BEV ground truth data are shown from top to bottom. Some images also contain 3D object bounding boxes.
SimBEV randomizes a variety of simulation parameters to create a diverse set of scenarios. To create a new dataset, SimBEV generates and collects data from consecutive episodes, or scenes. The user configures the desired number of scenes for each map for the training, validation, and test sets, a variety of simulation parameters, and the sensors that should be used. The user can add more scenes to an existing SimBEV dataset, replace individual scenes, or replay individual scenes to collect additional data. SimBEV works with any CARLA map, including custom maps created by the user.
SimBEV currently supports five camera types (RGB, semantic segmentation, instance segmentation, depth, and optical flow), lidar, semantic lidar, radar, GNSS, IMU, and a custom voxel detection sensor inspired by Co3SOP. The user has full control over each sensor's characteristics (e.g. camera resolution or number of lidar channels), but the placement of the sensors is fixed for now. In addition to sensor data that can be used as ground truth (e.g. semantic segmentation and depth images, semantic lidar point cloud, etc.), SimBEV currently offers three annotation types: 3D object bounding boxes, BEV ground truth, and HD map information.
SimBEV currently produces 3D object bounding boxes for the following 10 classes: car, truck, bus, motorcycle, bicycle, pedestrian, traffic light, traffic sign, traffic cone, and barrier. For each class, the bounding boxes are categorized as easy, medium, or hard based on detection difficulty. Moreover, SimBEV currently supports the following 14 BEV ground truth classes: road, hazard area, road line, sidewalk, crosswalk, traffic cone, barrier, car, truck, bus, motorcycle, bicycle, rider, pedestrian.
The SimBEV dataset (collected using SimBEV 1.0) is a collection of 320 scenes spread across 11 CARLA maps and contains data from all supported sensors. With each scene lasting 16 seconds at a frame rate of 20 Hz, the SimBEV dataset contains 102,400 annotated frames, 8,315,935 3D object bounding boxes (3,792,499 of which are valid, i.e., not fully occluded and visible to the sensors), and 2,793,491,357 BEV ground truth labels.
We developed and tested SimBEV on a system with the following specifications:
- AMD Ryzen 9 9950X (any 9th Gen or newer Intel or 3rd Gen or newer Ryzen 7/9 CPU will probably work)
- 96 GB RAM (32 GB is probably enough)
- Nvidia GeForce RTX 4090
- Ubuntu 22.04
To run SimBEV, your system must satisfy CARLA 0.9.16's minimum system requirements.
To run SimBEV, you must use our custom version of CARLA (built from source from this fork of the ue4-dev branch). Please download it from here.
We have not tested SimBEV with the standard version of CARLA 0.9.16 or CARLA 0.10.0 and advise against using them with SimBEV. The standard CARLA 0.9.16 lacks the additions listed below that SimBEV relies on, making it effectively incompatible, and, while CARLA 0.10.0 offers superior graphics, it lacks some features of the UE4-based CARLA that SimBEV depends on (e.g. customizable weather, large maps, etc.). We will make SimBEV available for CARLA 0.10.* when it reaches feature parity with the UE4-based CARLA.
Some of the enhancements in our version are:
- Addition of three new sports cars to CARLA's vehicle library using existing 3D models: sixth generation Ford Mustang, Toyota GR Supra, and Bugatti Chiron. The Ford Mustang is SimBEV's default data collection vehicle.
NewCars.mp4
- Addition of lights (headlights, taillights, blinkers, etc.) to older vehicle models in CARLA's library that lacked them, and a redesign of existing vehicle lights in Blender using a new multi-layer approach that better visualizes modern multi-purpose lights.
- Addition of a set of 160 standard paint colors for most vehicle models (apart from a few like the firetruck) to choose from, and fixing paint color randomization issues for a few vehicles (e.g. the bus).
- Updates to the vehicle dynamics parameters of vehicle models to better match each vehicle's real-world behavior and performance.
- Addition of, or updates to, pedestrian navigation information for CARLA's Town12, Town13, and Town15 maps.
- Update to motorcycle and bicycle models to select their driver model randomly, instead of always using the same model.
- Addition of lights to buildings in Town12 and fixing issues that prevented full control over building/street lights in Town12 and Town15.
- Update to the crosswalk information in the OpenDRIVE map files of Town12, Town13, and Town15.
- Improvements to CARLA's Traffic Manager, including enhancements to the lane changing behavior of vehicles on autopilot and their reaction to static props (street barriers, traffic cones, etc.).
- Enhancements to the collision mesh of vehicle and pedestrian models that should result in a more realistic depiction of them in point cloud data (see a sample comparison between the old (left) and new (right) models below).
- Addition of a custom voxel detection sensor that assigns a semantic class to every occupied voxel within a specified grid around the ego vehicle.
VoxelDetector.webm
- Several bug fixes and improvements, some of which have been contributed to the main CARLA repository as well (see e.g. PR #9381, #9421, #9422, #9423, #9427, and #9471).
We recommend using SimBEV with Docker. The base Docker image is Ubuntu 22.04 with CUDA 13.0.2 and Vulkan SDK 1.3.204. If you want to use a different base image, you may have to modify ubuntu2204/x86_64 when fetching keys on line 61 of the Dockerfile, based on your Ubuntu release and system architecture. Ensure that the libnvidia-gl and libnvidia-common version numbers on line 65 of the Dockerfile match your Nvidia driver version.
- Install Docker on your system.
- Install the Nvidia Container Toolkit. It exposes your Nvidia graphics card to Docker containers.
- Clone this repository:
git clone https://github.com/GoodarzMehr/SimBEV.git && cd SimBEV
- Build the SimBEV Docker image (this will take several minutes):
docker build --no-cache --rm --build-arg ARG -t simbev:develop .

The following optional build arguments (ARG) are available:

- USER: username inside each container, set to sb by default.
- CARLA_VERSION: installed CARLA version, set to 0.9.16 by default.
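For instance,

docker build --no-cache --rm --build-arg CARLA_VERSION=0.9.16 -t simbev:develop .

builds the image with the CARLA version passed explicitly (here set to its default value).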
- Launch a container:
docker run --runtime=nvidia --privileged --gpus all --network=host -e DISPLAY=$DISPLAY \
  -v [path/to/CARLA]:/home/carla \
  -v [path/to/SimBEV]:/home/simbev \
  -v [path/to/dataset]:/dataset \
  --shm-size 32g -it simbev:develop /bin/bash

Use nvidia-smi to ensure your graphics card(s) is (are) visible inside the container. Use vulkaninfo --summary to ensure Vulkan has access to your graphics card(s).

- Install CARLA inside the container by running:

pip install carla/PythonAPI/carla/dist/carla-0.9.16-cp310-cp310-linux_x86_64.whl

- In a separate terminal window, enter the container as the root user by running docker exec -it -u 0 [container name] /bin/bash. Then, run:

cd simbev && python setup.py develop

Exit the container as the root user but stay inside it as the sb (non-root) user.
If you would like to use SimBEV without Docker, you can install the dependencies using the requirements file and then follow steps 6 and 7 above.
In the simbev directory, use the config.yaml file to configure SimBEV's behavior (for a detailed explanation of available parameters see the sample_config.yaml file). Set mode in the config.yaml file to create to create a new SimBEV dataset. If a SimBEV dataset already exists (in the path provided by path), SimBEV compares the number of existing and desired scenes for each map and creates additional ones if necessary. This feature can be used to continue creating a dataset in the event of a crash or expand an already existing one. Now, run
simbev configs/config.yaml [options]

options can be any of the following:

- --path: path for saving the dataset (/dataset by default).
- --render: visualize captured sensor data.
- --save: save captured sensor data (used by default).
- --no-save: do not save captured sensor data.

For instance,

simbev configs/config.yaml --render --no-save

visualizes sensor data as it is being captured without saving it.
You can pause/resume the simulation at any time by pressing F9.
If you would like to replace a number of existing scenes, set mode in the config.yaml file to replace and specify the list of scenes that should be replaced using the replacement_scene_config field.
If you would like to replay/augment a number of existing scenes, set mode in the config.yaml file to replay and specify the list of scenes that should be replayed using the replay_scene_config field. SimBEV will use the saved CARLA log file of the specified scenes to replay them. This can be useful if you want to collect additional data from a scene. For example, if you have already collected RGB camera data and would like to collect semantic lidar and radar data when replaying the scene, set the use_rgb_camera field in the config.yaml file to False and set use_semantic_lidar and use_radar to True. Note that because UE4 selects the rider model for motorcycles and bicycles at random each time, riders may differ when a scene is replayed; this is usually a very small discrepancy, and everything else in the replayed scene should exactly match the original.
An optional post-processing step calculates the number of lidar and radar points inside each 3D object bounding box (0 for all objects if that data is not collected), alongside a valid flag indicating whether the object is fully occluded (False) or visible to the data collection vehicle (True). By default, an object is valid if the number of points inside its bounding box is non-zero and invalid otherwise. However, if you have collected instance segmentation images, you can use the --use-seg argument to have those images assist in determining the validity of objects (if the number of points inside the object's bounding box is zero but the object is visible in the images, it is still considered valid). The post-processing step also determines the detection difficulty of an object (easy, medium, or hard) based on the object's class, its distance to the data collection vehicle, and the number of points inside its bounding box. This information is appended to the bounding box data. Finally, if you have collected 3D semantic occupancy data, the post-processing step fills in the semantic labels of voxels inside objects, since in many cases the raw voxels represent only the surface shell of objects.
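The default validity rule can be summarized with the following minimal sketch (illustrative only, not SimBEV's actual implementation; the helper name and the visible_in_instance_seg flag are hypothetical, while num_lidar_pts and num_radar_pts follow the bounding box fields described under Data Format):

```python
# Minimal sketch of the default validity rule described above (not SimBEV's actual code).
# num_lidar_pts/num_radar_pts follow the det ground truth fields; visible_in_instance_seg
# is a hypothetical flag derived from the instance segmentation images when --use-seg is set.
def is_valid(num_lidar_pts: int, num_radar_pts: int,
             use_seg: bool = False, visible_in_instance_seg: bool = False) -> bool:
    # Default rule: an object is valid if any lidar or radar point falls inside its box.
    if num_lidar_pts + num_radar_pts > 0:
        return True
    # With --use-seg, an object with no points can still be valid if it is visible
    # in the instance segmentation images.
    return use_seg and visible_in_instance_seg
```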
To post-process the data, in the simbev directory run
simbev-postprocess [options]

options can be any of the following:

- --path: path for saving the dataset (/dataset by default).
- --process-bbox: post-process 3D object bounding boxes (used by default).
- --no-process-bbox: do not post-process 3D object bounding boxes.
- --use-seg: use instance segmentation images to help with post-processing 3D object bounding boxes.
- --fill-voxels: post-process 3D semantic occupancy data.
- --morph-kernel-size: kernel size used for morphological closing (3 by default).
- --num-gpus: number of GPUs used for post-processing 3D semantic occupancy data (-1, i.e. all available GPUs, by default).
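For instance,

simbev-postprocess --use-seg --fill-voxels

post-processes 3D object bounding boxes using instance segmentation images to help determine object validity and fills in the 3D semantic occupancy data.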
The post-processing step will create a new det folder under ground-truth (see Data Format for more information) and move the files of the original det folder to a new old_det folder.
To visualize certain types of collected data (those that are not readily viewable; semantic segmentation images, for example, are already saved as .png files), run

simbev-visualize [mode] [options]

Setting mode to interactive launches SimBEV's interactive visualizer for point cloud (lidar, semantic lidar, radar) and voxel data, allowing the user to evaluate and inspect each scene and frame, as shown below:
SimBEVInteractiveViz.mp4
For all other modes, a new viz folder in the dataset's path is created where the visualizations are stored. Visualizations involving 3D object bounding boxes require data to be post-processed first.
mode can be all, or any combination of the following:
- rgb: RGB images with 3D object bounding boxes overlaid.
- depth: depth images.
- flow: optical flow images.
- lidar, lidar-with-bbox: top-down view of lidar point clouds, without and with 3D object bounding boxes overlaid, respectively.
- lidar3d, lidar3d-with-bbox: 3D view of lidar point clouds, without and with 3D object bounding boxes overlaid, respectively.
- semantic-lidar, semantic-lidar3D: top-down and 3D views of semantic lidar point clouds, respectively.
- radar, radar-with-bbox: top-down view of radar point clouds, without and with 3D object bounding boxes overlaid, respectively.
- radar3d, radar3d-with-bbox: 3D view of radar point clouds, without and with 3D object bounding boxes overlaid, respectively.
Visualization modes involving point clouds have two default views, NEAR and FAR, as defined in the visualization_handlers file, where you can also define your custom view if needed.
options can be any of the following:
- --path: path to the dataset (/dataset by default).
- -s, --scene: list of scene numbers to visualize; can be individual numbers or a range (-1, i.e. all scenes, by default).
- -f, --frame: list of frame numbers to visualize; can be individual numbers or a range (-1, i.e. all frames, by default).
- --ignore-valid-flag: display all 3D bounding boxes regardless of the value of their valid flag.
For instance, using

simbev-visualize rgb depth lidar3d semantic-lidar radar-with-bbox --scene 0 12 27-32 --frame 3 30-49 300

visualizes RGB images with 3D bounding boxes overlaid, depth images, lidar point clouds from a 3D perspective view, semantic lidar point clouds from a top-down view, and radar point clouds from a top-down view with 3D bounding boxes overlaid for frames 3, 30 to 49, and 300 of scenes 0, 12, and 27 to 32.
Consult our implementations of BEVFusion and UniTR for how to use the SimBEV dataset.
The placement and coordinate system of the sensors are shown on the left and tabulated on the right. Coordinate values are relative to a FLU (Front-Left-Up) coordinate system positioned at the center of the ground plane of the vehicle's 3D bounding box.
Sensors in SimBEV are referenced using the {subtype}-{position} format (which turns into {position} when subtype is not available). For cameras, subtype can be one of RGB (RGB camera), SEG (semantic segmentation camera), IST (instance segmentation camera), DPT (depth camera), or FLW (optical flow camera), while position can be one of CAM_FRONT_LEFT, CAM_FRONT, CAM_FRONT_RIGHT, CAM_BACK_RIGHT, CAM_BACK, CAM_BACK_LEFT. For instance, DPT-CAM_BACK_LEFT denotes the back left depth camera. For lidar, since there is only one position, regular lidar is denoted by LIDAR while semantic lidar is denoted by SEG-LIDAR. For radar, subtype is not available and position can be one of RAD_LEFT, RAD_FRONT, RAD_RIGHT, RAD_BACK. GNSS and IMU are simply denoted as GNSS and IMU, respectively. The voxel detector is denoted as VOXEL-GRID, and the post-processed 3D semantic occupancy data is denoted as VOXEL-GRID-FILLED.
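Combining this naming convention with the file naming scheme described for the sweeps folder below, a sensor file path can be assembled as in the following sketch (an illustrative helper, not part of SimBEV's API; the 4-digit zero-padding follows the RGB-CAM_BACK_LEFT example given there):

```python
from pathlib import Path

# Illustrative helper (not part of SimBEV's API) that builds a sweep file path from the
# {subtype}-{position} sensor name and the naming scheme described under Data Format.
def sweep_path(dataset_root: str, sensor: str, scene: int, frame: int, ext: str) -> Path:
    name = f"SimBEV-scene-{scene:04d}-frame-{frame:04d}-{sensor}.{ext}"
    return Path(dataset_root) / "sweeps" / sensor / name

# e.g. the back left depth camera image of frame 12 of scene 27:
print(sweep_path("/dataset", "DPT-CAM_BACK_LEFT", 27, 12, "png"))
```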
A generic SimBEV dataset uses the following folder structure.
simbev/
|
├── configs/
|
├── console_logs/
|
├── ground-truth/
| ├── det/
| ├── old_det/ (if 3D object bounding boxes are post-processed)
| ├── seg/
| ├── seg_viz/
| ├── hd_map/
|
├── infos/
| ├── simbev_infos_train.json
| ├── simbev_infos_val.json
| ├── simbev_infos_test.json
|
├── logs/
|
├── sweeps/
| ├── RGB-CAM_FRONT_LEFT/
| ├── RGB-CAM_FRONT/
| ├── RGB-CAM_FRONT_RIGHT/
| ├── RGB-CAM_BACK_LEFT/
| ├── RGB-CAM_BACK/
| ├── RGB-CAM_BACK_RIGHT/
| ├── SEG-CAM_FRONT_LEFT/
| ├── SEG-CAM_FRONT/
| ├── SEG-CAM_FRONT_RIGHT/
| ├── SEG-CAM_BACK_LEFT/
| ├── SEG-CAM_BACK/
| ├── SEG-CAM_BACK_RIGHT/
| ├── IST-CAM_FRONT_LEFT/
| ├── IST-CAM_FRONT/
| ├── IST-CAM_FRONT_RIGHT/
| ├── IST-CAM_BACK_LEFT/
| ├── IST-CAM_BACK/
| ├── IST-CAM_BACK_RIGHT/
| ├── DPT-CAM_FRONT_LEFT/
| ├── DPT-CAM_FRONT/
| ├── DPT-CAM_FRONT_RIGHT/
| ├── DPT-CAM_BACK_LEFT/
| ├── DPT-CAM_BACK/
| ├── DPT-CAM_BACK_RIGHT/
| ├── FLW-CAM_FRONT_LEFT/
| ├── FLW-CAM_FRONT/
| ├── FLW-CAM_FRONT_RIGHT/
| ├── FLW-CAM_BACK_LEFT/
| ├── FLW-CAM_BACK/
| ├── FLW-CAM_BACK_RIGHT/
| ├── LIDAR/
| ├── SEG-LIDAR/
| ├── RAD_LEFT/
| ├── RAD_FRONT/
| ├── RAD_RIGHT/
| ├── RAD_BACK/
| ├── GNSS/
| ├── IMU/
| ├── VOXEL-GRID/
| ├── VOXEL-GRID-FILLED/ (if semantic occupancy data is post-processed)
|
├── viz/ (if data is visualized)
The configs folder contains the config file used for each scene, with the files using the SimBEV-scene-{scene number}.yaml naming scheme. The files are usually identical, unless the dataset was expanded or some scenes were replaced or augmented using a different configuration. If an existing scene is augmented, the new config file uses the SimBEV-scene-{scene number}-augment-{i}.yaml naming scheme, where i is the index of the augmentation attempt (i.e. i is 0 for the first attempt, 1 for the second, etc.).
The console_logs folder contains the logging output written to the console/terminal.
The ground-truth folder contains the ground truth files for each frame, with the files using the SimBEV-scene-{scene number}-frame-{frame number}-{type}.{data type} naming scheme. For the det, seg, seg_viz, and hd_map folders, type and data type are GT_DET and bin; GT_SEG and npz; GT_SEG_VIZ and jpg; and HD_MAP and json, respectively.
The det folder contains the 3D object ground truth files for each frame. In each file, the following information is provided for each object:
- id: object ID supplied by CARLA
- type: object type, e.g. vehicle.ford.mustang_2016 or walker.pedestrian.0051
- is_alive: True if the object is alive, False if destroyed
- is_active: True if the object is active, False otherwise
- is_dormant: True if the object is dormant, False otherwise
- parent: ID of the parent object if one exists, None otherwise
- attributes: object attributes, e.g. has_lights, color, role_name, etc. for a car
- semantic_tags: object semantic tags
- bounding_box: global coordinates of the corners of the object's 3D bounding box
- location: location ($x$, $y$, $z$) of the object (in a right-handed coordinate frame)
- rotation: rotation (roll, pitch, yaw) of the object (in a right-handed coordinate frame)
- linear_velocity: linear velocity of the object (m/s)
- angular_velocity: angular velocity of the object (deg/s)
- distance_to_ego: distance of the object from the data collection vehicle (m)
- angle_to_ego: angle of the object to the data collection vehicle (deg, vehicle's front vector is 0, positive CCW)
- [requires post-processing] num_lidar_pts: number of lidar points inside the object's 3D bounding box
- [requires post-processing] num_radar_pts: number of radar points inside the object's 3D bounding box
- [requires post-processing] valid_flag: True if the object is visible to the data collection vehicle, False otherwise
- [requires post-processing] class: class of the object
- [requires post-processing] difficulty: detection difficulty of the object, can be easy, medium, or hard
- [traffic light only] green_time: duration the traffic light stays green (s)
- [traffic light only] yellow_time: duration the traffic light stays yellow (s)
- [traffic light only] red_time: duration the traffic light stays red (s)
- [traffic light only] state: current state of the traffic light (i.e. green, yellow, or red)
- [traffic light only] opendrive_id: OpenDRIVE ID of the traffic light
- [traffic light only] pole_index: index of the traffic light's pole within the traffic light group
- [traffic sign only] sign_type: traffic sign's type, if it can be extracted from CARLA; generally stop, yield, or speed_limit; in Town12, Town13, and Town15 the speed limit is provided as well, e.g. speed_limit_30 (30 km/h speed limit) or speed_limit_55_min_40 (55 km/h speed limit, 40 km/h minimum speed limit)
The seg folder contains the BEV ground truth files for each frame. BEV ground truth is a binary array with one channel per class, in the following order: road, hazard, road_line, sidewalk, crosswalk, traffic_cone, barrier, car, truck, bus, motorcycle, bicycle, rider, pedestrian. The second and third dimensions of the array are the spatial dimensions of the BEV grid.
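As an illustration, the BEV ground truth for a frame could be loaded and one class channel inspected as follows (a minimal sketch; the path follows the naming scheme above, and the key used to index the .npz archive is not assumed, so the sketch reads whatever array name the file reports):

```python
import numpy as np

# Sketch: load the BEV ground truth for frame 12 of scene 27. The class order follows
# the list above (channel 0 = road, ..., channel 13 = pedestrian).
data = np.load("/dataset/ground-truth/seg/SimBEV-scene-0027-frame-0012-GT_SEG.npz")
print(data.files)                 # inspect the stored array name(s)
bev = data[data.files[0]]         # binary array with one channel per BEV class
road_mask = bev[0].astype(bool)   # channel 0: road
print(bev.shape, road_mask.sum(), "road cells")
```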
The seg_viz folder contains the visualization of the BEV ground truth for each frame.
The hd_map folder contains information about the waypoint at the ego vehicle's location for each frame, which, when combined with the CARLA map's OpenDRIVE file data, should provide accurate map information about the area around the ego vehicle. The following information is provided for each waypoint:

- id: waypoint ID supplied by CARLA
- s: distance along the road section
- road_id: OpenDRIVE ID of the road the waypoint belongs to
- section_id: OpenDRIVE ID of the road section the waypoint belongs to
- lane_id: OpenDRIVE ID of the lane the waypoint belongs to
- lane_type: type of the lane the waypoint belongs to; should be Driving, but other possible values include Sidewalk, Shoulder, Curb, etc.
- lane_width: width of the lane the waypoint belongs to
- lane_change: type of lane change permitted by the lane
- is_junction: whether the waypoint is in a junction
- junction_id: OpenDRIVE ID of the junction if the waypoint is in a junction
- is_intersection: whether the waypoint is in an intersection
- transform: global coordinate transform (location, rotation) of the waypoint
- left/right_lane_marking: information about the left/right lane markings, including type (e.g. Solid, Broken, SolidBroken, etc.), width, color, and lane_change
- left/right_lane: information about the corresponding waypoint in the left/right lane, including id, s, road_id, section_id, lane_id, lane_type, lane_width, and lane_change
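For example, the waypoint information for a frame might be consumed as follows (a minimal sketch, assuming the JSON keys match the field names listed above):

```python
import json

# Sketch: read the ego waypoint information for frame 12 of scene 27 and print basic lane info.
# Assumes the JSON keys match the field names listed above.
with open("/dataset/ground-truth/hd_map/SimBEV-scene-0027-frame-0012-HD_MAP.json") as f:
    waypoint = json.load(f)

print("road", waypoint["road_id"], "lane", waypoint["lane_id"], waypoint["lane_type"])
print("lane width:", waypoint["lane_width"], "in junction:", waypoint["is_junction"])
```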
The infos folder contains the info files for each data split, with the files using the simbev_infos_{split}.json naming scheme, where split is either train, val, or test. Each file consists of metadata and data. metadata contains coordinate transformation matrices for all sensors (i.e. sensor2lidar_translation, sensor2lidar_rotation, sensor2ego_translation, and sensor2ego_rotation), as well as the camera intrinsics matrix. data contains scene information, divided into scene_info and scene_data for each scene. scene_info includes the overall scene information, while scene_data provides information about individual frames, including file paths for collected sensor data and the corresponding ground truth.
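A rough sketch of how an info file might be read (assuming the nesting described above; whether data is stored as a list or as a dict keyed by scene is not specified here, so the sketch handles both):

```python
import json

# Sketch: iterate over the training split's scenes using the structure described above.
with open("/dataset/infos/simbev_infos_train.json") as f:
    infos = json.load(f)

print(infos["metadata"].keys())   # sensor transforms and camera intrinsics

scenes = infos["data"]
# "data" holds per-scene entries; adapt the iteration to the actual on-disk structure.
entries = scenes.values() if isinstance(scenes, dict) else scenes
for scene in entries:
    print(scene["scene_info"])    # overall scene information
    # scene["scene_data"] provides per-frame file paths for sensor data and ground truth
```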
The logs folder contains the CARLA log file for each scene, with the files using the SimBEV-scene-{scene number}.log naming scheme. Log files can be used by SimBEV to replay scenes and collect additional data.
The sweeps folder contains collected sensor data for each frame, with the files using the {sensor}/SimBEV-scene-{scene number}-frame-{frame number}-{sensor}.{type} naming scheme. For instance, the back left RGB camera image for frame 12 of scene 27 is saved as RGB-CAM_BACK_LEFT/SimBEV-scene-0027-frame-0012-RGB-CAM_BACK_LEFT.jpg. We briefly discuss how each sensor's data is saved below. See CARLA's sensors documentation for more details.
- RGB camera: images are saved as .jpg files.
- Semantic segmentation camera: images are saved as .png files.
- Instance segmentation camera: images are saved as .png files.
- Depth camera: images are saved as .png files.
- Optical flow camera: images are saved as a $(h, w, 2)$ NumPy array where $h$ and $w$ are the image height and width, respectively.
- Lidar: point clouds are saved as a $(n, 3)$ NumPy array where the columns represent the $x$, $y$, and $z$ values, respectively.
- Semantic lidar: point clouds are saved as a $(n, 6)$ NumPy array where the columns represent the $x$, $y$, and $z$ values, the cosine of the incidence angle, and the index and semantic tag of the hit object, respectively.
- Radar: point clouds are saved as a $(n, 4)$ NumPy array where the columns represent the depth, altitude angle, azimuth angle, and velocity, respectively.
- GNSS: data is saved as a [latitude, longitude, altitude] NumPy array.
- IMU: data is saved as a [$\dot{x}$, $\dot{y}$, $\dot{z}$, $\dot{\phi}$, $\dot{\theta}$, $\dot{\psi}$, $\psi$] NumPy array.
- Voxel detector: data is saved as a $(d, w, h)$ NumPy array where the dimensions represent the $x$, $y$, and $z$ directions of the vehicle's FLU coordinate system, respectively. Each cell contains the semantic (class) label of the object that overlaps with that cell, unless the cell is unoccupied, in which case its value is 0.
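As an example of working with the point cloud arrays described above, the following sketch loads a lidar sweep and converts a radar sweep from its (depth, altitude, azimuth, velocity) representation to Cartesian coordinates (illustrative only; the .npy extension and the exact angle conventions are assumptions, so consult CARLA's sensor documentation for the precise definitions):

```python
import numpy as np

# Sketch: load one lidar and one radar sweep for frame 12 of scene 27.
# The .npy extension is an assumption; adjust to the actual file type on disk.
lidar = np.load("/dataset/sweeps/LIDAR/SimBEV-scene-0027-frame-0012-LIDAR.npy")          # (n, 3): x, y, z
radar = np.load("/dataset/sweeps/RAD_FRONT/SimBEV-scene-0027-frame-0012-RAD_FRONT.npy")  # (n, 4)

depth, altitude, azimuth, velocity = radar.T

# Standard spherical-to-Cartesian conversion of the radar returns in the sensor frame
# (angle conventions assumed).
x = depth * np.cos(altitude) * np.cos(azimuth)
y = depth * np.cos(altitude) * np.sin(azimuth)
z = depth * np.sin(altitude)
radar_xyz = np.stack([x, y, z], axis=1)

print(lidar.shape, radar_xyz.shape)
```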
Models are trained on the SimBEV dataset's train set and evaluated on its test set with the hyperparameters their authors used for the nuScenes dataset.
3D object detection results:

| Model | Modality | mAP (%) | mATE (m) | mAOE (rad) | mASE | mAVE (m/s) | SDS (%) | Checkpoint |
|---|---|---|---|---|---|---|---|---|
| BEVFusion-C | C | 22.1 | 0.744 | 1.04 | 0.137 | 4.65 | 25.1 | Checkpoint |
| BEVFusion-L | L | 48.1 | 0.144 | 0.133 | 0.134 | 1.56 | 56.4 | Checkpoint |
| BEVFusion | C+L | 48.1 | 0.146 | 0.122 | 0.127 | 1.54 | 56.6 | Checkpoint |
| UniTR | C+L | 47.7 | 0.113 | 0.224 | 0.090 | 0.55 | 61.7 | Checkpoint |
| UniTR+LSS | C+L | 47.8 | 0.113 | 0.207 | 0.085 | 0.53 | 62.2 | Checkpoint |
BEV segmentation results (per-class IoU and mIoU):

| Model | Modality | Road | Car | Truck | Bus | Motorcycle | Bicycle | Rider | Pedestrian | mIoU | Checkpoint |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BEVFusion-C | C | 76.0 | 17.2 | 5.1 | 22.9 | 0.0 | 0.0 | 0.0 | 0.0 | 15.2 | Checkpoint |
| BEVFusion-L | L | 87.7 | 70.6 | 73.5 | 81.5 | 32.5 | 3.6 | 18.4 | 18.9 | 48.3 | Checkpoint |
| BEVFusion | C+L | 88.4 | 72.7 | 74.5 | 80.0 | 36.3 | 3.6 | 23.3 | 20.0 | 50.0 | Checkpoint |
| UniTR | C+L | 92.8 | 73.8 | 67.7 | 51.7 | 36.5 | 11.4 | 36.2 | 27.5 | 49.7 | Checkpoint |
| UniTR+LSS | C+L | 93.3 | 72.8 | 69.4 | 58.5 | 35.9 | 6.3 | 31.6 | 12.9 | 47.6 | Checkpoint |
SimBEV is based on CARLA and we are grateful to the team that maintains it. SimBEV has also taken inspiration from the nuScenes, SHIFT, OPV2V, and V2X-Sim datasets, as well as Co3SOP.
The sixth generation Ford Mustang model is based on this BlenderKit model by Kentik Khudosovtsev.
Hazard area static props are based on this Roadside Construction asset by Quixel Megascans.
If SimBEV is useful or relevant to your research, please kindly recognize our contributions by citing our paper:
@article{mehr2025simbev,
title={SimBEV: A Synthetic Multi-Task Multi-Sensor Driving Data Generation Tool and Dataset},
author={Mehr, Goodarz and Eskandarian, Azim},
journal={arXiv preprint arXiv:2502.01894},
year={2025}
}