ROMS-Tools: Reproducible Preprocessing and Analysis for Regional Ocean Modeling with ROMS#

ROMS-Tools is a Python package for preparing and analyzing ROMS simulations, with optional MARBL biogeochemistry (BGC) support.

ROMS-Tools lets users generate regional grids, prepare model inputs, and analyze outputs through a user-friendly interface that standardizes common workflows and reduces data-preparation overhead. The package is designed for reproducible research: it offers YAML-based configuration, optional dask parallelization, and interactive Jupyter support, and it is backed by continuous-integration testing and comprehensive documentation.

Current capabilities are fully compatible with UCLA-ROMS [Molemaker and contributors, 2025, Molemaker and contributors, 2025]. Support for other ROMS implementations, such as Rutgers ROMS [Arango and contributors, 2024], may be added in the future.

Overview of ROMS-Tools Functionality#

ROMS-Tools provides a comprehensive workflow for generating, processing, and analyzing ROMS-MARBL model inputs and outputs, as detailed below.

Input Data and Preprocessing#

Built on xarray and optionally powered by dask, ROMS-Tools automates the generation of all major ROMS-MARBL inputs, including:

  1. Model Grid: A customizable, curvilinear, orthogonal grid designed to maintain nearly uniform horizontal resolution across the domain. The grid can be rotated to align with coastlines and uses a terrain-following vertical coordinate.

  2. Bathymetry: Derived from SRTM15 [Tozer et al., 2019].

  3. Land Mask: Inferred from coastlines provided by Natural Earth or the Global Self-consistent, Hierarchical, High-resolution Geography (GSHHG) Database [Wessel and Smith, 1996].

  4. Physical Ocean Conditions: Initial and open boundary conditions for sea surface height, temperature, salinity, and velocities derived from the 1/12° Global Ocean Physics Reanalysis (GLORYS) [Lellouche et al., 2021].

  5. BGC Ocean Conditions: Initial and open boundary conditions for dissolved inorganic carbon, alkalinity, and other biogeochemical tracers from Community Earth System Model (CESM) output [Yeager et al., 2022] or hybrid observational-model sources [Garcia et al., 2019, Huang et al., 2022, Lauvset et al., 2016, Yang et al., 2020, Yeager et al., 2022].

  6. Meteorological Forcing: Wind, radiation, precipitation, and air temperature/humidity processed from the global 1/4° ECMWF Reanalysis v5 (ERA5) [Hersbach et al., 2020], with optional corrections for radiation bias and coastal wind.

  7. BGC Surface Forcing: Partial pressure of carbon dioxide, as well as iron, dust, and nitrogen deposition from CESM output [Yeager et al., 2022] or hybrid observational-model sources [Hamilton et al., 2022, Kok et al., 2021, Landschützer et al., 2016, Yeager et al., 2022].

  8. Tidal Forcing: Tidal potential, elevation, and velocities derived from TPXO [Egbert and Erofeeva, 2002] including self-attraction and loading (SAL) corrections.

  9. River Forcing: Freshwater runoff derived from Dai & Trenberth [Dai and Trenberth, 2002] or user-provided custom files.

  10. CDR Forcing: User-defined carbon dioxide removal (CDR) interventions that inject BGC tracers at point sources or as larger-scale Gaussian perturbations. The forcing is prescribed as volume and tracer fluxes (e.g., alkalinity for ocean alkalinity enhancement, iron for iron fertilization, or other BGC constituents). Users control the magnitude, spatial footprint, and temporal evolution of each intervention.

  11. Nesting: Support for creating nested grids and parent-child configurations.
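The Gaussian perturbation mentioned under CDR Forcing can be illustrated with a minimal, standard-library sketch. It is not the ROMS-Tools API: the function names, the equirectangular distance approximation, and the sample coordinates are all assumptions made for illustration. The idea shown is that a prescribed total tracer flux is spread over grid cells with Gaussian weights that are normalized so the total is conserved.

```python
import math

EARTH_RADIUS_KM = 6371.0


def gaussian_weights(lons, lats, center_lon, center_lat, sigma_km):
    """Normalized Gaussian weights for cell centers (hypothetical helper).

    Distances use an equirectangular approximation, which is only
    reasonable for footprints that are small compared to the Earth.
    """
    weights = []
    for lon, lat in zip(lons, lats):
        dx = (math.radians(lon - center_lon)
              * math.cos(math.radians(center_lat)) * EARTH_RADIUS_KM)
        dy = math.radians(lat - center_lat) * EARTH_RADIUS_KM
        weights.append(math.exp(-(dx**2 + dy**2) / (2.0 * sigma_km**2)))
    total = sum(weights)
    # Normalize so that the weights sum to one
    return [w / total for w in weights]


def distribute_flux(total_flux, weights):
    """Split a total tracer flux across cells according to the weights."""
    return [total_flux * w for w in weights]


# Illustrative values only: five cells along a line of constant latitude
lons = [-120.2, -120.1, -120.0, -119.9, -119.8]
lats = [35.0] * 5
weights = gaussian_weights(lons, lats, center_lon=-120.0, center_lat=35.0,
                           sigma_km=10.0)
fluxes = distribute_flux(100.0, weights)
```

Normalizing the weights means the magnitude of the release and its spatial footprint can be chosen independently: changing `sigma_km` reshapes the footprint without changing the total amount injected.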

Some source datasets are accessed automatically by ROMS-Tools, including Natural Earth, Dai & Trenberth runoff, and ERA5 meteorology, while users must manually download SRTM15, GSHHG, GLORYS, the BGC datasets, and TPXO tidal files. Although these are the datasets currently supported, the modular design of ROMS-Tools makes it straightforward to add new source datasets in the future.

To generate the model inputs, ROMS-Tools automates several intermediate processing steps, including:

  • Bathymetry processing: The bathymetry is smoothed in two stages, first across the entire model domain and then locally in areas with steep slopes. The local stage ensures that steepness ratios do not exceed a prescribed threshold, which reduces pressure-gradient errors. A minimum depth is enforced to prevent water levels from becoming negative during large tidal excursions.

  • Mask definition: The land-sea mask is generated by comparing the ROMS grid’s horizontal coordinates with a coastline dataset using the regionmask package [Hauser et al., 2024]. Enclosed basins are subsequently filled with land.

  • Land value handling: Land values are filled via an algebraic multigrid method using pyamg [Bell et al., 2023] prior to horizontal regridding. This extends ocean values into land areas to reconcile discrepancies between source data and ROMS land masks, ensuring that no NaNs or land-originating values contaminate ocean grid cells.

  • Regridding: Ocean and atmospheric fields are horizontally and vertically regridded from standard latitude-longitude-depth grids to the model’s curvilinear grid with a terrain-following vertical coordinate using xarray [Hoyer and Hamman, 2017] and xgcm [Busecke and contributors, 2025]. Velocities are rotated to align with the curvilinear ROMS grid.

  • Longitude conventions: ROMS-Tools handles differences in longitude conventions, converting between [-180°, 180°] and [0°, 360°] as needed.

  • River locations: Rivers that fall within the model domain are automatically identified and relocated to the nearest coastal grid cell. Rivers that need to be shifted manually or span multiple cells can be configured by the user.

  • Data streaming: ERA5 atmospheric data can be accessed directly from the cloud, removing the need for users to pre-download large datasets locally. Similar streaming capabilities may be implemented for other datasets in the future.
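Three of the steps above are simple enough to sketch in plain Python. The block below is an illustration under stated assumptions, not the ROMS-Tools implementation: all function names are hypothetical, the steepness ratio is assumed to follow the commonly used definition |h1 - h2| / (h1 + h2) between neighboring cells, and the velocity rotation assumes the grid angle is measured counterclockwise from east to the grid's first horizontal axis.

```python
import math


def to_0_360(lon):
    """Convert a longitude in degrees to the [0, 360) convention."""
    return lon % 360.0


def to_pm180(lon):
    """Convert a longitude in degrees to the [-180, 180) convention."""
    return (lon + 180.0) % 360.0 - 180.0


def steepness_ratios(depths):
    """Steepness ratio between each pair of neighboring cells (1-D).

    Assumed definition: |h1 - h2| / (h1 + h2) for positive depths.
    """
    return [abs(a - b) / (a + b) for a, b in zip(depths, depths[1:])]


def enforce_min_depth(depths, hmin):
    """Clip depths so that no cell is shallower than hmin."""
    return [max(h, hmin) for h in depths]


def rotate_to_grid(u_east, v_north, angle):
    """Rotate east/north velocity components onto grid-aligned axes.

    `angle` is in radians (assumed counterclockwise from east).
    """
    u_grid = u_east * math.cos(angle) + v_north * math.sin(angle)
    v_grid = -u_east * math.sin(angle) + v_north * math.cos(angle)
    return u_grid, v_grid


# Illustrative usage with made-up numbers
depths = enforce_min_depth([2.0, 100.0, 150.0, 400.0, 420.0], hmin=5.0)
ratios = steepness_ratios(depths)
needs_local_smoothing = max(ratios) > 0.2  # 0.2 is an arbitrary example threshold
```

A check like `needs_local_smoothing` is what motivates the second, local smoothing stage: only the neighborhoods whose ratio exceeds the threshold need further treatment.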

Users can quickly design and visualize regional grids and inspect all input fields with built-in plotting utilities.

Postprocessing and Analysis#

ROMS-Tools supports postprocessing and analysis of ROMS-MARBL output, including regridding from the native curvilinear, terrain-following grid to a standard latitude-longitude-depth grid using xesmf [Zhuang et al., 2023], with built-in plotting for both grid types. The analysis layer also includes specialized utilities for evaluating carbon dioxide removal (CDR) interventions, such as generating carbon uptake and efficiency curves.
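As a rough illustration of what an efficiency curve can look like, the sketch below assumes one common definition for alkalinity-based interventions: cumulative additional carbon uptake divided by cumulative alkalinity added, evaluated at each output time. This definition, the function name, and the sample numbers are assumptions for illustration and may differ from what ROMS-Tools computes.

```python
from itertools import accumulate


def efficiency_curve(uptake_anomaly, alkalinity_added):
    """Cumulative uptake divided by cumulative addition at each time step.

    Both inputs are per-step amounts in consistent units (e.g. mol).
    Returns NaN for steps before any alkalinity has been added.
    """
    cum_uptake = list(accumulate(uptake_anomaly))
    cum_added = list(accumulate(alkalinity_added))
    return [u / a if a > 0 else float("nan")
            for u, a in zip(cum_uptake, cum_added)]


# Made-up example: alkalinity is released during the first two steps,
# while the additional carbon uptake continues afterwards.
uptake = [0.0, 0.2, 0.3, 0.3]
added = [1.0, 1.0, 0.0, 0.0]
curve = efficiency_curve(uptake, added)
```

In practice the uptake anomaly would come from the difference between a simulation with the intervention and a baseline simulation without it.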

Getting Started#

Preparing a ROMS Simulation#

The following examples cover task-specific workflows for the Perlmutter and Anvil supercomputers. Pre-download data and adjust paths as needed. For a full end-to-end workflow designed to run on a laptop, refer to the “End-to-end workflow (laptop)” section above.

Analyzing a ROMS Simulation#

The following examples cover task-specific workflows for the Perlmutter and Anvil supercomputers. Pre-download data and adjust paths as needed. For a full end-to-end workflow designed to run on a laptop, refer to the “End-to-end workflow (laptop)” section above.

Advanced Topics#

For Developers#

References#