Releases: NVIDIA/physicsnemo
v2.1.1
v2.1.0
PhysicsNeMo General Release v2.1.0
Added
- Adds GLOBE model (physicsnemo.experimental.models.globe.model.GLOBE),
including a new variant that uses a dual tree traversal algorithm to reduce the
complexity of the kernel evaluations from O(N^2) to O(N).
- Adds GLOBE AirFRANS example case (examples/cfd/external_aerodynamics/globe/airfrans).
- Adds GLOBE DrivAerML example case (examples/cfd/external_aerodynamics/globe/drivaer).
- Adds drop-test dynamics recipe.
- Adds concrete dropout uncertainty quantification for GeoTransolver. Learnable
per-layer dropout rates enable MC-Dropout inference for uncertainty
estimates. Disabled by default (concrete_dropout: false).
- Adds automatic support for FSDP and/or ShardTensor models in checkpoint save/load
functionality.
- PhysicsNeMo-Mesh now supports conversion from PyVista/VTK/VTU meshes that may
contain polyhedral cells.
- In PhysicsNeMo-Mesh, adds Mesh.to_point_cloud(), .to_edge_graph(), and
.to_dual_graph() methods. These allow Mesh conversion to 0D point clouds, 1D
edge graphs, and 1D dual graphs, respectively, when connectivity information
is not needed.
- Adds physicsnemo.mesh.generate subpackage with marching_cubes for
isosurface extraction from 3D scalar fields, returning a Mesh object.
Supports the NVIDIA Warp backend.
- Adds a type system to PhysicsNeMo-Mesh, allowing annotation of Mesh dimensions
using notation like Mesh[2, 3] for a 2D manifold in 3D space.
- Adds adjacency caching to PhysicsNeMo-Mesh Mesh objects, allowing efficient
reuse of neighbor information.
- Adds DomainMesh class for grouping an interior mesh with named boundary
meshes and domain-level metadata, with passthrough geometric transforms
(translate, rotate, scale, transform) and data operations.
- Allows selective per-field transformation of Mesh objects: transform_point_data,
transform_cell_data, and transform_global_data now accept bool | TensorDict
(or plain dict for convenience).
- Adds physicsnemo.mesh.remeshing subpackage with partition_cells() for
creating Voronoi regions around seed points. BVH-accelerated.
- Added support for 1D, 2D, and 3D neighborhood attention (natten) via
physicsnemo.nn.functional interface, with full ShardTensor support.
- Added derivative functionals in physicsnemo.nn.functional for
uniform_grid_gradient, rectilinear_grid_gradient,
spectral_grid_gradient, meshless_fd_derivatives, mesh_lsq_gradient,
and mesh_green_gauss_gradient.
- Adds physicsnemo.sym module for symbolic PDE residual computation
(PhysicsInformer). Users define PDEs via SymPy and select a gradient method
(autodiff, finite_difference, spectral, meshless_finite_difference,
least_squares); spatial derivatives are computed automatically using the
nn.functional.derivatives functionals.
- Ports all physics-informed examples (LDC PINNs, Darcy, Stokes MGN, DoMINO,
datacenter, xaeronet, MHD/SWE PINO) to the new physicsnemo.sym interface,
replacing the separate physicsnemo-sym package dependency. Geometry is now
handled via physicsnemo.mesh and PyVista.
- Added geometry functionals in
physicsnemo.nn.functional for
mesh_poisson_disk_sample, mesh_to_voxel_fraction, and
signed_distance_field.
- Adds embedded OOD guardrail OODGuard at
physicsnemo.experimental.guardrails.embedded, optionally
wired into GeoTransolver via a new guard_config constructor argument.
The guard calibrates per-channel global bounds and a geometry-latent
kNN threshold during training, and emits warnings on out-of-distribution
inputs at inference.
- In PhysicsNeMo-Mesh, physicsnemo.mesh.geometry now publicly exposes
stable_angle_between_vectors and compute_triangle_angles (previously
only available via the private physicsnemo.mesh.curvature._utils).
- PhysicsNeMo Datapipes enables reproducibility through
torch.generator utilities.
- PhysicsNeMo Datapipes now supports physicsnemo.mesh.Mesh and
physicsnemo.mesh.DomainMesh objects for deserialization, with
transformations and utilities for mesh-based datasets.
- PhysicsNeMo Datapipes now supports MultiDataset construction,
allowing on-the-fly construction of multi-source composite datasets
that can be sampled and processed efficiently and coherently
as one dataset.
- PhysicsNeMo Datapipes also supports random augmentations for
mesh-based datapipes, leveraging torch.distributions for
broad random distribution support. Mesh and DomainMesh
datasets allow random translation, scaling, and rotation
of mesh data in coherent ways, compatible with the reproducibility
features of PhysicsNeMo datapipes.
- Adds a new unified training recipe for external aerodynamics
that supports training on multiple datasets (DrivaerML, ShiftSUV,
HighLiftAeroML, or more, bring your own, mix and match), supports
training several different models (Domino, Transolver, GeoTransolver,
Flare, GeoTransolver with Flare-attention, bring your own!). Leverages
mesh datasets and non-dimensionalization to enable dataset mixing and
matching at runtime. Train with surface or volume data.
- Adds a new physicsnemo.diffusion.multi_diffusion subpackage that
scales 2D diffusion models to large domains via patch-based training
and inference. Provides MultiDiffusionModel2D (wraps a base model and
handles state patching, conditioning preprocessing, positional-embedding
injection, and per-patch output fusion), the
MultiDiffusionMSEDSMLoss / MultiDiffusionWeightedMSEDSMLoss losses
for patch-based DSM training, and MultiDiffusionPredictor for
sampling (plugs straight into sample() / get_denoiser() and the
standard solvers). Patching primitives (BasePatching2D,
GridPatching2D, RandomPatching2D) are exposed under the same
subpackage and are torch.compile-friendly with fullgraph=True.
MultiDiffusionPredictor supports memory-efficient inference on
large domains via chunk_size and use_checkpointing. The
subpackage also ships patch-local DPS guidance:
MultiDiffusionDPSScorePredictor (drop-in score predictor that plugs
into the standard sampling stack),
MultiDiffusionDataConsistencyDPSGuidance for inpainting and sparse
data assimilation, and MultiDiffusionModelConsistencyDPSGuidance for
generic patch-local observation operators. Use these instead of the
global DPSScorePredictor to run guided sampling on domains that
would otherwise OOM.
- Adds "epsilon" as a supported prediction type throughout the diffusion
framework, alongside the existing "x0" and "score" modes. A new
PredictorType = Literal["x0", "score", "epsilon"] alias in
physicsnemo.diffusion.base is wired through losses (MSEDSMLoss,
WeightedMSEDSMLoss, and the multi-diffusion losses), preconditioners,
samplers / solvers, DPS guidance, and noise schedulers, enabling
end-to-end training and sampling of epsilon-parameterized models.
Losses gain an epsilon_to_x0_fn kwarg used for the epsilon-to-x0
conversion required during DSM training.
- Adds DiffusionUNet3D, a 3D U-Net diffusion backbone for volumetric data, at
physicsnemo.experimental.models.diffusion_unets. Implements the
DiffusionModel protocol. Exposes reusable 3D building blocks
(Conv3D, GroupNorm3D, UNetAttention3D, UNetBlock3D) at
physicsnemo.experimental.nn.
- Added support for batched radius search, which enables Domino
and GeoTransolver with local features and batch size > 1.
- Added the underfill recipe.
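To illustrate the epsilon-to-x0 conversion mentioned in the "epsilon" prediction-type entry, here is a minimal sketch in plain Python. It assumes a forward noising process of the form x_t = alpha_t * x0 + sigma_t * eps; the function name epsilon_to_x0 and its arguments are hypothetical, and the actual signature expected by the epsilon_to_x0_fn kwarg is not given in these notes.

```python
from typing import List, Sequence


def epsilon_to_x0(
    x_t: Sequence[float],
    eps: Sequence[float],
    alpha_t: float,
    sigma_t: float,
) -> List[float]:
    """Invert x_t = alpha_t * x0 + sigma_t * eps for x0 (assumed noising form)."""
    if alpha_t == 0.0:
        raise ValueError("alpha_t must be non-zero to recover x0")
    return [(x - sigma_t * e) / alpha_t for x, e in zip(x_t, eps)]


# Round trip: noise a clean sample, then recover it from the known noise.
x0 = [1.0, -2.0, 0.5]
eps = [0.3, -0.1, 0.8]
alpha_t, sigma_t = 0.8, 0.6
x_t = [alpha_t * a + sigma_t * e for a, e in zip(x0, eps)]
recovered = epsilon_to_x0(x_t, eps, alpha_t, sigma_t)
```

In training, eps would be the model's prediction rather than the true noise, so the recovered x0 is only an estimate.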
Changed
- Improved crash recipe with configurable stats directory.
- physicsnemo.mesh.sampling.find_nearest_cells uses a KNN-backed
implementation, and no longer accepts the bvh=, chunk_size=,
max_rounds=, or max_candidates_per_point= parameters.
- ⚠️ BC-impact (deep imports): internal physicsnemo.nn.functional
modules were reorganized by category. Public top-level functional imports are
unchanged, but code importing internal module paths directly (for example
physicsnemo.nn.functional.knn or
physicsnemo.nn.functional.radius_search) should migrate to
physicsnemo.nn.functional.neighbors.*.
- Consolidated Warp interpolation kernels for grid-to-point and point-to-grid
backends, and added missing kernel/helper docstrings.
- In PhysicsNeMo-Mesh, dual-mesh primitives gained closed-form fast paths
for triangle meshes embedded in 3D. compute_circumcenters is up to
~10000x faster (e.g. 11 s -> ~1 ms on a 360 K-triangle AirFRANS mesh,
RTX 4090) by replacing batched torch.linalg.lstsq over (2, 3) systems
with a closed-form cross product, and compute_vertex_angles is up to
~15x faster on the same meshes by replacing the dimension-agnostic
Gram-determinant formula with an atan2(||cross||, dot) formulation.
Anything that depends on these (Gaussian curvature, FEM Laplacian,
cotangent weights, Voronoi areas, smoothing) inherits the speedup. See
perf.md for the full audit.
- In PhysicsNeMo-Mesh, BVH construction is faster on GPU.
_compute_morton_codes has a CUDA-specific fused-bits path that
eliminates the n_bits sequential kernel launches of the previous
bit-loop (5-8x speedup on small / medium meshes), and BVH.from_mesh
reuses the cached Mesh.cell_centroids instead of recomputing.
End-to-end BVH.from_mesh is ~2x faster on a 162 K-tet cube_volume
mesh.
- In PhysicsNeMo-Mesh, the topology-dedup APIs
(categorize_facets_by_count, find_edges_in_reference,
remove_duplicate_cells, build_adjacency_from_pairs) gained optional
index_bound / n_targets parameters. When the caller passes a strict
upper bound (typically mesh.n_points or mesh.n_cells), the implicit
tensor.max().item() GPU sync is avoided and the...
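The closed-form fast paths described in the circumcenter and vertex-angle entry above can be sketched for a single triangle in plain Python. This is an illustrative sketch using the textbook cross-product circumcenter formula and the atan2(||cross||, dot) angle formula; it is not the PhysicsNeMo-Mesh implementation, and the helper names below are hypothetical.

```python
import math

Vec3 = tuple  # (x, y, z) as plain floats


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def circumcenter(p0, p1, p2):
    """Circumcenter of a 3D triangle via cross products (no linear solve)."""
    a, b = sub(p1, p0), sub(p2, p0)
    n = cross(a, b)  # triangle normal, length = 2 * area
    n2 = dot(n, n)
    if n2 == 0.0:
        raise ValueError("degenerate triangle")
    bn, na = cross(b, n), cross(n, a)
    aa, bb = dot(a, a), dot(b, b)
    return tuple(p0[i] + (aa * bn[i] + bb * na[i]) / (2.0 * n2) for i in range(3))


def vertex_angle(p0, p1, p2):
    """Interior angle at p0, computed as atan2(||u x v||, u . v)."""
    u, v = sub(p1, p0), sub(p2, p0)
    c = cross(u, v)
    return math.atan2(math.sqrt(dot(c, c)), dot(u, v))


# Right triangle in the z = 0 plane: the circumcenter is the hypotenuse midpoint.
tri = ((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0))
center = circumcenter(*tri)
angles = [vertex_angle(tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]) for i in range(3)]
```

The atan2 form stays well conditioned for very small and near-180-degree angles, where an acos of a normalized dot product loses precision.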
v2.0.0
PhysicsNeMo General Release v2.0.0
📝 NVIDIA PhysicsNeMo v2.0 contains significant reorganization of all the features, with easier installation and integration to external packages. See the migration guide for more details!
Added
- Refactored diffusion preconditioners in
physicsnemo.diffusion.preconditioners relying on a new abstract base class
BaseAffinePreconditioner for preconditioning schemes using affine
transformations. Existing preconditioners (VPPrecond, VEPrecond,
iDDPMPrecond, EDMPrecond) reimplemented based on this new interface.
- New physicsnemo.experimental.nn.symmetry module that implements building
blocks that preserve 2D and 3D rotational equivariance using a
grid-based layout for efficient GPU parallelization, and an emphasis on
compact einsum operations.
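As a rough illustration of what "preconditioning schemes using affine transformations" means, the sketch below applies the EDM coefficients from Karras et al. (2022) to a scalar input. The functions edm_coefficients and precondition are hypothetical and do not reproduce the BaseAffinePreconditioner interface; sigma_data = 0.5 is the EDM paper's default and is only an assumption here.

```python
import math
from typing import Callable, Tuple


def edm_coefficients(sigma: float, sigma_data: float = 0.5) -> Tuple[float, float, float, float]:
    """EDM scalings (c_skip, c_out, c_in, c_noise) for a noise level sigma > 0."""
    denom = sigma**2 + sigma_data**2
    c_skip = sigma_data**2 / denom
    c_out = sigma * sigma_data / math.sqrt(denom)
    c_in = 1.0 / math.sqrt(denom)
    c_noise = math.log(sigma) / 4.0
    return c_skip, c_out, c_in, c_noise


def precondition(
    raw_model: Callable[[float, float], float],
    x: float,
    sigma: float,
    sigma_data: float = 0.5,
) -> float:
    """Affine wrapper: D(x; sigma) = c_skip * x + c_out * F(c_in * x, c_noise)."""
    c_skip, c_out, c_in, c_noise = edm_coefficients(sigma, sigma_data)
    return c_skip * x + c_out * raw_model(c_in * x, c_noise)


# With a raw network that outputs zero, the denoiser reduces to the skip path.
coeffs = edm_coefficients(sigma=0.5)
denoised = precondition(lambda x_in, noise_label: 0.0, x=2.0, sigma=0.5)
```

Other schemes (VP, VE, iDDPM) fit the same affine form with different choices for the four coefficients.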
Changed
- PhysicsNeMo v2.0 contains significant reorganization of tools. Please see
the v2.0-MIGRATION-GUIDE.md to understand what has changed and why.
- DiT (Diffusion Transformer) has been moved from
physicsnemo.experimental.models.dit
to physicsnemo.models.dit.
Fixed
- Shape mismatch bug in the Lennard Jones example
Dependencies
- CUDA backend is now selected via orthogonal cu12/cu13 extras rather
than being hardcoded to CUDA 13. Feature extras (nn-extras, utils-extras,
etc.) are now CUDA-agnostic and can be combined with either backend, e.g.
pip install "nvidia-physicsnemo[cu13,nn-extras]". When neither cu12 nor
cu13 is specified, PyTorch is installed from PyPI using its default build
(currently CUDA 12.8 on Linux). For development with uv, use
uv sync --extra cu13 (or --extra cu12) to select the backend.
Contributors
We’re grateful to everyone who contributed issues, feature ideas, fixes, and documentation updates — your input is what helps us continuously improve PhysicsNeMo for the whole community!
A special shout-out to the authors of the pull requests listed above, in no particular order:
@jleinonen @dran-dev @aayushg55 @saikrishnanc-nv @jeis4wpi @albertocarpentieri @paveltomin @weilr @giprayogo @tonishi-nv @younes-abid @dakhare-creator @Alexey-Kamenev
Thank you ❤️ — we truly appreciate your contributions and hope to see more from you in the future!
v1.3.0
PhysicsNeMo General Release v1.3.0
Added
- Added mixture_of_experts for weather example in physicsnemo.examples.weather.
⚠️ Warning: It uses the experimental DiT model, which is subject to future API changes.
Added some modifications to DiT architecture in physicsnemo.experimental.models.dit.
Added learnable option to PositionalEmbedding in physicsnemo.models.diffusion.layers.
- Added lead-time aware training support to the StormCast example.
- Add a device aware kNN method to physicsnemo.utils.neighbors. Works with CPU or GPU
by dispatching to the proper optimized library, and is torch.compile compatible.
- Added additional testing of the DoMINO datapipe.
- Examples: added a new example for full-waveform inversion using diffusion
models. Accessible in examples/geophysics/diffusion_fwi.
- Domain Parallelism: Domain Parallelism is now available for kNN, radius_search,
and torch.nn.functional.pad.
- Unified recipe for crash modeling, supporting Transolver and MeshGraphNet,
and three transient schemes.
- Added a check to stochastic_sampler that helps handle the EDMPrecond model,
which has a specific .forward() signature.
- Added abstract interfaces for constructing active learning workflows, contained
under the physicsnemo.active_learning namespace. A preliminary example of how
to compose and define an active learning workflow is provided in examples/active_learning.
The moons example provides a minimal (pedagogical) composition that is meant to
illustrate how to define the necessary parts of the workflow.
Changed
- Migrated Stokes MGN example to PyTorch Geometric.
- Migrated Lennard Jones example to PyTorch Geometric.
- Migrated physicsnemo.utils.sdf.signed_distance_field to a static return,
torch-only interface. It also now works on distributed meshes and input fields.
- Refactored DiTBlock to be more modular
- Added NATTEN 2D neighborhood attention backend for DiTBlock
- Migrated blood flow example to PyTorch Geometric.
- Refactored DoMINO model code and examples for performance optimizations and improved readability.
- Migrated HydroGraphNet example to PyTorch Geometric.
- Support for saving and loading nested physicsnemo.Modules. It is now
possible to create nested modules with m = Module(submodule, ...), and save
and load them with Module.save and Module.from_checkpoint.
⚠️ Warning: The modules have to be physicsnemo.Modules, and not
torch.nn.Modules.
- Support passing custom tokenizer, detokenizer, and attention Modules in
experimental DiT architecture
- Improved Transolver training recipe's configuration for checkpointing and normalization.
Fixed
- Set skip_scale to Python float in U-Net to ensure compilation works.
- Ensure stream dependencies are handled correctly in physicsnemo.utils.neighbors
- Fixed the issue with incorrect handling of files with consecutive runs of
combine_stl_solids.py in the X-MGN recipe.
- Fixed the RuntimeError: Worker data receiving interrupted error in the datacenter example.
Contributors
We’re grateful to everyone who contributed issues, feature ideas, fixes, and documentation updates — your input is what helps us continuously improve PhysicsNeMo for the whole community!
A special shout-out to the authors of the pull requests listed above, in no particular order:
@jleinonen, @Dibyajyoti-Chakraborty, @jialusui1102, @abokov-nv, @melo-gonzo, @dran-dev, @RishikeshRanade, @swbg
Thank you ❤️ — we truly appreciate your contributions and hope to see more from you in the future!
v1.2.0
PhysicsNeMo General Release v1.2.0
Added
- Diffusion Transformer (DiT) model. The DiT model can be accessed in
physicsnemo.experimental.models.dit.DiT. ⚠️ Warning: Experimental feature
subject to future API changes.
- Improved documentation for diffusion models and diffusion utils.
- Safe API to override __init__'s arguments saved in checkpoint file with
Module.from_checkpoint("chkpt.mdlus", override_args=set(...)).
- PyTorch Geometric MeshGraphNet backend.
- Functionality in DoMINO to take arbitrary number of scalar or vector
global parameters and encode them using class ParameterModel
- TopoDiff model and example.
- Added ability for DoMINO model to return volume neighbors.
- Added functionality in DoMINO recipe to introduce physics residual losses.
- Diffusion models, metrics, and utils: implementation of Student-t
distribution for EDM-based diffusion models (t-EDM). This feature is adapted
from the paper Heavy-Tailed Diffusion Models, Pandey et al.
This includes a new EDM preconditioner (tEDMPrecondSuperRes), a loss
function (tEDMResidualLoss), and a new option in corrdiff diffusion_step.
⚠️ This is an experimental feature that can be accessed through the
physicsnemo.experimental module; it might also be subject to API changes
without notice.
- Bumped Ruff version from 0.0.290 to 0.12.5. Replaced Black with
ruff-format.
- Domino improvements with Unet attention module and user configs
- Hybrid MeshGraphNet for modeling structural deformation
- Enabled TransformerEngine backend in the transolver model.
- Inference code for x-meshgraphnet example for external aerodynamics.
- Added a new example for external_aerodynamics: training transolver on
irregular mesh data for DrivaerML surface data.
- Added a new example for external aerodynamics for finetuning pretrained models.
Changed
- Diffusion utils: physicsnemo.utils.generative renamed to physicsnemo.utils.diffusion
- Diffusion models: in CorrDiff model wrappers (EDMPrecondSuperResolution and
UNet), the arguments profile_mode and amp_mode cannot be overridden by
from_checkpoint. They are now properties that can be dynamically changed
after the model instantiation with, for example, model.amp_mode = True
and model.profile_mode = False.
- Updated healpix data module to use correct DistributedSampler target for
test data loader
- Existing DGL-based vortex shedding example has been renamed to vortex_shedding_mgn_dgl.
Added new vortex_shedding_mgn example that uses PyTorch Geometric instead.
- HEALPixLayer can now use earth2grid HEALPix padding ops, if desired
- Migrated Vortex Shedding Reduced Mesh example to PyTorch Geometric.
- CorrDiff example: fixed bugs when training regression UNet.
- Diffusion models: fixed bugs related to gradient checkpointing on non-square
images.
- Diffusion models: created a separate class Attention for clarity and
modularity. Updated UNetBlock accordingly to use the Attention class
instead of custom attention logic. This will update the model architecture
for SongUNet-based diffusion models. Changes are not BC-breaking and are
transparent to the user.
- ⚠️ BC-breaking: refactored the automatic mixed precision
(AMP) API in layers and models defined in physicsnemo/models/diffusion/ for
improved usability. Note: it is now not only possible, but required, to
explicitly set model.amp_mode = True in order to use the model in a
torch.autocast clause. This applies to all SongUNet-based models.
- Diffusion models: fixed and improved API to enable fp16 forward pass in
UNet and EDMPrecondSuperResolution model wrappers; fp16 forward pass can
now be toggled/untoggled by setting model.use_fp16 = True.
- Diffusion models: improved API for Apex group norm. SongUNet-based models
will automatically perform conversion of the input tensors to
torch.channels_last memory format when model.use_apex_gn is True. New
warnings are raised when attempting to use Apex group norm on CPU.
- Diffusion utils: systematic compilation of patching operations in
stochastic_sampler for improved performance.
- CorrDiff example: added option for Student-t EDM (t-EDM) in train.py and
generate.py. When training a CorrDiff diffusion model, this feature can be
enabled with the hydra overrides ++training.hp.distribution=student_t and
++training.hp.nu_student_t=<nu_value>. For generation, this feature can be
enabled with similar overrides: ++generation.distribution=student_t and
++generation.nu_student_t=<nu_value>.
- CorrDiff example: the parameters P_mean and P_std (used to compute the
noise level sigma) are now configurable. They can be set with the hydra
overrides ++training.hp.P_mean=<P_mean_value> and
++training.hp.P_std=<P_std_value> for training (and similar ones with
training.hp replaced by generation for generation).
- Diffusion utils: patch-based inference and lead time support with
deterministic sampler.
- Existing DGL-based XAeroNet example has been renamed to xaeronet_dgl.
Added new xaeronet example that uses PyTorch Geometric instead.
- Updated the deforming plate example to use the Hybrid MeshGraphNet model.
- ⚠️ BC-breaking: Refactored the transolver model to improve
readability and performance, and extend to more use cases.
- Diffusion models: improved lead time support for SongUNetPosLtEmbd and
EDMLoss. Lead-time embeddings can now be used with/without positional
embeddings.
- Diffusion models: consolidate ApexGroupNorm and GroupNorm in
models/diffusion/layers.py with a factory get_group_norm that can
be used to instantiate either one of them. get_group_norm is now the
recommended way to instantiate a GroupNorm layer in SongUNet-based and
other diffusion models.
- Physicsnemo models: improved checkpoint loading API in
Module.from_checkpoint that now exposes a strict parameter to raise error
on missing/unexpected keys, similar to that used in
torch.nn.Module.load_state_dict.
- Migrated Hybrid MGN and deforming plate example to PyTorch Geometric.
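For context on the P_mean and P_std entry above: in the standard EDM formulation (Karras et al., 2022), the training noise level is drawn from a log-normal distribution, ln(sigma) ~ N(P_mean, P_std^2). The sketch below assumes CorrDiff follows that convention, which these notes do not confirm; sample_sigma is a hypothetical helper, and the defaults of -1.2 and 1.2 are the EDM paper's values, not necessarily the CorrDiff defaults.

```python
import math
import random
import statistics


def sample_sigma(rng: random.Random, p_mean: float = -1.2, p_std: float = 1.2) -> float:
    """Draw one noise level with ln(sigma) ~ Normal(p_mean, p_std**2)."""
    return math.exp(p_mean + p_std * rng.gauss(0.0, 1.0))


# The median of a log-normal is exp(p_mean), so P_mean shifts the typical noise
# level and P_std controls how widely the sampled levels spread around it.
rng = random.Random(0)
sigmas = [sample_sigma(rng) for _ in range(20000)]
median_sigma = statistics.median(sigmas)
```

Raising P_mean concentrates training on noisier inputs, while a larger P_std covers a wider range of noise levels per batch.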
Fixed
- Bug fixes in DoMINO model in sphere sampling and tensor reshaping
- Bug fixes in DoMINO utils random sampling and test.py
- Optimized DoMINO config params based on DrivAer ML
v1.1.1
v1.1.0
PhysicsNeMo (Core) General Release v1.1.0
Added
- Added ReGen score-based data assimilation example
- General purpose patching API for patch-based diffusion
- New positional embedding selection strategy for CorrDiff SongUNet models
- Added Multi-Storage Client to allow checkpointing to/from Object Storage
Changed
- Simplified CorrDiff config files, updated default values
- Refactored CorrDiff losses and samplers to use the patching API
- Support for non-square images and patches in patch-based diffusion
- ERA5 download example updated to use current file format convention and
restricts global statistics computation to the training set
- Support for training custom StormCast models and various other improvements for StormCast
- Updated CorrDiff training code to support multiple patch iterations to amortize
regression cost and usage of torch.compile
- Refactored physicsnemo/models/diffusion/layers.py to optimize data type
casting workflow, avoiding unnecessary casting under autocast mode
- Refactored Conv2d to enable fusion of conv2d with bias addition
- Refactored GroupNorm, UNetBlock, SongUNet, SongUNetPosEmbd to support usage of
Apex GroupNorm, fusion of activation with GroupNorm, and AMP workflow.
- Updated SongUNetPosEmbd to avoid unnecessary HtoD Memcpy of pos_embd
- Updated from_checkpoint to accommodate conversion between Apex optimized ckp
and non-optimized ckp
- Refactored CorrDiff NVTX annotation workflow to be configurable
- Refactored ResidualLoss to support patch-accumulating training for
amortizing regression costs
- Explicit handling of Warp device for ball query and sdf
- Merged SongUNetPosLtEmb with SongUNetPosEmb, add support for batch>1
- Add lead time embedding support for positional_embedding_selector. Enable
arbitrary positioning of probabilistic variables
- Enable lead time aware regression without CE loss
- Bumped minimum PyTorch version from 2.0.0 to 2.4.0, to minimize
support surface for physicsnemo.distributed functionality.
Dependencies
- Made nvidia.dali an optional dependency
v1.0.1
v1.0.0
PhysicsNeMo (Core) General Release v1.0.0
Added
- DoMINO model architecture, datapipe and training recipe
- Added matrix decomposition scheme to improve graph partitioning
- DrivAerML dataset support in FIGConvNet example.
- Retraining recipe for DoMINO from a pretrained model checkpoint
- Prototype support for domain parallelism using ShardTensor (new).
- Enable DeviceMesh initialization via DistributedManager.
- Added Datacenter CFD use case.
- Add leave-in profiling utilities to physicsnemo, to easily enable torch/python/nsight
profiling in all aspects of the codebase.
Changed
- Refactored StormCast training example
- Enhancements and bug fixes to DoMINO model and training example
- Enhancement to parameterize DoMINO model with inlet velocity
- Moved non-dimensionalization out of domino datapipe to datapipe in domino example
- Updated utils in physicsnemo.launch.logging to avoid unnecessary wandb and mlflow
imports
- Moved to experiment-based Hydra config in Lagrangian-MGN example
- Make data caching optional in MeshDatapipe
- The use of older importlib_metadata library is removed
Deprecated
- ProcessGroupConfig is tagged for future deprecation in favor of DeviceMesh.
Fixed
- Update pytests to skip when the required dependencies are not present
- Bug in data processing script in domino training example
- Fixed NCCL_ASYNC_ERROR_HANDLING deprecation warning
Dependencies
- Remove the numpy dependency upper bound
- Moved pytz and nvtx to optional
- Update the base image for the Dockerfile
- Introduce Multi-Storage Client (MSC) as an optional dependency.
- Introduce wrapt as an optional dependency, needed when using
ShardTensor's automatic domain parallelism
v0.9.0
Modulus (core) general release v0.9.0
Added
- FIGConvUNet model and example.
- The Transolver model.
- The XAeroNet model.
- Incorporated CorrDiff-GEFS-HRRR model into CorrDiff, with lead-time aware SongUNet and
cross entropy loss.
Changed
- Refactored EDMPrecondSRV2 preconditioner and fixed the bug related to the metadata
- Extended the checkpointing utility to store metadata.
- Corrected missing export of logging function used by transolver model