HaoZeke/rgpycrumbs

About


A pure-python computational library and CLI toolkit for chemical physics research. rgpycrumbs provides both importable library modules for computational tasks (surface fitting, structure analysis, interpolation) and a dispatcher-based CLI for running self-contained research scripts.

Heavy optional dependencies (JAX, SciPy, ASE) are resolved lazily at first use. A bare pip install rgpycrumbs gives the full API surface; the actual backend libraries load on demand from the current environment, a shared cache, or (with RGPYCRUMBS_AUTO_DEPS=1) via automatic uv installation. CUDA-aware resolution avoids pulling GPU libraries on CPU-only machines.

The library side offers:

  • Surface fitting (rgpycrumbs.surfaces) – JAX-based kernel methods (TPS, RBF, Matern, SE, IMQ) with gradient-enhanced variants for energy landscape interpolation
  • Structure analysis (rgpycrumbs.geom.analysis) – distance matrices, bond matrices, and fragment detection via ASE
  • IRA matching (rgpycrumbs.geom.ira) – iterative rotations and assignments for RMSD-based structure comparison
  • Interpolation (rgpycrumbs.interpolation) – spline interpolation utilities
  • Data types (rgpycrumbs.basetypes) – shared data structures for NEB paths, saddle searches, and molecular geometries

The CLI tools rely on optional dependencies fetched on-demand via PEP 723 + uv.

Ecosystem Overview

rgpycrumbs is the central hub of an interlinked suite of libraries.


CLI Design Philosophy

The library is designed with the following principles in mind:

  • Dispatcher-Based Architecture: The top-level rgpycrumbs.cli command acts as a lightweight dispatcher. It does not contain the core logic of the tools itself. Instead, it parses user commands to identify the target script and then invokes it in an isolated subprocess using the uv runner. This provides a unified command-line interface while keeping the tools decoupled.

  • Isolated & Reproducible Execution: Each script is a self-contained unit that declares its own dependencies via PEP 723 metadata. The uv runner uses this information to resolve and install the exact required packages into a temporary, cached environment on-demand. This design guarantees reproducibility and completely eliminates the risk of dependency conflicts between different tools in the collection.

  • Lightweight Core, On-Demand Dependencies: The installable rgpycrumbs package has minimal core dependencies (click, numpy). Heavy scientific libraries are available as optional extras (e.g. pip install rgpycrumbs[surfaces] for JAX). For CLI tools, dependencies are fetched by uv only when a script that needs them is executed. For library modules, ensure_import resolves dependencies at first use when RGPYCRUMBS_AUTO_DEPS=1 is set, with CUDA-aware resolution that avoids pulling GPU libraries on CPU-only machines. The base installation stays lightweight either way.

  • Modular & Extensible Tooling: Each utility is an independent script. This modularity simplifies development, testing, and maintenance, as changes to one tool cannot inadvertently affect another. New tools can be added to the collection without modifying the core dispatcher logic, making the system easily extensible.
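The PEP 723 mechanism described above can be illustrated with a minimal, hypothetical script header (the dependency list and script body are illustrative, not taken from an actual rgpycrumbs tool):

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "numpy",
# ]
# ///
# uv reads the metadata block above, resolves the declared dependencies
# into a cached, isolated environment, and then executes the script
# (e.g. `uv run my_tool.py`) without touching the caller's environment.
import numpy as np

print(np.linspace(0.0, 1.0, 5))
```

Because each script carries its own metadata, the dispatcher never needs to know what any individual tool depends on.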

Usage

Library API

The library modules can be imported directly. Dependencies resolve automatically when RGPYCRUMBS_AUTO_DEPS=1 is set (requires uv on PATH), or install extras explicitly:

# Surface fitting (requires jax: pip install rgpycrumbs[surfaces])
from rgpycrumbs.surfaces import get_surface_model
model = get_surface_model("tps")

# Structure analysis (requires ase, scipy: pip install rgpycrumbs[analysis])
from rgpycrumbs.geom.analysis import analyze_structure

# Spline interpolation (requires scipy: pip install rgpycrumbs[interpolation])
from rgpycrumbs.interpolation import spline_interp

# Data types (no extra deps)
from rgpycrumbs.basetypes import nebpath, SaddleMeasure

CLI Tools

The general command structure is:

python -m rgpycrumbs.cli [subcommand-group] [script-name] [script-options]

You can see the list of available command groups:

$ python -m rgpycrumbs.cli --help
Usage: rgpycrumbs [OPTIONS] COMMAND [ARGS]...

  A dispatcher that runs self-contained scripts using 'uv'.

Options:
  --help  Show this message and exit.

Commands:
  eon  Dispatches to a script within the 'eon' submodule.

eOn

  • Plotting NEB Paths (plt-neb)

    This script visualizes the energy landscape of Nudged Elastic Band (NEB) calculations, generating 2D surface plots with optional structure rendering.

    The default grad_imq method uses gradient-enhanced Inverse Multiquadric interpolation on 2D RMSD projections [1]. The approach projects high-dimensional structures onto 2D coordinates (reactant distance r vs product distance p) and fits a smooth surface using energy values and their gradients.

    [1] R. Goswami, "Two-dimensional RMSD projections for reaction path visualization and validation," MethodsX, p. 103851, Mar. 2026, doi: 10.1016/j.mex.2026.103851.
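The projection step can be sketched as follows. This is a toy illustration only: plain Euclidean distance stands in for the IRA-based RMSD used by the actual tool, and the array names are invented for the example.

```python
import numpy as np

def project_to_rp(images):
    """Project each image onto (r, p): distance to reactant and product.

    `images` is an (n_images, n_dof) array of flattened coordinates.
    A real implementation would use an alignment-aware RMSD (such as
    IRA) rather than the raw Euclidean distance used here.
    """
    reactant, product = images[0], images[-1]
    r = np.linalg.norm(images - reactant, axis=1)
    p = np.linalg.norm(images - product, axis=1)
    return np.column_stack([r, p])

# A toy path of three images with three degrees of freedom each.
path = np.array([[0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [2.0, 0.0, 0.0]])
rp = project_to_rp(path)
print(rp)  # reactant row has r = 0, product row has p = 0
```

The surface is then fit in this 2D (r, p) plane using the images' energies and gradients.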

    • Basic Usage

      python -m rgpycrumbs eon plt-neb --con-file trajectory.con --plot-type landscape -o neb_landscape.png
      
    • Key Options

      Option                               Description                                               Default
      --con-file PATH                      Trajectory file with NEB images                           None
      --plot-type                          landscape (2D surface) or profile                         profile
      --surface-type                       Surface method: grad_imq, grad_matern, grad_imq_ny, rbf  grad_imq
      --project-path / --no-project-path   Project to reaction valley coordinates                    --project-path
      --plot-structures                    Structure strip: none, all, crit_points                   none
      --show-legend                        Show colorbar legend                                      Off
      --show-pts / --no-show-pts           Show data points on surface                               --show-pts
      --landscape-path                     Path overlay: final, all, none                            final
      --ira-kmax                           kmax factor for IRA RMSD calculation                      1.8
      -o PATH                              Output image filename                                     None (display)

      Use --landscape-path all to overlay all optimization steps and visualize convergence. This shows the full trajectory from initial guess to final path [1].

    • Examples

      Full landscape with gradient-enhanced IMQ surface and critical point structures:

      python -m rgpycrumbs eon plt-neb \
        --con-file neb.con \
        --plot-type landscape \
        --project-path \
        --plot-structures crit_points \
        --surface-type grad_imq \
        --ira-kmax 14 \
        --show-legend \
        -o neb_landscape.png
      

      Surface-only plot without structure strip:

      python -m rgpycrumbs eon plt-neb \
        --plot-type landscape \
        --surface-type grad_matern \
        --no-show-pts \
        -o surface.png
      

      Convergence visualization with all optimization steps:

      python -m rgpycrumbs eon plt-neb \
        --con-file neb.con \
        --plot-type landscape \
        --landscape-path all \
        --surface-type grad_imq \
        --show-legend \
        -o neb_convergence.png
      
    • Surface Methods

      • grad_imq: Gradient-enhanced Inverse Multiquadric (recommended, uses energy + gradients)
      • grad_matern: Gradient-enhanced Matérn 5/2 (uses energy + gradients)
      • grad_imq_ny: Nyström-approximated grad_imq for large datasets (>1000 points)
      • rbf: Radial Basis Function / Thin Plate Spline (fast, no gradients)
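As a rough illustration of how these kernel interpolants work (not the project's actual implementation, which is JAX-based and gradient-enhanced), a plain IMQ interpolant can be fit with a single linear solve; all function names here are invented for the sketch:

```python
import numpy as np

def imq_kernel(x, y, eps=1.0):
    """Inverse Multiquadric kernel: phi(r) = 1 / sqrt(1 + (eps * r)^2)."""
    r = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    return 1.0 / np.sqrt(1.0 + (eps * r) ** 2)

def fit_imq(points, values, eps=1.0):
    """Solve K w = f for the interpolation weights."""
    K = imq_kernel(points, points, eps)
    return np.linalg.solve(K, values)

def predict_imq(weights, centers, query, eps=1.0):
    """Evaluate the fitted surface at the query points."""
    return imq_kernel(query, centers, eps) @ weights

# Toy 2D (r, p) points with associated "energies".
pts = np.array([[0.0, 2.0], [1.0, 1.0], [2.0, 0.0]])
E = np.array([0.0, 0.5, 0.1])
w = fit_imq(pts, E)
print(predict_imq(w, pts, pts))  # reproduces E at the training points
```

The gradient-enhanced variants extend the linear system with derivative constraints so the surface also matches the forces at each image.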
  • Splitting CON files (con-splitter)

    This script takes a multi-image trajectory file (e.g., from a finished NEB calculation) and splits it into individual frame files, creating an input file for a new calculation.

    To split a trajectory file:

    rgpycrumbs eon con-splitter neb_final_path.con -o initial_images
    

    This will create a directory named initial_images containing ipath_000.con, ipath_001.con, etc., along with an ipath.dat file listing their paths.
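The output layout can be sketched with plain pathlib; parsing the multi-image .con file into per-frame text is omitted here (the real tool handles that), and the helper name is invented for the example:

```python
from pathlib import Path
import tempfile

def write_frame_index(outdir, frames):
    """Write each frame's text to ipath_NNN.con and list them in ipath.dat.

    `frames` is a list of strings, one per already-split image; splitting
    a multi-image .con trajectory into those strings is left to the
    actual con-splitter tool.
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, frame in enumerate(frames):
        path = outdir / f"ipath_{i:03d}.con"
        path.write_text(frame)
        paths.append(path)
    # ipath.dat lists the per-image files, one path per line.
    (outdir / "ipath.dat").write_text("\n".join(str(p) for p in paths) + "\n")
    return paths

with tempfile.TemporaryDirectory() as tmp:
    written = write_frame_index(tmp, ["frame0\n", "frame1\n"])
    print([p.name for p in written])  # ['ipath_000.con', 'ipath_001.con']
```

The zero-padded numbering keeps the frames in lexicographic order, matching the order listed in ipath.dat.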

Contributing

All contributions are welcome. For new CLI tools, please follow the conventions of the existing scripts: keep each tool self-contained and declare its dependencies via PEP 723 metadata so the uv runner can resolve them.

Development

This project uses uv as the primary development tool with hatchling + hatch-vcs for building and versioning.

# Clone and install in development mode with test dependencies
uv sync --extra test

# Run the pure tests (no heavy optional deps)
uv run pytest -m pure

# Run interpolation tests (needs scipy)
uv run --extra interpolation pytest -m interpolation

Branch Structure

Development happens on the main branch. The readme branch is an auto-generated orphan containing only the rendered README.md and branding assets; it is the GitHub default branch.

When is pixi needed?

Pixi is only needed for features that require conda-only packages (not available on PyPI):

  • fragments tests: need tblite, ira, pyvista (conda)
  • surfaces tests: may prefer conda jax builds

For everything else, uv is sufficient.

Versioning

Versions are derived automatically from git tags via hatch-vcs (built on setuptools-scm). There is no manual version field; the version is the latest tag (e.g. v1.0.0). Between tags, dev versions are generated automatically (e.g. 1.0.1.dev3+gabcdef).
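The corresponding pyproject.toml wiring typically looks like this (a sketch of the standard hatchling + hatch-vcs setup, not a verbatim copy of this project's file):

```toml
[build-system]
requires = ["hatchling", "hatch-vcs"]
build-backend = "hatchling.build"

# Tell hatchling to derive the version from git tags instead of a
# hard-coded version field.
[tool.hatch.version]
source = "vcs"
```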

Release Process

# 1. Ensure tests pass
uv run --extra test pytest -m pure

# 2. Build changelog (uses towncrier fragments in docs/newsfragments/)
uvx towncrier build --version "v1.0.0"

# 3. Commit the changelog
git add CHANGELOG.rst && git commit -m "doc: release notes for v1.0.0"

# 4. Tag the release (hatch-vcs derives the version from this tag)
git tag -a v1.0.0 -m "Version 1.0.0"

# 5. Build and publish
uv build
uvx twine upload dist/*

License

MIT. However, this is an academic resource, so please cite it where possible via:

  • The Zenodo DOI for general use.
  • The wailord paper for ORCA usage.
