This repository was archived by the owner on Mar 10, 2026. It is now read-only.

feat(core): Implement Geometry and Identifier Utilities#10

Merged
TKanX merged 28 commits into main from
feature/9-implement-foundational-utils-module-for-geometry-and-parsing
Jul 5, 2025
@TKanX TKanX commented Jul 5, 2025

Summary:

Introduces a foundational utils module containing essential tools for geometric calculations and standardized atom/residue identification. It includes a geometry submodule with functions for structural analysis and manipulation (e.g., dihedral angles, RMSD) and an identifiers submodule for fast, standardized atom classification and sorting using perfect hash functions. Additionally, enhances the core data model by introducing a strongly-typed ResidueType enum, providing a more robust and semantic understanding of standard amino acids.

Changes:

  • Added Geometry Utilities (utils::geometry):

    • Implemented functions for core geometric calculations using nalgebra:
      • dihedral_angle: Calculates the dihedral angle between four points.
      • calculate_cb_position: Computes the C-beta atom position based on backbone geometry.
      • calculate_hn_position: Determines the position of the amide proton.
      • generate_sp3_hydrogens: Generates hydrogen atoms for sp3-hybridized centers.
    • Provided structural analysis metrics:
      • calculate_rmsd: Calculates the Root Mean Square Deviation between two sets of coordinates.
      • calculate_named_rmsd: A variant of RMSD that operates on named atoms stored in a HashMap.
      • find_max_atom_deviation: Finds the atom with the largest positional difference.
    • Included a comprehensive suite of unit tests for all geometry functions.
  • Implemented Identifier Utilities (utils::identifiers):

    • Used the phf crate to create high-performance, compile-time static maps and sets for atom name lookups.
    • is_backbone_atom: A function to quickly check if an atom is part of the protein backbone.
    • is_heavy_atom: A utility to distinguish heavy atoms from hydrogens.
    • residue_atom_order: Provides a canonical sorting order for atoms within a residue, crucial for standardized file formats and analysis.
  • Introduced a Strong Typing System for Residues:

    • Created a ResidueType enum to represent standard amino acids (e.g., Alanine, Glycine) and their variants (e.g., Histidine protonation states).
    • Implemented FromStr for parsing three-letter codes into ResidueType.
    • Updated the Residue struct to include an Option<ResidueType>, adding rich semantic information.
    • Modified the MolecularSystem and BgfFile parser to handle and store this new ResidueType.
  • Added Dependencies and Scaffolding:

    • Added phf as a new dependency for identifier utilities.
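
As a rough illustration of two of the geometry utilities listed above, here is a minimal sketch of a dihedral angle and an RMSD calculation. It is not the code from this PR: the PR's functions are built on nalgebra, while this sketch uses plain arrays so it compiles without external crates. The names `dihedral_angle_deg` and `rmsd`, the return types, the degree units, and the absence of any superposition step before the RMSD are all assumptions made for the example.

```rust
// Illustrative sketch only. The PR uses nalgebra types; plain arrays are
// used here so the example has no external dependencies.
type Vec3 = [f64; 3];

fn sub(a: Vec3, b: Vec3) -> Vec3 {
    [a[0] - b[0], a[1] - b[1], a[2] - b[2]]
}

fn dot(a: Vec3, b: Vec3) -> f64 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

fn norm(a: Vec3) -> f64 {
    dot(a, a).sqrt()
}

/// Signed dihedral angle in degrees for the chain p1-p2-p3-p4
/// (hypothetical signature, not taken from the PR).
fn dihedral_angle_deg(p1: Vec3, p2: Vec3, p3: Vec3, p4: Vec3) -> f64 {
    let b1 = sub(p2, p1);
    let b2 = sub(p3, p2);
    let b3 = sub(p4, p3);
    // Normals of the planes (p1, p2, p3) and (p2, p3, p4).
    let n1 = cross(b1, b2);
    let n2 = cross(b2, b3);
    // atan2 form: numerically stable near 0 and 180 degrees.
    let y = dot(cross(n1, n2), b2) / norm(b2);
    let x = dot(n1, n2);
    y.atan2(x).to_degrees()
}

/// RMSD between two equally sized coordinate sets, with no superposition.
/// Returns None for empty or mismatched inputs (an assumed convention).
fn rmsd(a: &[Vec3], b: &[Vec3]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let sum_sq: f64 = a
        .iter()
        .zip(b)
        .map(|(p, q)| {
            let d = sub(*p, *q);
            dot(d, d)
        })
        .sum();
    Some((sum_sq / a.len() as f64).sqrt())
}
```

The `atan2` formulation is chosen over `acos` of the normalized dot product because it keeps the sign of the angle and avoids precision loss when the two planes are nearly parallel.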

TKanX added 25 commits July 3, 2025 13:14
…SP3 hydrogen positions based on neighbor vectors
… maximum deviation between two sets of named coordinates
@TKanX TKanX self-assigned this Jul 5, 2025
Copilot AI review requested due to automatic review settings July 5, 2025 02:28
@TKanX TKanX added the enhancement ✨ New feature or request label Jul 5, 2025
@TKanX TKanX linked an issue Jul 5, 2025 that may be closed by this pull request
16 tasks

Copilot AI left a comment

Pull Request Overview

Adds a new utils module for geometric and identifier utilities, and extends the core data model with a strongly-typed ResidueType.

  • Introduces utils::geometry for structural calculations (dihedral angles, RMSD, hydrogen placement).
  • Implements utils::identifiers with PHF-based maps to classify and order atoms.
  • Extends Residue and MolecularSystem to carry an optional ResidueType, updating parser and tests accordingly.
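
To make the optional `ResidueType` concrete, the following is a minimal sketch of an enum with a `FromStr` implementation and a residue that carries `Option<ResidueType>`. Only the names `ResidueType`, `Residue`, and `res_type` come from the PR. The variant list is abbreviated (the histidine protonation variants are left out), and the error type, the field layout, and the `Residue::new` signature are hypothetical.

```rust
use std::str::FromStr;

// Abbreviated, illustrative variant list; the PR's enum covers the standard
// amino acids and their variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ResidueType {
    Alanine,
    Glycine,
    Histidine,
}

// Hypothetical error type carrying the code that failed to parse.
#[derive(Debug, PartialEq, Eq)]
struct ParseResidueTypeError(String);

impl FromStr for ResidueType {
    type Err = ParseResidueTypeError;

    // Parses a three-letter code, ignoring case and surrounding whitespace.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_uppercase().as_str() {
            "ALA" => Ok(ResidueType::Alanine),
            "GLY" => Ok(ResidueType::Glycine),
            "HIS" => Ok(ResidueType::Histidine),
            other => Err(ParseResidueTypeError(other.to_string())),
        }
    }
}

// Hypothetical layout: the raw name is always kept, the typed form is optional.
struct Residue {
    name: String,
    res_type: Option<ResidueType>,
}

impl Residue {
    // Assumed constructor shape; the PR's Residue::new may take other arguments.
    fn new(name: &str, res_type: Option<ResidueType>) -> Self {
        Residue {
            name: name.to_string(),
            res_type,
        }
    }
}
```

With this shape, a parser can call `name.parse().ok()` and pass the result straight through, so non-standard residues such as waters or ligands end up with `res_type == None` instead of causing a parse failure.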

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| crates/scream-core/src/core/utils/mod.rs | Exposes new geometry and identifiers submodules |
| crates/scream-core/src/core/utils/identifiers.rs | Adds fast, static maps and lookup/order functions for atom names |
| crates/scream-core/src/core/utils/geometry.rs | Implements geometric functions and comprehensive unit tests |
| crates/scream-core/src/core/models/system.rs | Updates add_residue to accept res_type, adjusts calls/tests |
| crates/scream-core/src/core/models/residue.rs | Introduces ResidueType enum, parsing, and extends Residue API |
| crates/scream-core/src/core/io/bgf.rs | Passes parsed res_type into the system when reading BGF files |
| crates/scream-core/src/core/mod.rs | Registers the utils module |
| crates/scream-core/Cargo.toml | Adds phf dependency for identifier utilities |
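
For a sense of what the lookup and ordering functions in identifiers.rs might look like, here is a dependency-free sketch. It deliberately swaps the PR's phf maps for a `matches!` expression and a small constant array, so it shows the interface idea and not the perfect-hash implementation. The backbone set (N, CA, C, O), the hydrogen naming rule, the ordering policy, and the helper names `atom_sort_key` and `sort_atoms` are assumptions for illustration.

```rust
/// True for backbone atom names. The set N, CA, C, O is an assumption;
/// the PR's static set may contain additional names.
fn is_backbone_atom(name: &str) -> bool {
    matches!(name.trim(), "N" | "CA" | "C" | "O")
}

/// True for non-hydrogen atoms. Assumes hydrogen names start with 'H',
/// optionally preceded by digits (for example "HA" or "1HB").
fn is_heavy_atom(name: &str) -> bool {
    let stripped = name
        .trim()
        .trim_start_matches(|c: char| c.is_ascii_digit());
    !stripped.is_empty() && !stripped.starts_with('H')
}

/// Hypothetical sort key: backbone atoms first in a fixed order, then every
/// other atom in alphabetical order.
fn atom_sort_key(name: &str) -> (usize, &str) {
    const BACKBONE_ORDER: [&str; 4] = ["N", "CA", "C", "O"];
    match BACKBONE_ORDER.iter().position(|&b| b == name) {
        Some(i) => (i, ""),
        None => (BACKBONE_ORDER.len(), name),
    }
}

/// Sorts atom names in place using the key above.
fn sort_atoms(names: &mut [&str]) {
    names.sort_by(|a, b| atom_sort_key(a).cmp(&atom_sort_key(b)));
}
```

A compile-time phf set or map serves the same role as the `matches!` arm and the constant array here, and scales better once the tables hold per-residue atom lists instead of four names.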
Comments suppressed due to low confidence (2)

crates/scream-core/src/core/models/residue.rs:146

  • Add a unit test for Residue::new to verify that the res_type field is initialized correctly when provided.
    pub(crate) fn new(

crates/scream-core/src/core/utils/geometry.rs:1

  • [nitpick] Consider adding Rust doc comments for public functions and types in the geometry module to improve documentation and discoverability of the utilities.
use nalgebra::{Point3, Rotation3, Unit, Vector3};

@TKanX TKanX merged commit a6621d4 into main Jul 5, 2025
2 checks passed
@TKanX TKanX deleted the feature/9-implement-foundational-utils-module-for-geometry-and-parsing branch July 5, 2025 02:47
Development

Successfully merging this pull request may close these issues.

Implement Foundational utils Module for Geometry and Parsing

2 participants