Skip to content

Revamp python API#20

Merged
Neclow merged 8 commits intosbhattlab:mainfrom
Neclow:revamp_python
May 1, 2025
Merged

Revamp python API#20
Neclow merged 8 commits intosbhattlab:mainfrom
Neclow:revamp_python

Conversation

@Neclow
Copy link
Collaborator

@Neclow Neclow commented May 1, 2025

Apologies for the huge PR...

This PR includes a number of breaking changes, but which hopefully simplify the Python API. It adopts the philosophy that the Phylo2Vec vector/matrix objects are central.

With that in mind, the base includes all conversions from a P2V object to other objects (Newick and ancestry matrix). Each file in this module should have a from_xxx and a to_xxx. from_xxx converts xxx to P2V, to_xxx converts P2V to xxx.

The I/O is also simplified to 4 functions: save/load which save and load a vector/matrix. save_newick/load_newick which save and load a file containing a Newick string into a vector. We also separate I/O tests in a separate file.

We also simplify utils to include all operations on different tree objects: a phylo2vec vector (vector.py), a phylo2vec matrix (matrix.py), and Newick strings.

Finally, we simplify API calls to core functions:

The following functions are directly loaded into the main __init__.py:

  • from_ancestry, to_ancestry
  • from_newick, to_newick
  • load, load_newick
  • save, save_newick
  • sample_matrix, sample_vector

@Neclow Neclow requested a review from Copilot May 1, 2025 13:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR revamps the Python API for Phylo2Vec, simplifying conversions and I/O by centralizing vector/matrix operations and deprecating older utility modules. Key updates include removal of outdated utility files, introduction of consolidated modules for I/O and matrix operations, and the reorganization of public API functions.

  • Removal of validation, random, and legacy I/O modules in favor of new reader/writer modules.
  • Introduction of new base conversion functions (from_newick, to_newick, from_ancestry, to_ancestry) and updated benchmarks.
  • Updates to documentation and README to reflect the new API structure.

Reviewed Changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
py-phylo2vec/phylo2vec/utils/validation.py Deprecated validation functions are removed.
py-phylo2vec/phylo2vec/utils/random.py Legacy random utility functions removed.
py-phylo2vec/phylo2vec/utils/matrix.py New matrix operations with validation and sampling added.
py-phylo2vec/phylo2vec/utils/io.py Legacy I/O module removed in favor of new reader/writer modules.
py-phylo2vec/phylo2vec/utils/init.py Removed; reorganization of utility modules.
py-phylo2vec/phylo2vec/opt/_base.py Updated seeding to directly use random and numpy instead of seed_everything.
py-phylo2vec/phylo2vec/metrics/pairwise.py Updated to use check_vector and added docstrings.
py-phylo2vec/phylo2vec/io/writer.py New writer module for saving Newick and array files with minor docstring issues.
py-phylo2vec/phylo2vec/io/reader.py New reader module for loading vectors/matrices and Newick trees.
py-phylo2vec/phylo2vec/io/_validation.py New file validation functions with a minor typo in a docstring.
py-phylo2vec/phylo2vec/base/newick.py Added conversion functions between Newick and Phylo2Vec representations.
py-phylo2vec/phylo2vec/base/ancestry.py Updated ancestry conversion functions and API naming.
py-phylo2vec/phylo2vec/init.py Updated public API re-exports to reflect the new function names and modules.
py-phylo2vec/benchmarks/test_bench.py Added and improved docstrings for benchmark tests.
docs/_config.yml & README.md Updated documentation metadata and usage examples to align with the new API design.
Files not reviewed (1)
  • docs/api.rst: Language not supported
Comments suppressed due to low confidence (1)

py-phylo2vec/phylo2vec/io/writer.py:25

  • The docstring for the 'labels' parameter is incomplete ('A mapping of integer labels to , by default None'). Please update it to clearly state that it maps integer labels to taxa or similar descriptive terms.
labels : Optional[Dict[int, str]] = None,

@Neclow Neclow changed the title Revamp python Revamp python API May 1, 2025
@Neclow Neclow requested a review from Copilot May 1, 2025 13:49
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR revamps the Python API in Phylo2Vec by simplifying and reorganizing the conversion functions and I/O modules while deprecating legacy modules. Key changes include:

  • Removing or refactoring legacy modules (e.g. random, io, to_vector, to_newick) in favor of a more centralized API.
  • Introducing new modules for matrix operations, I/O (reader/writer), and ancestry conversion.
  • Updating the public init.py to re-export a streamlined set of functions for user convenience.

Reviewed Changes

Copilot reviewed 26 out of 27 changed files in this pull request and generated no comments.

Show a summary per file
File Description
py-phylo2vec/phylo2vec/utils/random.py Entire module removed, indicating removal of legacy random utilities.
py-phylo2vec/phylo2vec/utils/matrix.py New module added for matrix operations and validations.
py-phylo2vec/phylo2vec/utils/io.py Legacy I/O module removed in favor of new reader/writer modules.
py-phylo2vec/phylo2vec/opt/_base.py Updated seeding calls using direct random and numpy seeding.
py-phylo2vec/phylo2vec/metrics/pairwise.py Updated documentation and validation function usage.
py-phylo2vec/phylo2vec/io/writer.py, reader.py, _validation.py New modules for file I/O with enhanced validation.
py-phylo2vec/phylo2vec/base/(newick, ancestry).py New modules for conversion between Newick strings, vectors and ancestry matrices.
py-phylo2vec/phylo2vec/init.py Public API updated to re-export core functions.
py-phylo2vec/benchmarks/test_bench.py Updated benchmark tests and docstrings for clarity.
docs/_config.yml, README.md Updated documentation to reflect changes in API structure and repository ownership.
Files not reviewed (1)
  • docs/api.rst: Language not supported
Comments suppressed due to low confidence (1)

py-phylo2vec/benchmarks/test_bench.py:39

  • The docstring for test_to_vector is misleading; it indicates a benchmark for converting a vector to Newick format, but the test actually benchmarks converting a Newick string back to a vector. Consider updating the docstring to accurately reflect the functionality.
def test_to_vector(benchmark, sample_size):

@Neclow Neclow requested a review from Copilot May 1, 2025 13:53
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR revamps and simplifies the Python API for Phylo2Vec by streamlining conversions and I/O operations, and by reorganizing utility functions into clearly defined modules.

  • Removed legacy modules (e.g., utils/random, base/to_vector, base/to_newick) in favor of specialized and consolidated modules.
  • Introduced new modules for matrix operations, I/O (reader/writer and validation), and ancestry conversions; updated init.py and documentation accordingly.

Reviewed Changes

Copilot reviewed 26 out of 27 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
py-phylo2vec/phylo2vec/utils/random.py Removed legacy random utilities now consolidated elsewhere
py-phylo2vec/phylo2vec/utils/matrix.py Introduces new matrix utility functions for sampling and checks
py-phylo2vec/phylo2vec/utils/io.py Removed old I/O helpers in favor of new reader/writer modules
py-phylo2vec/phylo2vec/opt/_base.py Updated seeding approach with direct calls to random and NumPy
py-phylo2vec/phylo2vec/metrics/pairwise.py Enhanced docstrings and switched to a more specific validation
py-phylo2vec/phylo2vec/io/writer.py and reader.py New I/O modules to handle Newick and array file formats
py-phylo2vec/phylo2vec/base/* Removed deprecated base conversion files; introduced ancestry
py-phylo2vec/phylo2vec/init.py Reorganized public API to reflect the simplified structure
docs/_config.yml and README.md Updated documentation and examples to match the new API
Files not reviewed (1)
  • docs/api.rst: Language not supported

import phylo2vec._phylo2vec_core as core


def from_ancestry(ancestry: np.ndarray) -> np.ndarray:
Copy link

Copilot AI May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for from_ancestry is inconsistent: it states that it converts an ancestry matrix to a vector, but the Returns section describes an ancestry matrix. Please update the documentation to accurately reflect whether the function returns a vector or an ancestry matrix.

Copilot uses AI. Check for mistakes.
@Neclow Neclow requested a review from Copilot May 1, 2025 14:09
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR revamps the Python API for Phylo2Vec by reorganizing and simplifying conversion and I/O functions and consolidating various utility modules. Key changes include:

  • Removal and refactoring of several modules (e.g. random.py, to_vector.py, to_newick.py) in favor of more consistent API functions.
  • Introduction of new modules for matrix manipulation, I/O (reader/writer), and ancestry conversion.
  • Updates to documentation and examples to reflect the new API design.

Reviewed Changes

Copilot reviewed 26 out of 27 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
py-phylo2vec/phylo2vec/utils/random.py Removed random utility functions; seeding is now handled directly in other modules.
py-phylo2vec/phylo2vec/opt/_base.py Refactored seeding by replacing the call to seed_everything with individual seed calls.
py-phylo2vec/phylo2vec/base/to_vector.py Removed in favor of the consolidated from_newick API.
py-phylo2vec/phylo2vec/base/to_newick.py Removed; conversion to Newick is now handled by base/newick.py.
README.md Updated examples to import from the new API and replace deprecated function names.
Files not reviewed (1)
  • docs/api.rst: Language not supported
Comments suppressed due to low confidence (3)

py-phylo2vec/phylo2vec/utils/random.py:1

  • The removal of this file consolidates random seeding functionality; ensure that all consumers of these functions have been updated to use the new seeding mechanism.
Entire file removed

py-phylo2vec/phylo2vec/base/to_vector.py:1

  • The to_vector functionality has been removed in favor of from_newick; verify that documentation and dependent code reflect this breaking change.
Entire file removed

py-phylo2vec/phylo2vec/base/to_newick.py:1

  • Removal of this file means that conversion to Newick should now exclusively be handled via base/newick.py; ensure all user references are updated accordingly.
Entire file removed

Comment on lines +28 to 31
random.seed(self.random_seed)
np.random.seed(self.random_seed)

@staticmethod
Copy link

Copilot AI May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The removal of seed_everything leads to duplicated seeding calls; consider reintroducing a consolidated seeding utility for clearer intent and consistency.

Suggested change
random.seed(self.random_seed)
np.random.seed(self.random_seed)
@staticmethod
self.seed_everything(self.random_seed)
@staticmethod
def seed_everything(seed):
"""
Set the seed for all relevant random number generators.
Parameters
----------
seed : int
The seed value to use.
"""
random.seed(seed)
np.random.seed(seed)
@staticmethod

Copilot uses AI. Check for mistakes.
@Neclow Neclow merged commit 7ff6736 into sbhattlab:main May 1, 2025
7 checks passed
@Neclow Neclow deleted the revamp_python branch May 14, 2025 10:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants