Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR revamps the Python API for Phylo2Vec, simplifying conversions and I/O by centralizing vector/matrix operations and deprecating older utility modules. Key updates include removal of outdated utility files, introduction of consolidated modules for I/O and matrix operations, and the reorganization of public API functions.
- Removal of validation, random, and legacy I/O modules in favor of new reader/writer modules.
- Introduction of new base conversion functions (from_newick, to_newick, from_ancestry, to_ancestry) and updated benchmarks.
- Updates to documentation and README to reflect the new API structure.
Reviewed Changes
Copilot reviewed 25 out of 26 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| py-phylo2vec/phylo2vec/utils/validation.py | Deprecated validation functions are removed. |
| py-phylo2vec/phylo2vec/utils/random.py | Legacy random utility functions removed. |
| py-phylo2vec/phylo2vec/utils/matrix.py | New matrix operations with validation and sampling added. |
| py-phylo2vec/phylo2vec/utils/io.py | Legacy I/O module removed in favor of new reader/writer modules. |
| py-phylo2vec/phylo2vec/utils/init.py | Removed; reorganization of utility modules. |
| py-phylo2vec/phylo2vec/opt/_base.py | Updated seeding to directly use random and numpy instead of seed_everything. |
| py-phylo2vec/phylo2vec/metrics/pairwise.py | Updated to use check_vector and added docstrings. |
| py-phylo2vec/phylo2vec/io/writer.py | New writer module for saving Newick and array files with minor docstring issues. |
| py-phylo2vec/phylo2vec/io/reader.py | New reader module for loading vectors/matrices and Newick trees. |
| py-phylo2vec/phylo2vec/io/_validation.py | New file validation functions with a minor typo in a docstring. |
| py-phylo2vec/phylo2vec/base/newick.py | Added conversion functions between Newick and Phylo2Vec representations. |
| py-phylo2vec/phylo2vec/base/ancestry.py | Updated ancestry conversion functions and API naming. |
| py-phylo2vec/phylo2vec/init.py | Updated public API re-exports to reflect the new function names and modules. |
| py-phylo2vec/benchmarks/test_bench.py | Added and improved docstrings for benchmark tests. |
| docs/_config.yml & README.md | Updated documentation metadata and usage examples to align with the new API design. |
Files not reviewed (1)
- docs/api.rst: Language not supported
Comments suppressed due to low confidence (1)
py-phylo2vec/phylo2vec/io/writer.py:25
- The docstring for the 'labels' parameter is incomplete ('A mapping of integer labels to , by default None'). Please update it to clearly state that it maps integer labels to taxa or similar descriptive terms.
labels : Optional[Dict[int, str]] = None,
There was a problem hiding this comment.
Pull Request Overview
This PR revamps the Python API in Phylo2Vec by simplifying and reorganizing the conversion functions and I/O modules while deprecating legacy modules. Key changes include:
- Removing or refactoring legacy modules (e.g. random, io, to_vector, to_newick) in favor of a more centralized API.
- Introducing new modules for matrix operations, I/O (reader/writer), and ancestry conversion.
- Updating the public init.py to re-export a streamlined set of functions for user convenience.
Reviewed Changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| py-phylo2vec/phylo2vec/utils/random.py | Entire module removed, indicating removal of legacy random utilities. |
| py-phylo2vec/phylo2vec/utils/matrix.py | New module added for matrix operations and validations. |
| py-phylo2vec/phylo2vec/utils/io.py | Legacy I/O module removed in favor of new reader/writer modules. |
| py-phylo2vec/phylo2vec/opt/_base.py | Updated seeding calls using direct random and numpy seeding. |
| py-phylo2vec/phylo2vec/metrics/pairwise.py | Updated documentation and validation function usage. |
| py-phylo2vec/phylo2vec/io/writer.py, reader.py, _validation.py | New modules for file I/O with enhanced validation. |
| py-phylo2vec/phylo2vec/base/(newick, ancestry).py | New modules for conversion between Newick strings, vectors and ancestry matrices. |
| py-phylo2vec/phylo2vec/init.py | Public API updated to re-export core functions. |
| py-phylo2vec/benchmarks/test_bench.py | Updated benchmark tests and docstrings for clarity. |
| docs/_config.yml, README.md | Updated documentation to reflect changes in API structure and repository ownership. |
Files not reviewed (1)
- docs/api.rst: Language not supported
Comments suppressed due to low confidence (1)
py-phylo2vec/benchmarks/test_bench.py:39
- The docstring for test_to_vector is misleading; it indicates a benchmark for converting a vector to Newick format, but the test actually benchmarks converting a Newick string back to a vector. Consider updating the docstring to accurately reflect the functionality.
def test_to_vector(benchmark, sample_size):
There was a problem hiding this comment.
Pull Request Overview
This PR revamps and simplifies the Python API for Phylo2Vec by streamlining conversions and I/O operations, and by reorganizing utility functions into clearly defined modules.
- Removed legacy modules (e.g., utils/random, base/to_vector, base/to_newick) in favor of specialized and consolidated modules.
- Introduced new modules for matrix operations, I/O (reader/writer and validation), and ancestry conversions; updated init.py and documentation accordingly.
Reviewed Changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| py-phylo2vec/phylo2vec/utils/random.py | Removed legacy random utilities now consolidated elsewhere |
| py-phylo2vec/phylo2vec/utils/matrix.py | Introduces new matrix utility functions for sampling and checks |
| py-phylo2vec/phylo2vec/utils/io.py | Removed old I/O helpers in favor of new reader/writer modules |
| py-phylo2vec/phylo2vec/opt/_base.py | Updated seeding approach with direct calls to random and NumPy |
| py-phylo2vec/phylo2vec/metrics/pairwise.py | Enhanced docstrings and switched to a more specific validation |
| py-phylo2vec/phylo2vec/io/writer.py and reader.py | New I/O modules to handle Newick and array file formats |
| py-phylo2vec/phylo2vec/base/* | Removed deprecated base conversion files; introduced ancestry |
| py-phylo2vec/phylo2vec/init.py | Reorganized public API to reflect the simplified structure |
| docs/_config.yml and README.md | Updated documentation and examples to match the new API |
Files not reviewed (1)
- docs/api.rst: Language not supported
| import phylo2vec._phylo2vec_core as core | ||
|
|
||
|
|
||
| def from_ancestry(ancestry: np.ndarray) -> np.ndarray: |
There was a problem hiding this comment.
The docstring for from_ancestry is inconsistent: it states that it converts an ancestry matrix to a vector, but the Returns section describes an ancestry matrix. Please update the documentation to accurately reflect whether the function returns a vector or an ancestry matrix.
There was a problem hiding this comment.
Pull Request Overview
This PR revamps the Python API for Phylo2Vec by reorganizing and simplifying conversion and I/O functions and consolidating various utility modules. Key changes include:
- Removal and refactoring of several modules (e.g. random.py, to_vector.py, to_newick.py) in favor of more consistent API functions.
- Introduction of new modules for matrix manipulation, I/O (reader/writer), and ancestry conversion.
- Updates to documentation and examples to reflect the new API design.
Reviewed Changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| py-phylo2vec/phylo2vec/utils/random.py | Removed random utility functions; seeding is now handled directly in other modules. |
| py-phylo2vec/phylo2vec/opt/_base.py | Refactored seeding by replacing the call to seed_everything with individual seed calls. |
| py-phylo2vec/phylo2vec/base/to_vector.py | Removed in favor of the consolidated from_newick API. |
| py-phylo2vec/phylo2vec/base/to_newick.py | Removed; conversion to Newick is now handled by base/newick.py. |
| README.md | Updated examples to import from the new API and replace deprecated function names. |
Files not reviewed (1)
- docs/api.rst: Language not supported
Comments suppressed due to low confidence (3)
py-phylo2vec/phylo2vec/utils/random.py:1
- The removal of this file consolidates random seeding functionality; ensure that all consumers of these functions have been updated to use the new seeding mechanism.
Entire file removed
py-phylo2vec/phylo2vec/base/to_vector.py:1
- The to_vector functionality has been removed in favor of from_newick; verify that documentation and dependent code reflect this breaking change.
Entire file removed
py-phylo2vec/phylo2vec/base/to_newick.py:1
- Removal of this file means that conversion to Newick should now exclusively be handled via base/newick.py; ensure all user references are updated accordingly.
Entire file removed
| random.seed(self.random_seed) | ||
| np.random.seed(self.random_seed) | ||
|
|
||
| @staticmethod |
There was a problem hiding this comment.
[nitpick] The removal of seed_everything leads to duplicated seeding calls; consider reintroducing a consolidated seeding utility for clearer intent and consistency.
| random.seed(self.random_seed) | |
| np.random.seed(self.random_seed) | |
| @staticmethod | |
| self.seed_everything(self.random_seed) | |
| @staticmethod | |
| def seed_everything(seed): | |
| """ | |
| Set the seed for all relevant random number generators. | |
| Parameters | |
| ---------- | |
| seed : int | |
| The seed value to use. | |
| """ | |
| random.seed(seed) | |
| np.random.seed(seed) | |
| @staticmethod |
Apologies for the huge PR...
This PR includes a number of breaking changes, but which hopefully simplify the Python API. It adopts the philosophy that the Phylo2Vec vector/matrix objects are central.
With that in mind, the
baseincludes all conversions from a P2V object to other objects (Newick and ancestry matrix). Each file in this module should have afrom_xxxand ato_xxx. from_xxx converts xxx to P2V, to_xxx converts P2V to xxx.The I/O is also simplified to 4 functions:
save/loadwhich save and load a vector/matrix.save_newick/load_newickwhich save and load a file containing a Newick string into a vector. We also separate I/O tests in a separate file.We also simplify
utilsto include all operations on different tree objects: a phylo2vec vector (vector.py), a phylo2vec matrix (matrix.py), and Newick strings.Finally, we simplify API calls to core functions:
The following functions are directly loaded into the main
__init__.py: