Fix output formatting for PEER evaluation callback by taylormjs · Pull Request #152 · prescient-design/lobster

taylormjs · 2025-07-18T22:20:03Z

- Convert NumPy scalars to Python types for clean YAML formatting
- Use clean enum values (e.g., 'bindingdb') instead of 'PEERTask.BINDINGDB'
- Move convert_numpy_to_python utility to _peer_utils.py for reusability
- Ensure consistent markdown formatting with other evaluation callbacks

This makes PEER evaluation results display cleanly in evaluation reports, matching the format of CALM and MoleculeACE callbacks.
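The enum change can be illustrated with a minimal sketch. The `PEERTask` class below is a hypothetical stand-in with a single member, not the definition from lobster; it assumes a plain `enum.Enum` with string values, which is what the 'PEERTask.BINDINGDB' output in the description suggests.

```python
from enum import Enum


class PEERTask(Enum):
    """Hypothetical stand-in for the real task enum (assumption)."""

    BINDINGDB = "bindingdb"


task = PEERTask.BINDINGDB

# str() on a plain Enum member includes the class name.
verbose_name = str(task)

# .value yields the short string that reads cleanly in a report.
clean_name = task.value

print(verbose_name)  # PEERTask.BINDINGDB
print(clean_name)    # bindingdb
```

Using `task.value` as the key for reports, caches, and stored probes keeps the names short and independent of the enum's class name.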




ncfrey · 2025-07-21T16:35:17Z

src/lobster/callbacks/_peer_utils.py


+def convert_numpy_to_python(obj):
+    """Recursively convert NumPy scalars to Python types for clean YAML formatting."""
+    if isinstance(obj, dict):


Use match/case here?

- Updated test splits to use only the most important split per task:
  * humanppi/yeastppi: only 'test' (removed cross_species_test)
  * bindingdb: only 'holdout_test' (removed random_test)
  * fold: only 'test_superfamily_holdout' for remote homology detection
  * secondarystructure: only 'cb513' benchmark
- Removed categories section and task averaging logic since each task now has a single split
- Added descriptive comments to fold and secondarystructure task constants
- Updated convert_numpy_to_python to use modern match-case syntax
- Updated tests to match simplified evaluation behavior
- Kept global mean aggregation across all tasks
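The simplified evaluation described in this commit can be pictured with the following sketch. The split names come from the commit message; the dictionary layout, the `global_mean` helper, and the placeholder scores are hypothetical and do not reflect the callback's actual data structures or results.

```python
from statistics import fmean

# One split per task, as listed in the commit message (layout is an assumption).
EVALUATION_SPLITS = {
    "humanppi": "test",
    "yeastppi": "test",
    "bindingdb": "holdout_test",
    "fold": "test_superfamily_holdout",
    "secondarystructure": "cb513",
}


def global_mean(task_scores: dict[str, float]) -> float:
    """Average one score per task, with no per-task split averaging."""
    return fmean(task_scores.values())


# Placeholder scores for illustration only, not real results.
example_scores = {task: 0.5 for task in EVALUATION_SPLITS}
example_mean = global_mean(example_scores)
```

With a single split per task, the per-task averaging step has nothing left to average, which is why only the global mean remains.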

Taylor Joren added 2 commits July 18, 2025 22:16

Update PEER evaluation callback tests for task naming changes

bb954e5

- Fix cache key assertions to use task.value instead of str(task)
- Fix probe storage assertions to use task.value instead of str(task)
- Ensure all tests pass with updated PEER callback implementation

taylormjs temporarily deployed to test.pypi.org July 18, 2025 22:27 — with GitHub Actions Inactive

taylormjs marked this pull request as ready for review July 18, 2025 22:55

taylormjs requested a review from karinazad July 21, 2025 15:45

ncfrey approved these changes Jul 21, 2025


taylormjs temporarily deployed to test.pypi.org July 21, 2025 18:35 — with GitHub Actions Inactive

taylormjs merged commit f53a078 into main Jul 21, 2025
5 checks passed

taylormjs deleted the t/eval-outputs branch July 21, 2025 18:42

taylormjs merged 3 commits into main from t/eval-outputs
