
comparing fa2 #112

Merged
ncfrey merged 6 commits into main from n/comparing-fa2
Jun 17, 2025

Conversation

ncfrey (Contributor) commented Jun 17, 2025

No description provided.

ncfrey requested a review from Copilot June 17, 2025 17:50
ncfrey temporarily deployed to test.pypi.org June 17, 2025 17:50 — with GitHub Actions (Inactive)
ncfrey requested a review from karinazad June 17, 2025 17:50
Copilot AI (Contributor) left a comment


Pull Request Overview

This PR refines the attention and padding configuration for FA2 (flash vs SDPA attention) and adds helper scripts for comparing embedding consistency across modes.

  • Enhanced type safety with Literal for attention_layer and padding in FlexBertConfig
  • Unified padding/attention-mask logic in Ume.__init__ and new Ume.load_from_checkpoint
  • Added example and test scripts to validate SDPA vs flash-attention embedding consistency
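
The kind of consistency check these example scripts perform can be sketched as follows. This is a hypothetical minimal version, not the PR's actual `compare_flash_attention_embeddings.py`: it contrasts explicit softmax attention against torch's fused SDPA kernel on CPU (true FA2 requires a CUDA GPU), but the comparison pattern (run both paths, report the max absolute difference, assert closeness) is the same.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# (batch, heads, seq_len, head_dim)
q = torch.randn(1, 4, 8, 16)
k = torch.randn(1, 4, 8, 16)
v = torch.randn(1, 4, 8, 16)

# Reference path: explicit scaled-dot-product attention
scores = q @ k.transpose(-2, -1) / (16 ** 0.5)
ref = F.softmax(scores, dim=-1) @ v

# Fused path: torch's built-in SDPA kernel
out = F.scaled_dot_product_attention(q, k, v)

max_diff = (ref - out).abs().max().item()
print(f"max abs diff: {max_diff:.2e}")
assert torch.allclose(ref, out, atol=1e-5)
```

The same structure applies when the second path is FA2: only the tolerance typically needs loosening, since flash attention reorders floating-point accumulation.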

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Summary of changes per file:

| File | Description |
| --- | --- |
| src/lobster/model/modern_bert/_config.py | Added Literal type hints for the attention_layer and padding parameters |
| src/lobster/model/_ume.py | Refactored padding/mask logic; introduced a load_from_checkpoint method |
| src/lobster/model/README.md | Documented the UME model and its dynamic checkpoint loader |
| slurm/scripts/test_fa2_ume.sh | New SLURM submission script for the SDPA vs flash-attention test |
| pyproject.toml | Pinned triton<=3.1.0; excluded examples from linting |
| examples/test_sdpa_unpadded_attention_mask.py | New test script to inspect attention masks |
| examples/test_embed_fa2.py | Simple embedding example for flash-attention |
| examples/compare_flash_attention_embeddings.py | Comprehensive script comparing embeddings across modes |
Comments suppressed due to low confidence (4)

src/lobster/model/_ume.py:158

  • [nitpick] Configuration logic for padding and masks is duplicated between __init__ and load_from_checkpoint. Consider extracting this into a shared helper to adhere to DRY principles.
        if ckpt_path is not None:

src/lobster/model/_ume.py:791

  • Restricting device to exactly "cpu" or "cuda" excludes valid strings like "cuda:0" or custom devices. Consider accepting a torch.device object or allowing any string that starts with "cuda" for greater flexibility.
            if device not in ["cpu", "cuda"]:

src/lobster/model/_ume.py:752

  • The new load_from_checkpoint method isn’t covered by existing unit tests. Adding targeted tests for CPU vs GPU loading, verifying padding and use_sdpa_attn_mask settings, would help prevent regressions.
    def load_from_checkpoint(

src/lobster/model/README.md:3

  • [nitpick] The link reference to _ume.py may not resolve correctly in rendered markdown. Consider linking to the Ume class documentation or using a direct path to the file.
## [Universal Molecular Encoder](_ume.py)

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
ncfrey temporarily deployed to test.pypi.org June 17, 2025 17:54 — with GitHub Actions (Inactive)
A collaborator commented:


let's rename test_ to something else so it's not accidentally picked up by pytest

A collaborator commented:


same as above

ncfrey temporarily deployed to test.pypi.org June 17, 2025 17:56 — with GitHub Actions (Inactive)
ncfrey temporarily deployed to test.pypi.org June 17, 2025 18:00 — with GitHub Actions (Inactive)
A collaborator commented:


same as here

"""
# Determine device
if device is not None:
if device not in ["cpu", "cuda"]:
A collaborator commented:


device could be something like "cuda:0" ?

ncfrey merged commit fd2fbe2 into main Jun 17, 2025
5 checks passed
ncfrey deleted the n/comparing-fa2 branch June 17, 2025 21:57