Update README.md by ncfrey · Pull Request #69 · prescient-design/lobster

ncfrey · 2025-05-02T18:35:33Z

No description provided.

Copilot

Pull Request Overview

This PR updates the contributor section in the README.md to include additional contributors.

Updated contributor list in the README.md file
Revised GitHub links for new contributors

Comments suppressed due to low confidence (1)

README.md:5

[nitpick] Ensure that the order of the contributor names follows the project’s established guidelines, if applicable, to maintain consistency across the documentation.

`lobster` is a "batteries included" language model library for proteins and other biological sequences. Led by [Nathan Frey](https://github.com/ncfrey), [Karina Zadorozhny](https://github.com/karinazad), [Taylor Joren](https://github.com/taylormjs), [Sidney Lisanza](https://github.com/Sidney-Lisanza), [Aya Abdlesalam Ismail](https://github.com/ayaabdelsalam91), [Joseph Kleinhenz](https://github.com/kleinhenz) and [Allen Goodman](https://github.com/0x00b1), with many valuable contributions from [Contributors](docs/CONTRIBUTORS.md) across [Prescient Design, Genentech](https://www.gene.com/scientists/our-scientists/prescient-design).

* peer fixes, add evaluate method * dataloader checkpoint callback (#60) * dataloader callback * utils * ume * gitignore dev * tests * update flash attention wheels (#61) * lock * torch 2.5 * torch 2.5 * part * .env * unpin flash attn (#62) * fix scheduler params (#64) * scheduler * fix scheduler * fix scheduler * Add AtomicaDataset (#63) Processed Atomica interactions dataset * Ume conversion/interaction tokenizer + fix SMILES and nucleotide tokenizers (#65) add two special tokens: <convert> and <interact> for later stages of Ume training: will be used as this: (or something like that) [CLS] PROT_SEQ [SEP] <convert> PROT_STRUCT(masked) [SEP] [CLS] PROT_SEQ [SEP] <interact> SMILES(masked) [SEP] extend functionality of UmeTokenizerTransform to handle dual modalities change the name of Ume embedding method and allow embedding from existing input_ids fix existing tokenizers: add lowercase normalized to nucleotide tokenizer (OG2 dataset contains a mix of upper and lowercase letters) BPE handled SMILES tokenization incorrectly, switch to WordLevel * Ume SMILES tokenizer fix (#66) * tokenizer * fix tests * lowercase normalizer for nt * tests * remove mod conv dataset * embed * Test * merge 2mod into UmeTokenizerTransform * fix tests * all * type hints * docstrings * tests * fix SMILES tokenizer * switch all tokenizer to BPE * Revert "switch all tokenizer to BPE" This reverts commit 367e77d. * tok * fix SMILES tokenizer * remove print statement * Ume perplexity logging (#67) * pplx * tests * src * ignore torchmetrics warnings * docstrings * docstrings * Update README.md (#69) * Ume fix perplexity device (#68) * pplx as attr * pplx as attr * pplx * comments * on step * comment * update tests, fix ruff * ruff * ruff ruff * Add <cls_modality> to Ume tokenizers (#71) * add <cls_modality> tokens * add <cls_modality> tokens * docstring * RNS metric implementation (#73) * add <cls_modality> tokens * add <cls_modality> tokens * modality embeddings * module dict * embeddings * tests * modality and device * rank zero only * rank zero * fix back modality mask * sync dist * RNS implementation * restore from main * restore * docstrings * docstrings * review * test * Ume modality-specific embeddings (#72) * add <cls_modality> tokens * add <cls_modality> tokens * modality embeddings * module dict * embeddings * tests * modality and device * rank zero only * rank zero * fix back modality mask * sync dist * add conversion transforms (#74) * add initial smiles to peptide and peptide to smiles transforms * remove smiles -> * transforms and touch up conversion functions * rename * add option to randomize smiles and caps --------- Co-authored-by: Colin Grambow <grambowc@gene.com> * fix def pad token, replace process_and_embed w/ ume.embed * update tests w -100 pad token --------- Co-authored-by: Taylor Joren <joren.taylor@gene.com> Co-authored-by: Karina Zadorozhny <karina.zadorozhny@gmail.com> Co-authored-by: Nathan Frey <ncfrey@users.noreply.github.com> Co-authored-by: Colin Grambow <17198155+cgrambow@users.noreply.github.com> Co-authored-by: Colin Grambow <grambowc@gene.com>

Update README.md

5619aa2

ncfrey enabled auto-merge (squash) May 2, 2025 18:35

ncfrey requested a review from Copilot May 2, 2025 18:35

ncfrey temporarily deployed to test.pypi.org May 2, 2025 18:35 — with GitHub Actions Inactive

Copilot AI reviewed May 2, 2025

View reviewed changes

karinazad approved these changes May 2, 2025

View reviewed changes

ncfrey merged commit 5165de3 into main May 2, 2025
5 checks passed

ncfrey deleted the ncfrey-patch-1 branch May 2, 2025 18:39

taylormjs pushed a commit that referenced this pull request May 6, 2025

Update README.md (#69)

ec7287f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update README.md#69

Update README.md#69
ncfrey merged 1 commit intomainfrom
ncfrey-patch-1

ncfrey commented May 2, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

ncfrey commented May 2, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants