seqme
seqme is a modular, extensible Python library of model-agnostic metrics for evaluating biological sequence designs. It enables benchmarking and comparison of generative models for small molecules, DNA, RNA, peptides, and proteins.
Key features:
Metrics: A collection of sequence-, embedding-, and property-based metrics for evaluating the designs produced by generative models.
Models: Out-of-the-box, pre-trained property and embedding models for small molecules, DNA, RNA, peptides, and proteins.
Visualizations: Functionality to display metric results from single-shot and iterative optimization methods as tables and plots.
Is a metric or model missing? seqme’s modular metric and third-party model interfaces make adding your own easy.
Quick start
Install seqme together with the ESM-2 protein language model:
pip install "seqme[esm2]"
Run in a Jupyter notebook:
import seqme as sm

# Named groups of sequences to compare.
sequences = {
    "Random": ["MKQW", "RKSPL"],
    "UniProt": ["KKWQ", "RKSPL", "RASD"],
    "HydrAMP": ["MMRK", "RKSPL", "RRLSK", "RRLSK"],
}

# Cache holding the embedding model used by embedding-based metrics.
cache = sm.Cache(
    models={
        "esm2": sm.models.ESM2(
            model_name="facebook/esm2_t6_8M_UR50D", batch_size=256, device="cpu"
        )
    }
)

metrics = [
    sm.metrics.Uniqueness(),
    sm.metrics.Novelty(reference=sequences["UniProt"]),
    sm.metrics.FBD(reference=sequences["Random"], embedder=cache.model("esm2")),
]

df = sm.evaluate(sequences, metrics)
sm.show(df)  # Note: the table is only displayed in a notebook.
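To build intuition for the two sequence-based metrics in the example, here is a minimal plain-Python sketch of how uniqueness and novelty are commonly defined. It is not seqme's implementation: the function names and the exact definitions (fractions computed over all generated sequences, duplicates included) are assumptions made for illustration, and the values seqme reports may follow a different convention.

```python
from typing import Sequence


def uniqueness(sequences: Sequence[str]) -> float:
    """Fraction of sequences that are distinct (assumed definition)."""
    if not sequences:
        raise ValueError("need at least one sequence")
    return len(set(sequences)) / len(sequences)


def novelty(sequences: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of sequences absent from the reference set (assumed definition)."""
    if not sequences:
        raise ValueError("need at least one sequence")
    known = set(reference)  # set lookup keeps the membership test O(1)
    return sum(seq not in known for seq in sequences) / len(sequences)


# Same toy data as the quick-start example.
hydramp = ["MMRK", "RKSPL", "RRLSK", "RRLSK"]
uniprot = ["KKWQ", "RKSPL", "RASD"]

print(uniqueness(hydramp))        # 0.75: three distinct sequences out of four
print(novelty(hydramp, uniprot))  # 0.75: only "RKSPL" appears in the reference
```

Under these assumed definitions, both metrics are ratios in [0, 1], so they can be compared across groups of different sizes.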