MMSeqs2 DB slimmer

Hi
this is not an issue but a potential enhancement we discussed with @martin-steinegger.
We have a seed clustering database that is continuously updated with new sequences. The DB is growing quite fast, and eventually we will have problems storing and distributing it. Each cluster contains many redundant sequences, so we thought that a module that takes a DB and filters it with a criterion similar to `--diff` from `result2msa` or `result2profile` would be very useful for keeping only the informative sequences in each cluster.

Thanks
Antonio

MMSeqs2 DB slimmer #316