Dear PECOS team,
I am working as a data scientist at the German National Library. We have tested pecos/xtransformer for our own multi-label classification tasks, and it promises a significant improvement over our existing state of the art.
Our production environment is built around a toolkit called Annif, which will soon integrate pecos/xtransformer thanks to the work of other collaborators.
We have observed recent efforts to refresh the BERT generation of encoder models, such as ModernBERT or EuroBERT (see the references below). These promise compatibility with previous BERT applications while improving speed and context length.
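To make the compatibility point concrete, a minimal check along the following lines should work with a recent transformers release; the model identifier is our assumption for illustration and has not been verified against XTransformer itself:

```python
# Minimal sketch: the new encoder generation loads through the standard
# Hugging Face Auto classes, just like the current BERT-style models.
# "answerdotai/ModernBERT-base" is assumed here for illustration and
# requires a transformers release recent enough to ship ModernBERT support.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
encoder = AutoModel.from_pretrained("answerdotai/ModernBERT-base")

batch = tokenizer(["A sample catalogue record title"], return_tensors="pt")
outputs = encoder(**batch)
print(outputs.last_hidden_state.shape)  # (batch_size, seq_len, hidden_size)
```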
Currently, the allowed encoder classes for xtransformer are hard-coded in the network.py file.
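To make our question more concrete, the kind of change we imagine is sketched below. The names TransformerModelClass and ENCODER_CLASSES mirror what we believe network.py uses for the existing BERT/RoBERTa entries, but they are re-declared here only so the sketch is self-contained, and we may well have misread the file:

```python
# Rough sketch of the registration pattern we have in mind; all PECOS-side
# names (TransformerModelClass, ENCODER_CLASSES) are our guesses and are
# re-declared locally so that the snippet runs on its own.
from collections import namedtuple

from transformers import AutoConfig, AutoModel, AutoTokenizer

TransformerModelClass = namedtuple(
    "TransformerModelClass", ["config_class", "model_class", "tokenizer_class"]
)

# Routing new BERT-style encoders through the Auto classes would avoid adding
# one hard-coded entry per architecture.
ENCODER_CLASSES = {
    "modernbert": TransformerModelClass(AutoConfig, AutoModel, AutoTokenizer),
}

entry = ENCODER_CLASSES["modernbert"]
tokenizer = entry.tokenizer_class.from_pretrained("answerdotai/ModernBERT-base")
encoder = entry.model_class.from_pretrained("answerdotai/ModernBERT-base")
```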
What steps would be necessary to make XTransformer work with this new model generation as well?
Could you give us some directions so that we can propose a pull request to accomplish this?
Best,
Maximilian
References