add MIEB results and rename model to pass tests#122
When pointing embeddings-benchmark/mteb#2035 to this branch, it seems like MIEB results cannot be displayed due to "Number of parameters".
@gowitheflow-1998 @KennethEnevoldsen here's a screenshot of the LB, hacked to point to this branch. The eng and lite versions were able to render as well. The cache needed to be wiped.
There are a few tasks where the main metric was wrong when we implemented them and doesn't match the paper. Let me double-check all tasks and get back to you. I think it might be a good idea to replace the main-metric scores with the actual main metrics before we merge.
Also, it seems like the performance vs. model size plot needs some model references. You can add these in:
Figured it out 👍 [update] Added a few models that ranked first from a few task types:
The performance per task type plot isn't showing though 🤔 It says it only contains one task type when there are 8.
Hmm, not sure why this is happening. @x-tabdeveloping do you have an idea?
I'll have a look at it tomorrow.
@isaac-chung My guess would be it's because of this list:
task_types = [
    "BitextMining",
    "Classification",
    "MultilabelClassification",
    "Clustering",
    "PairClassification",
    "Reranking",
    "Retrieval",
    "STS",
    "Summarization",
    # "InstructionRetrieval",
    # Not displayed, because the scores are negative,
    # doesn't work well with the radar chart.
    "Speed",
]
The reason I made this list was that instruction retrieval shows scores in the negatives, and that doesn't really work with the radar chart.
That's it. Thanks! It's working now. |
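For readers following along, here is a minimal sketch of how a hardcoded allowlist like the one quoted above could leave a plot with a single task type. It assumes the plot keeps only task types found in the list; the function name `plottable_task_types` and the placeholder names `ImageTypeA` and `ImageTypeB` are hypothetical and do not come from the mteb codebase.

```python
# Allowlist copied from the comment above (InstructionRetrieval stays excluded).
RADAR_TASK_TYPES = [
    "BitextMining",
    "Classification",
    "MultilabelClassification",
    "Clustering",
    "PairClassification",
    "Reranking",
    "Retrieval",
    "STS",
    "Summarization",
    "Speed",
]


def plottable_task_types(present, allowed=RADAR_TASK_TYPES):
    """Keep only the task types that appear in the allowlist, preserving order."""
    allowed_set = set(allowed)
    return [t for t in present if t in allowed_set]


# Hypothetical input: "ImageTypeA"/"ImageTypeB" are stand-ins for task types
# that a new benchmark might introduce and that the allowlist does not know.
present = ["ImageTypeA", "ImageTypeB", "Clustering"]
print(plottable_task_types(present))  # ['Clustering']
```

Under this assumption, any task type missing from the allowlist is silently dropped, which would match the symptom of the plot reporting fewer task types than the benchmark contains.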
I have fixed the main metric issue by overwriting the main scores with the actual main metric scores, and deleted the previous incomplete Jina runs from an old version that only had a few task results. Overwritten scores include:
@gowitheflow-1998 good stuff! Are we ready to merge? |
yeah, merged! adding @Muennighoff as co-author for running most of the results here! |
Does this have everything from https://github.com/embeddings-benchmark/tmp, i.e., can we safely delete that repo?
yeah! all results are here |

Fixes embeddings-benchmark/mteb#1823
Add MIEB results. The following models have been renamed to add org name (based on local test failures):
QuanSun: https://huggingface.co/QuanSun/EVA-CLIP/tree/main
voyageai
Related MTEB issue: embeddings-benchmark/mteb#2074
Checklist
make test.
make pre-push.
Adding a model checklist
mteb/models/ directory. Instructions to add a model can be found here in the following PR ____