Bitext Mining low scores #51

Description

@Muennighoff

Running the following script gives me the BUCC scores shown after it:

from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Load the word-embedding baseline model
model = SentenceTransformer("average_word_embeddings_komninos")

# Run only the BUCC bitext mining task
evaluation = MTEB(tasks=["BUCC"])
evaluation.run(model)

Output:

{
  "dataset_version": null,
  "mteb_version": "0.0.2",
  "test": {
    "de-en": {
      "accuracy": 0.0017745302713987473,
      "f1": 0.0017745302713987473,
      "precision": 0.0017745302713987473,
      "recall": 0.0017745302713987473
    },
    "evaluation_time": 456.59,
    "fr-en": {
      "accuracy": 0.0,
      "f1": 0.0,
      "precision": 0.0,
      "recall": 0.0
    },
    "ru-en": {
      "accuracy": 6.927606511950121e-05,
      "f1": 6.927606511950121e-05,
      "precision": 6.927606511950121e-05,
      "recall": 6.927606511950121e-05
    },
    "zh-en": {
      "accuracy": 0.0,
      "f1": 0.0,
      "precision": 0.0,
      "recall": 0.0
    }
  }
}

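These scores seem too low. Accuracy, F1, precision, and recall are all at or near zero for every language pair, so I think there's a bug.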
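
For reference, here is a minimal sketch of how a "too low" bitext mining score can be sanity-checked against chance level. It uses only NumPy and synthetic embeddings. The function `bitext_accuracy`, the toy data, and the off-by-one label shift are all hypothetical illustrations, not MTEB's implementation and not a diagnosis of this issue.

```python
import numpy as np

def bitext_accuracy(src_emb, tgt_emb, gold):
    """Fraction of source rows whose cosine-nearest target row is the gold match.

    Hypothetical helper for illustration; not part of mteb.
    """
    # L2-normalise so that the dot product equals cosine similarity
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = src @ tgt.T              # (n_src, n_tgt) similarity matrix
    pred = sims.argmax(axis=1)      # best target index per source sentence
    return float((pred == np.asarray(gold)).mean())

rng = np.random.default_rng(0)
n, dim = 1000, 64
gold = np.arange(n)                 # source i is paired with target i

tgt = rng.normal(size=(n, dim))
src_aligned = tgt + 0.01 * rng.normal(size=(n, dim))  # near-perfect "translations"
src_random = rng.normal(size=(n, dim))                # unrelated embeddings

aligned_acc = bitext_accuracy(src_aligned, tgt, gold)
chance_acc = bitext_accuracy(src_random, tgt, gold)

# Hypothetical failure mode: gold labels shifted by one position.
# Even perfectly aligned embeddings then score at or near zero.
shifted_acc = bitext_accuracy(src_aligned, tgt, (gold + 1) % n)

print(f"aligned: {aligned_acc:.4f}  random: {chance_acc:.4f}  shifted gold: {shifted_acc:.4f}")
```

In this toy setup, unrelated embeddings land around 1/n, while a mismatch between predictions and gold indices drives even good embeddings to roughly zero. Scores at or below chance, like the ones reported above, can therefore come from either the model or the evaluation pairing, which is why checking both seems worthwhile.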

cc @NouamaneTazi @loicmagne
