Returning Evaluation results by AmrMKayid · Pull Request #26 · embeddings-benchmark/mteb

AmrMKayid · 2022-07-20T12:54:00Z

Return the evaluation results dictionary if return_results is set to True.

Co-authored-by: holidaydrien <adrien.morisot@gmail.com>

NouamaneTazi · 2022-08-01T22:38:59Z

Apologies for the delay. Thank you for the contribution! I think many users would find this handy, would you like to add a small example of how to use the flag in the README?
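A rough sketch of how an opt-in flag like this could be used. Everything here is an assumption for illustration: `ToyEvaluation`, `score_task`, and `length_model` are hypothetical stand-ins rather than the mteb API, and the keyword is spelled `return_results` on the assumption that this is what the PR description means.

```python
from typing import Callable, Dict, List, Optional


def score_task(task_name: str, model: Callable[[str], float]) -> Dict[str, float]:
    """Hypothetical scorer: produces a single made-up metric for a task."""
    return {"main_score": model(task_name)}


class ToyEvaluation:
    """Hypothetical stand-in for a benchmark runner (not the mteb class)."""

    def __init__(self, tasks: List[str]) -> None:
        self.tasks = tasks

    def run(
        self, model: Callable[[str], float], return_results: bool = False
    ) -> Optional[Dict[str, Dict[str, float]]]:
        results = {name: score_task(name, model) for name in self.tasks}
        # Opt-in behaviour: hand the dictionary back only when asked to.
        return results if return_results else None


def length_model(text: str) -> float:
    """Hypothetical 'model' that scores a task by the length of its name."""
    return float(len(text))


evaluation = ToyEvaluation(tasks=["TaskA", "TaskBB"])
silent = evaluation.run(length_model)                        # flag off: nothing returned
results = evaluation.run(length_model, return_results=True)  # flag on: dict returned
```

With the flag off the caller gets `None` and has to read scores from wherever the runner stores them; with it on, the scores are available directly in memory.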

Muennighoff · 2022-08-03T17:49:16Z

It doesn't hurt to always return them, so I would remove the kwarg and always return them. What do you think, @NouamaneTazi?

NouamaneTazi · 2022-08-03T22:33:37Z

True, I don't see a problem with that either, as long as we update the related docs.

Muennighoff · 2022-08-04T07:36:09Z

@AmrMKayid If you could make it the default & update the docstring of run that would be amazing!

AmrMKayid · 2022-08-05T21:05:58Z

@NouamaneTazi @Muennighoff Thank you very much for the feedback! I have made it the default and updated the docs :))
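The agreed design, where the results are returned by default and the docstring of run documents it, might look roughly like the sketch below. `ToyEvaluation` is again a hypothetical stand-in, not the mteb implementation, and the optional JSON dump to `output_folder` is an illustrative assumption.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Optional


class ToyEvaluation:
    """Hypothetical stand-in for a benchmark runner (not the mteb class)."""

    def __init__(self, tasks: List[str]) -> None:
        self.tasks = tasks

    def run(
        self, model: Callable[[str], float], output_folder: Optional[str] = None
    ) -> Dict[str, Dict[str, float]]:
        """Evaluate every task and return the scores.

        Returns:
            A dictionary mapping each task name to its scores. It is always
            returned, so no extra keyword argument is needed.
        """
        results: Dict[str, Dict[str, float]] = {}
        for name in self.tasks:
            scores = {"main_score": model(name)}  # made-up metric for illustration
            if output_folder is not None:
                out_dir = Path(output_folder)
                out_dir.mkdir(parents=True, exist_ok=True)
                (out_dir / f"{name}.json").write_text(json.dumps(scores))
            results[name] = scores
        return results


with tempfile.TemporaryDirectory() as folder:
    evaluation = ToyEvaluation(tasks=["TaskA", "TaskBB"])
    results = evaluation.run(lambda text: float(len(text)), output_folder=folder)
    saved = json.loads((Path(folder) / "TaskA.json").read_text())
```

Callers who do not need the return value can ignore it, which is why dropping the keyword argument costs nothing.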

NouamaneTazi · 2022-08-05T23:24:55Z

Amazing! Thanks for the clean PR :)

* add Masakhane dataset config * add trigram lang code for dataset who use it * create french script eval * fix French word * add some documentation * add script to process and upload alloprof on HF * build script for HF * adding dataset processing for mteb * add script to process and upload alloprof on HF * build script for HF * adding dataset processing for mteb * refactor few thing * remove whitespaces * 4 pair classification (#10) * add Opusparcus dataset * multilingual usage * use eval_split of config files * change eval_split according to data --------- Co-authored-by: Gabriel Sequeira <gsequeira@openstudio.fr> * add script to process and upload alloprof on HF * build script for HF * adding dataset processing for mteb * refactor few thing * remove whitespaces * Clustering with HAL S2S dataset (#11) HAL S2S dataset creation and evaluation on clustering task. * adding BSARD dataset * add BSARD to benchmark * adding Hagrid dataset * DiaBLa and Flores Bitext Mining evaluation (#12) * Add DiaBLa dataset for bitext mining * Add DiaBLa dataset for bitext mining * deduplicate bitext task * add Flores * format files * add flores to evaluation script * remove prints * add revision --------- Co-authored-by: Gabriel Sequeira <gsequeira@openstudio.fr> * add script to process and upload alloprof on HF * build script for HF * adding dataset processing for mteb * refactor few thing * remove whitespaces * adding dataset processing for mteb * adding BSARD dataset * add BSARD to benchmark * adding Hagrid dataset * fix change on langmapping * reset alphabetical order * add revision handling * Clustering: Add AlloProf dataset (#17) AlloProf dataset for clustering task * handling of revision * change split + add revision handling * add script to process and upload alloprof on HF * build script for HF * adding dataset processing for mteb * refactor few thing * remove whitespaces * adding dataset processing for mteb * adding BSARD dataset * add BSARD to benchmark * adding Hagrid 
dataset * add script to process and upload alloprof on HF * adding dataset processing for mteb * refactor few thing * reset alphabetical order * add revision handling * handling of revision * change split + add revision handling * use eval variable * alphabetic order * Add MLSUM dataset for clustering task (#21) * Use Masakhane dataset for clustering task (#23) * 16 add datasets to readmemd (#18) * run task table * run task table * Add MLSUM dataset for clustering task (#21) * Use Masakhane dataset for clustering task (#23) * run task table * refresh readme * refresh readme * run task table * refresh readme --------- Co-authored-by: Gabriel Sequeira <gsequeira@openstudio.fr> Co-authored-by: Marion Schaeffer <92590517+schmarion@users.noreply.github.com> * load only test split (#25) Co-authored-by: Gabriel Sequeira <gsequeira@openstudio.fr> * Update mteb/tasks/BitextMining/DiaBLaBitextMining.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update mteb/tasks/Clustering/HALClusteringS2S.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * renaming masakhane (#28) Co-authored-by: Gabriel Sequeira <gsequeira@openstudio.fr> * Syntec dataset addition (#26) * add scrpit to process & load to HF * add script to enable download of data from HF * add syntec dataset files to gitignore * add syntecretrieval * add syntec retrival * build dataloading script * remove datasets * correct typo --------- Co-authored-by: Sequeira Gabriel <gabriel.sequeira@outlook.fr> * 30 add syntec reranking (#31) * change name to secify retrieval * add reranking tasks * create script to upload dataset fo reranking task * create reranking task * add reranking tasks * add model name in description * SummEval translated to french (#32) * 7 sts (#33) * taike into account multilingual tasks * add stsbenchmark multilingual dataset * add STS tasks * taike into account multilingual tasks * add stsbenchmark multilingual dataset * add STS tasks * add coma * Adding sick fr dataset to 
sts tasks (#34) * Adding sick fr dataset to sts tasks * modifying dataset in load function to have the right column names * Fix alloprof dataset (#36) * change revision to use * remove duplicate data * change main metric because dataset is hard (#37) * Fix alloprof dataset (#40) * change revision to use * remove duplicate data * change revision * handle queries train test split * change dataset creation method * change revision * handle queries train test split * change dataset creation method * Fix DiaBLa by inheriting CrossLingual class (#42) * Fix DiaBLa by inheriting CrossLingual class * remove remaining print * Fix DiaBLa integration * Update mteb/tasks/BitextMining/FloresBitextMining.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update README.md Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update README.md Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update mteb/tasks/Classification/MasakhaNEWSClassification.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update README.md Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update README.md * Update mteb/tasks/BitextMining/FloresBitextMining.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update mteb/evaluation/MTEB.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update mteb/abstasks/AbsTaskPairClassification.py Co-authored-by: Imene Kerboua <33312980+imenelydiaker@users.noreply.github.com> * Update README.md * Update scripts/data/syntec/create_data_reranking.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update scripts/data/alloprof/create_data_reranking.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update scripts/run_mteb_french.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update scripts/run_mteb_french.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update mteb/evaluation/MTEB.py Co-authored-by: Niklas 
Muennighoff <n.muennighoff@gmail.com> * Update mteb/evaluation/MTEB.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update mteb/tasks/Retrieval/HagridRetrieval.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update mteb/tasks/Clustering/MLSUMClusteringP2P.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update mteb/tasks/Clustering/MLSUMClusteringS2S.py Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> * Update mteb/tasks/Clustering/MasakhaNEWSClusteringP2P.py * Update mteb/tasks/Clustering/MasakhaNEWSClusteringS2S.py * Update mteb/tasks/STS/SickFrSTS.py * Inherit OpusparcusPC init from MultilingualTask * remove unnecessary init * Remove train split from evaluation on MasakhaNEWSClassification (#52) remove train split from evaluation * put script on HF dataset repos (#56) * put script on HF dataset repos * remove scripts * 49 fix dictionnary in syntecretrieval (#54) * add trust remote code arg * leave corpus as dict * remove trust remote code * add Tatoeba & BUCC BitextMining tasks (#57) add bucc and tatoeba bitextmining tasks * 46 add other languages to masakhaneweclusterings2s and p2p (#58) * add other language to clustering tasks * fix main score and S2S task * update run fr becnhmark script * Update run_mteb_french.py * Update AbsTaskClustering.py * remove train and validation splits --------- Co-authored-by: Gabriel Sequeira <gsequeira@openstudio.fr> Co-authored-by: Marion Schaeffer <92590517+schmarion@users.noreply.github.com> Co-authored-by: mciancone@openstudio.fr <mciancone@openstudio.fr> Co-authored-by: Imene Kerboua <33312980+imenelydiaker@users.noreply.github.com> Co-authored-by: mciancone <73994289+Sunalwing@users.noreply.github.com> Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: wissam-sib <36303760+wissam-sib@users.noreply.github.com> Co-authored-by: Wissam Siblini <wissam.siblini92@gmail.com>


Returning Evaluation results

3d60490

amorisot reviewed Jul 20, 2022

Comment thread mteb/evaluation/MTEB.py Outdated

amorisot reviewed Jul 20, 2022

Comment thread mteb/evaluation/MTEB.py Outdated

AmrMKayid and others added 3 commits July 20, 2022 14:02

Update mteb/evaluation/MTEB.py

c4acb76

Co-authored-by: holidaydrien <adrien.morisot@gmail.com>

Update mteb/evaluation/MTEB.py

a4d952b

Co-authored-by: holidaydrien <adrien.morisot@gmail.com>

Merge branch 'main' into return-results

314e5d7

Update docs

dd4a1f2

NouamaneTazi approved these changes Aug 5, 2022

NouamaneTazi merged commit 8f3242c into embeddings-benchmark:main Aug 5, 2022

Returning Evaluation results #26: NouamaneTazi merged 5 commits into embeddings-benchmark:main from AmrMKayid:return-results

4 participants
