Merge main 05 10 #3246
Merged
* model: Add BMRetriever * Update mteb/models/bmretriever_models.py Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * Update mteb/models/bmretriever_models.py Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * fix: remove trust_remote_code option * feat: implement BMREtrieverWrapper based on InstructSentenceTransformerWrapper * refactor: update training datasets for bmretriever --------- Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
* add codefuse models * add codefuse models * Update codefuse_models.py * lint codefuse.py
* Adding Cohere's output_dimension and embedding_type parameter Cohere's embed-v4 binary and int8 * Correcting due to comments
* feat: add swedish cpc patent classifications to mteb * fix: formatting and init imports * fix: update mteb task according to feedback * fix: perform citation and code formatting * fix: add train and test split for both datasets
* fix: delete kwargs for similarity score in ColPaliEngineWrapper for method behavior * chore: fix colpali_models similarity handle device
* fix(models): prevent EOS token truncation for BMRetriever * refactor(models): refactor tokenizer setup in `InstructSentenceTransformerWrapper` * fix(models): correct eos token handling in `BMRetrieverWrapper`
* update giga embeddings * update giga embeddings * 3b-september-2025 * fixed * lint * Update mteb/models/ru_sentence_models.py Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * change revision due to flash-attn dependency * change apply_instruction_to_passages --------- Co-authored-by: Kolodin Egor <eikolodin@sberbank.ru> Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> Co-authored-by: Неизвестный Пользователь722497 <dolegosmirnov@sberbank.ru>
* feat - Split create_tables into static Benchmark methods * feat - format * Update mteb/leaderboard/table.py Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me> * feat - remove search query;take benchmark result as input;addressing the circular import, * feat - format * Update mteb/benchmarks/benchmark.py Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me> * Update mteb/benchmarks/benchmark.py Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me> * feat - use to_dataframe;clean table.py;move creat_table * feat - fix circular import * feat - clean-up * feat - format --------- Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Adding another voyageai model
* Update qzhou_models.py * Update qzhou_models.py * reformat script code * Update configuration * According to our new decision, the model name has been changed to "QZhou-Embedding-Zh". * Fix variable naming issues.
* add youtu models * add a blank line * fix the optional dependencies and lint the code * remove unused dependencies and reformat * revise prompt_type * update youtu_models --------- Co-authored-by: springxchen <springxchen@tencent.com>
* add software issue localization datasets * add software issue localization datasets * update and add multilingual datasets * fix citation format issues * Update mteb/tasks/Reranking/eng/SWEbenchVerifiedReranking.py * fix linting issues --------- Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
* feat - adjust Rteb's Benchmark * feat - add blank * fix menu names * Update mteb/leaderboard/benchmark_selector.py Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * moving around tasks * fix: Update RTEB summary columns (#3226) * fix(models): ensure prompt_type is passed to format_instruction (#3216) * 1.38.58 Automatically generated by python-semantic-release * Adding Cohere's output_dimension and embedding_type parameter (#3204) * Adding Cohere's output_dimension and embedding_type parameter Cohere's embed-v4 binary and int8 * Correcting due to comments * dataset: add swedish cpc patent classifications to mteb (#3072) * feat: add swedish cpc patent classifications to mteb * fix: formatting and init imports * fix: update mteb task according to feedback * fix: perform citation and code formatting * fix: add train and test split for both datasets * fix: AttributeError in ColPaliEngineWrapper similarity method (#3177) * fix: delete kwargs for similarity score in ColPaliEngineWrapper for method behavior * chore: fix colpali_models similarity handle device * Update tasks & benchmarks tables * 1.38.59 Automatically generated by python-semantic-release * fix: prevent EOS token truncation (#3218) * fix(models): prevent EOS token truncation for BMRetriever * refactor(models): refactor tokenizer setup in `InstructSentenceTransformerWrapper` * fix(models): correct eos token handling in `BMRetrieverWrapper` * 1.38.60 Automatically generated by python-semantic-release * Update giga embeddings (#3210) * update giga embeddings * update giga embeddings * 3b-september-2025 * fixed * lint * Update mteb/models/ru_sentence_models.py Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> * change revision due to flash-attn dependency * change apply_instruction_to_passages --------- Co-authored-by: Kolodin Egor <eikolodin@sberbank.ru> Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> 
Co-authored-by: Неизвестный Пользователь722497 <dolegosmirnov@sberbank.ru> * fix: Refactor split create_tables into static Benchmark methods (#3126) * feat - Split create_tables into static Benchmark methods * feat - format * Update mteb/leaderboard/table.py Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me> * feat - remove search query;take benchmark result as input;addressing the circular import, * feat - format * Update mteb/benchmarks/benchmark.py Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me> * Update mteb/benchmarks/benchmark.py Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me> * feat - use to_dataframe;clean table.py;move creat_table * feat - fix circular import * feat - clean-up * feat - format --------- Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me> * 1.38.61 Automatically generated by python-semantic-release * Extending the RTEB benchmark (#3223) Adding another voyageai model * Update tasks & benchmarks tables * feat - filter_by_privacy * feat - add new fields for rteb part * feat - getattr * feat - adjust privacy filter logic * feat - enhance summary table column renaming and add 'is_public' field mapping * fix: remove unused 'is_public' attribute from TaskResult --------- Co-authored-by: Yongbin Choi <whybe.choi@gmail.com> Co-authored-by: semantic-release <semantic-release> Co-authored-by: fzoll <5575946+fzoll@users.noreply.github.com> Co-authored-by: Atheer <atheer2104@protonmail.com> Co-authored-by: Yong woo Song <ywsong.dev@kakao.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Egor <31567312+ekolodin@users.noreply.github.com> Co-authored-by: Kolodin Egor <eikolodin@sberbank.ru> Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> Co-authored-by: Неизвестный Пользователь722497 <dolegosmirnov@sberbank.ru> Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me> Co-authored-by: smile 
<smile@pinai.io> Co-authored-by: ethan <smiletoye@gmail.com> * removed show_rteb args * avoid defining function where we can just use the metadata * minor fixes * minor fixes * fix: Correct logic for filtering public tasks in ModelResult class (#3230) Co-authored-by: ethan <smiletoye@gmail.com> --------- Co-authored-by: q275343119 <275343119@qq.com> Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> Co-authored-by: 笑尿伊人 <44760272+q275343119@users.noreply.github.com> Co-authored-by: Yongbin Choi <whybe.choi@gmail.com> Co-authored-by: fzoll <5575946+fzoll@users.noreply.github.com> Co-authored-by: Atheer <atheer2104@protonmail.com> Co-authored-by: Yong woo Song <ywsong.dev@kakao.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Egor <31567312+ekolodin@users.noreply.github.com> Co-authored-by: Kolodin Egor <eikolodin@sberbank.ru> Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> Co-authored-by: Неизвестный Пользователь722497 <dolegosmirnov@sberbank.ru> Co-authored-by: smile <smile@pinai.io> Co-authored-by: ethan <smiletoye@gmail.com>
* fix: Add rteb submission references and improve descriptions. * Added evaluation request * added field for tasks
* Human Subsets Tasks * Fixed Multilingual Classification Subset * linting * fix citations format * make lint * fix tests * remove human folder * fix relative imports * add adapted_from for all human subsets * fix pydantic errors * add benchmark object * make benchmark discoverable * bibtex test * Apply suggestion Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com> * Apply suggestions from code review Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com> * rename & reupload * upd tests * upd tests again * add model * add benchmark to leaderboard * change branch of leaderboard * remove branch of load data * fix model meta path * make mteb importable * update repo * Update mteb/benchmarks/benchmarks/benchmarks.py * Update mteb/leaderboard/benchmark_selector.py * Update mteb/load_results/load_results.py Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> --------- Co-authored-by: Adnan El Assadi <aassadi22@ku.edu.tr> Co-authored-by: Isaac Chung <chungisaac1217@gmail.com> Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com> Co-authored-by: AdnanElAssadi56 <115242814+AdnanElAssadi56@users.noreply.github.com>
* Remove 'HUME(v1)' from leaderboard benchmark * lint
* update adding_a_benchmark.md documentation * fix numbers
* fix: Further specified macro-language code for Norwegian. "nor" is a macro-language code that covers Bokmål and Nynorsk (both Norwegian), so datasets tagged only with "nor" are missed when filtering by "nob" or "nno". Specifying the individual codes as well should let those filters find them. * further specified macro language "nor"
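The macro-language problem described in the commit above can be sketched roughly as follows. This is a minimal illustration, not mteb's actual code: `MACRO_LANGUAGES`, `expand_languages`, and `matches` are hypothetical names, and the real fix may tag the datasets differently.

```python
# Hypothetical illustration only; none of these names come from mteb.
# Maps a macro-language code to the individual languages it covers.
MACRO_LANGUAGES = {"nor": {"nob", "nno"}}


def expand_languages(codes):
    """Return the given codes plus the individual languages of any macro-language code."""
    expanded = set(codes)
    for code in codes:
        expanded |= MACRO_LANGUAGES.get(code, set())
    return expanded


def matches(task_languages, requested):
    """A task is selected only if the requested code is among its language tags."""
    return requested in task_languages


# A dataset tagged only with the macro-language code...
tagged_macro_only = {"nor"}
# ...versus the same dataset with the individual codes listed as well.
tagged_specific = expand_languages(tagged_macro_only)
```

With only `"nor"` in its tags, the dataset is not selected by a `"nob"` or `"nno"` filter; once the individual codes are listed, it is.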
# Conflicts:
#	docs/benchmarks.md
#	mteb/benchmarks/benchmark.py
#	mteb/benchmarks/benchmarks/__init__.py
#	mteb/benchmarks/benchmarks/benchmarks.py
#	mteb/evaluation/evaluators/RerankingEvaluator.py
#	mteb/leaderboard/benchmark_selector.py
#	mteb/leaderboard/table.py
#	mteb/load_results.py
#	mteb/models/abs_encoder.py
#	mteb/models/instruct_wrapper.py
#	mteb/models/model_implementations/cohere_models.py
#	mteb/models/model_implementations/cohere_v.py
#	mteb/models/model_implementations/ru_sentence_models.py
#	mteb/models/model_implementations/youtu_models.py
#	mteb/models/overview.py
#	mteb/results/benchmark_results.py
#	mteb/tasks/Classification/__init__.py
#	mteb/tasks/Clustering/__init__.py
#	mteb/tasks/MultiLabelClassification/__init__.py
#	mteb/tasks/Reranking/__init__.py
#	mteb/tasks/Retrieval/multilingual/MKQARetrieval.py
#	mteb/tasks/STS/__init__.py
#	scripts/make_leaderboard.py
* fix python39 transformers * fix
aggregate by subset for HUMEv1
Fix AbsTaskTextRegression
* feat - add Japanese * feat - use mteb.get_benchmark * fix - 3.9 test error * Revert "fix - 3.9 test error" This reverts commit 6bfee53. * fix - 3.9 test error
# Conflicts:
#	mteb/benchmarks/benchmarks/__init__.py
#	mteb/benchmarks/benchmarks/benchmarks.py
#	mteb/models/bm25.py
ec748ef to 6e2766d
If you add a model or a dataset, please add the corresponding checklist: