Maeb merge main v2#3447

Merged
Samoed merged 217 commits into maeb from maeb_merge_main_v2 on Oct 22, 2025

Conversation

@Samoed

@Samoed Samoed commented Oct 20, 2025

Member
  1. I've made all tasks and files follow PEP 8
  2. Made torchaudio an optional dependency
  3. Removed duplicates from v1
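
For readers unfamiliar with the optional-dependency pattern mentioned in item 2, here is a minimal sketch of one common way to do it. It is not taken from this PR: the helper names `is_available` and `require_optional` and the extra name `audio` are hypothetical, and the actual change in the repository may look different.

```python
import importlib
import importlib.util
from types import ModuleType


def is_available(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional module, or raise an error that names the extra to install.

    The 'mteb[<extra>]' install hint is an assumption made for illustration.
    """
    if not is_available(module_name):
        raise ImportError(
            f"'{module_name}' is needed for this feature but is not installed. "
            f"Try: pip install 'mteb[{extra}]'"
        )
    return importlib.import_module(module_name)
```

Code that needs audio support would call something like `require_optional("torchaudio", "audio")` inside the function that uses it, so that importing the package itself does not fail when torchaudio is absent.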

makram93 and others added 30 commits July 11, 2025 22:06
* feat: unify text and image embeddings for all tasks

* fix: uniform batch size

* fix: update error message

* fix: update code task

* fix: update max length

* fix: apply review suggestions
* feat: add KaLM_Embedding_X_0605 in kalm_models

* Update kalm_models.py for lint format

* kalm-emb-v2

* kalm-emb-v2

* kalm-emb-v2

* kalm-emb-v2

* kalm-emb-v2

---------

Co-authored-by: xinshuohu <xinshuohu@tencent.com>
Co-authored-by: Xinshuo Hu <yanshek.woo@gmail.com>
* Adding Classification Evaluator test

* Modifications due to the comments

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Modifications due to the comments

* Modifications due to the comments

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
* adding vidore benchmarks

* fix typo

* clean vidore names + per lang eval

* lint

* vidore names

* bibtex fix

* fix revision

* vidore v2 citation

* update citation format and fix per-language mappings

* lint: citations

* typo citations

* fix revisiions

* lint

* fix colnomic3b revision

* fix colqwen2.5 revision + latest repo version

* fix query agmentation tokens

* colsmol revision
Automatically generated by python-semantic-release
* Adding Classification Evaluator test

* Modifications due to the comments

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Modifications due to the comments

* Modifications due to the comments

* Adding STSEvaluator and SummarizationEvaluator tests

* Correcting due to the comments

* Correcting due to the comments

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
* Classification dataset cleaning

* Update pull request number

* Fix metadata test

* fix formatting

* add script for cleaning
Add JapaneseSentimentClassification
* change document to passage

* fix prompt names

* fix kwargs check

* fix default prompt
Automatically generated by python-semantic-release
add opensearch inf-free models

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* Add BareExamQA retrieval task

* ran linter

* updated details

* updated details

* fixed subtype name

* fixed changes

* ran linter again
specify revision for opensearch
Automatically generated by python-semantic-release
… been checked (#2940)

* fix: Only import SparseEncoder once sentence-transformer version have been checked

fixes #2936

* Update mteb/models/opensearch_neural_sparse_models.py

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

---------

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
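
The fix above defers the `SparseEncoder` import until the installed sentence-transformers version has been checked. A rough sketch of that idea follows. The helper names and the `5.0.0` minimum are assumptions for illustration, not values taken from the commit.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(v: str) -> tuple[int, ...]:
    """Parse the leading numeric parts of a version string, e.g. '5.1.0.dev0' -> (5, 1, 0)."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two versions after padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def load_sparse_encoder(min_version: str = "5.0.0"):
    """Return the SparseEncoder class only if the installed version is new enough.

    The default minimum version is an assumption, not taken from the PR.
    """
    try:
        installed = version("sentence-transformers")
    except PackageNotFoundError as err:
        raise ImportError("sentence-transformers is not installed") from err
    if not meets_minimum(installed, min_version):
        raise ImportError(
            f"sentence-transformers>={min_version} is required, found {installed}"
        )
    # Deferred import: only reached once the version check has passed.
    from sentence_transformers import SparseEncoder

    return SparseEncoder
```

Keeping the import inside the function means that a module-level `import` cannot fail on older installations before the version check has had a chance to produce a readable error.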
…2939)

The leaderboard would have (silent) errors where `get_benchmark` led to a KeyError because "selector_state" was passed as a default value. Setting `DEFAULT_BENCMARK_NAME` as the value solves this issue.
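
The failure mode described above can be reproduced in a toy form. This is only an illustration under assumed names: the registry contents, `DEFAULT_NAME`, and `initial_tasks` are hypothetical and do not mirror the leaderboard code.

```python
# Hypothetical registry standing in for the real benchmark lookup.
BENCHMARKS = {
    "Benchmark A": ["TaskOne", "TaskTwo"],
    "Benchmark B": ["TaskThree"],
}
DEFAULT_NAME = "Benchmark A"  # hypothetical stand-in for the default benchmark name


def get_benchmark(name: str) -> list[str]:
    """Look up a benchmark by name; raises KeyError for unknown names."""
    return BENCHMARKS[name]


def initial_tasks(default_value: str) -> list[str]:
    """Resolve the tasks shown on first load from the selector's default value."""
    return get_benchmark(default_value)


# A placeholder string that is not a benchmark name fails the lookup.
try:
    initial_tasks("selector_state")
except KeyError as err:
    print(f"lookup failed for {err}")

# A real benchmark name as the default resolves cleanly.
print(initial_tasks(DEFAULT_NAME))
```

If the caller swallows the exception, the error stays silent, which matches the symptom in the commit message.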
* docs: Update adding_a_dataset.md

* Update docs/adding_a_dataset.md
Automatically generated by python-semantic-release
* BSARD loader fixed

* BSARDv2 metadata fixed

* Update mteb/tasks/Retrieval/fra/BSARDRetrieval.py

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
* Added govreport task

* Updated description
* Added BillSum datasets

* fixed billsumca

* Updated BillSumCA description

* Updated BillSumUS description

* Update mteb/tasks/Retrieval/eng/BillSumCA.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update mteb/tasks/Retrieval/eng/BillSumUS.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* lint

* lint

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
…2716)

* Add RuSciBench

* fix bitext mining lang

* Add regression task

* fix init

* add missing files

* Improve description

* Add superseded_by

* fix lint

* Update regression task to match with v2

* Add stratified_subsampling for regression task

* Add boostrap for regression task

* Rename task class, add model as evaluator argument

* fix import

* fix import 2

* fixes

* fix

* Rename regression model protocol
@KennethEnevoldsen

Contributor

Do you want to do all of this in one go? I would probably just transfer one task type over at a time

@Samoed

Samoed commented Oct 20, 2025

Member Author

I just want to do a basic merge first so that the tests are runnable. After that, I will update each task type separately.

@Samoed Samoed changed the base branch from maeb_v2 to maeb October 20, 2025 19:14
@Samoed Samoed marked this pull request as ready for review October 21, 2025 11:05

@KennethEnevoldsen KennethEnevoldsen left a comment

Contributor

This is impossible to review. Gotta trust you here

@Samoed

Samoed commented Oct 21, 2025

Member Author

You can view the last 11 commits

@KennethEnevoldsen

Contributor

Ahh yeah, that was a good idea. Yeah, the changes look reasonable!

@Samoed Samoed merged commit 9f1c7a6 into maeb Oct 22, 2025
10 checks passed
@Samoed Samoed deleted the maeb_merge_main_v2 branch October 22, 2025 11:46