fix: Allow model to output torch.tensor by KennethEnevoldsen · Pull Request #2234 · embeddings-benchmark/mteb

KennethEnevoldsen · 2025-03-04T09:57:06Z

Addresses aspects of #941 by allowing model.encode to return a torch.Tensor.

Also includes a number of fixes to array typing.

Further problems

unsqueeze is not in the Array API spec. We should use expand_dims instead, but torch does not implement it (Support expand_dims pytorch/pytorch#56774). We can use reshape instead.
What to do with torch.functional? We would have to reimplement these functions to make it work.
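The reshape workaround mentioned in the first point can be sketched as below. This is a minimal illustration and not code from this PR: the helper name `expand_dims_via_reshape` is hypothetical, and it assumes only that the input exposes `.shape` and `.reshape(...)`, which holds for NumPy arrays and torch tensors. The example itself runs on NumPy only.

```python
import numpy as np


def expand_dims_via_reshape(x, axis: int):
    """Insert a length-1 axis at `axis` using only `.shape` and `.reshape`.

    Hypothetical helper: it relies on nothing beyond `x.shape` and
    `x.reshape(new_shape)`, so it avoids both `unsqueeze` and `expand_dims`.
    """
    shape = tuple(x.shape)
    ndim = len(shape)
    if not -(ndim + 1) <= axis <= ndim:
        raise ValueError(
            f"axis {axis} is out of bounds for an array of dimension {ndim}"
        )
    if axis < 0:
        # Negative axes count from the end of the *output* shape.
        axis += ndim + 1
    return x.reshape(shape[:axis] + (1,) + shape[axis:])


embeddings = np.arange(6, dtype=np.float32).reshape(2, 3)
batched = expand_dims_via_reshape(embeddings, 0)   # shape (1, 2, 3)
column = expand_dims_via_reshape(embeddings, -1)   # shape (2, 3, 1)
```

On NumPy inputs the result has the same shape as `np.expand_dims(x, axis)`, which makes it easy to check the helper against a known reference.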

We have a few cases of torch.compile, which are called every time the enclosing function is called. They seem to have been added by @orionw. Do we expect these to speed things up, given that they are compiled every time the function is called?

Code Quality

Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

[ NA ] New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

@orionw

KennethEnevoldsen · 2025-03-04T12:06:32Z

@orionw would love your opinion here on the compiled functions. This would, e.g., allow us to replace

torch.nn.functional.normalize -> sklearn.preprocessing.normalize (~eq.)

which allows for both torch tensors and NumPy arrays.
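For reference, the operation being swapped is row-wise L2 normalization. The sketch below is a NumPy-only illustration of that operation, not the implementation of either library and not code from this PR. The function name `l2_normalize_rows` is hypothetical, and leaving all-zero rows unchanged is an assumption made here to avoid division by zero.

```python
import numpy as np


def l2_normalize_rows(x) -> np.ndarray:
    """Scale each row of a 2D array to unit L2 norm.

    Hypothetical helper. All-zero rows are returned unchanged
    (an assumption of this sketch) rather than producing NaNs.
    """
    arr = np.asarray(x, dtype=float)
    norms = np.linalg.norm(arr, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # avoid dividing by zero for empty rows
    return arr / norms


vectors = np.array([[3.0, 4.0], [0.0, 0.0]])
unit_vectors = l2_normalize_rows(vectors)
```

Comparing a helper like this against both library functions on the same inputs is one way to check how close the "~eq." actually is for a given task.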

orionw · 2025-03-04T13:02:55Z

I am not sure why they are compiled every time; they should be compiled once. I added it because on tasks with a lot to normalize it was about a 10x speedup.

We can remove it for simplicity, but it is much, much faster.
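One way to make sure the compile call itself runs only once is to cache the compiled callable instead of calling the compiler inside the hot function. The sketch below is a generic illustration and not code from this PR: `make_compile_once` is a hypothetical helper, and a counting stand-in replaces the real compiler so the example runs without torch. In real use one would presumably pass `torch.compile` as the `compiler` argument. Whether calling `torch.compile` repeatedly actually triggers recompilation depends on torch's internal caching, which this sketch does not address.

```python
from functools import lru_cache
from typing import Any, Callable

Fn = Callable[..., Any]


def make_compile_once(fn: Fn, compiler: Callable[[Fn], Fn]) -> Callable[[], Fn]:
    """Return a getter that calls `compiler(fn)` on first use and caches it.

    Hypothetical helper; `compiler` could be `torch.compile` (assumption).
    """

    @lru_cache(maxsize=None)
    def get_compiled() -> Fn:
        return compiler(fn)

    return get_compiled


compile_calls = []


def counting_compiler(fn: Fn) -> Fn:
    # Stand-in for a real compiler: records each invocation, returns fn as-is.
    compile_calls.append(fn.__name__)
    return fn


def scale(values):
    total = sum(values)
    return [v / total for v in values]


get_scale = make_compile_once(scale, counting_compiler)


def evaluate(values):
    # The compiler is not called here; only the cached callable is fetched.
    return get_scale()(values)


results = [evaluate([1.0, 3.0]) for _ in range(3)]
```

After three calls to `evaluate`, `compile_calls` contains a single entry, showing that the compile step ran once while the function was used repeatedly.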

KennethEnevoldsen · 2025-03-04T16:12:47Z

Hmm, that is good to know. I will keep it, at least for now.

Samoed reviewed Mar 4, 2025

Comment thread mteb/evaluation/evaluators/ClassificationEvaluator.py

Samoed reviewed Mar 4, 2025

Comment thread mteb/encoder_interface.py

KennethEnevoldsen added 2 commits March 4, 2025 12:21

remove kNN-Pytorch as it is supported by scikit-learn>=1.4.0

1b89819

fixed missing imports

166384c

cleanup

fe6f4d9

KennethEnevoldsen merged commit 0fb363b into v2.0.0 Mar 4, 2025

KennethEnevoldsen deleted the array-spec-v2 branch March 4, 2025 16:45

KennethEnevoldsen mentioned this pull request May 4, 2025

Standardising matrix arrays (adopt Array API spec) #941

Open

KennethEnevoldsen merged 4 commits into v2.0.0 from array-spec-v2

3 participants
