Skip to content

fix: Allow model to output torch.tensor#2234

Merged
KennethEnevoldsen merged 4 commits into
v2.0.0from
array-spec-v2
Mar 4, 2025
Merged

fix: Allow model to output torch.tensor#2234
KennethEnevoldsen merged 4 commits into
v2.0.0from
array-spec-v2

Conversation

@KennethEnevoldsen

@KennethEnevoldsen KennethEnevoldsen commented Mar 4, 2025

Copy link
Copy Markdown
Contributor

Adressed aspects of #941 by allowing model.encode to return a torch.tensor.

Also does a lot of fixes to array typing.

further problems

  1. Unsqueeze not in array API spec, we should use expand_dims, but torch does not implement this (Support expand_dims pytorch/pytorch#56774). We can use reshape instead.
  2. What to do with torch.functional? We would have to reimplent these to make it work.

We have a few cases of torch.compile. Which are called every time the function is called. They seem to have been added by @orionw, do we expect that these speed this up (they are compiled every time the function is called)

Code Quality

  • Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

  • Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

  • [ NA ] New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
  • Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

Adressed aspects of #941

# further problems
1) Unsqueeze not in array API spec, we should use expand_dims, but torch does not implement this (pytorch/pytorch#56774). We can use reshape instead.
2) What to do with torch.functional? We would have to reimplent these to make it work.

We have a few cases of torch.compile. Which are called every time the function is called. They seem to have been added by @orionw, do we expect that these speed this up (they are compiled every time the function is called)
Comment thread mteb/evaluation/evaluators/ClassificationEvaluator.py
Comment thread mteb/encoder_interface.py
@KennethEnevoldsen

Copy link
Copy Markdown
Contributor Author

@orionw would love your opinion here on the compiled functions. This would, e.g., allow us to e.g. replace

torch.nn.functional.normalize -> sklearn.preprocessing.normalize (~eq.)

which allow for both torch tensor and numpy arrays

@orionw

orionw commented Mar 4, 2025

Copy link
Copy Markdown
Contributor

I am not sure why they are compiled every time, they should be compiled once. I added it because on tasks with lots to normalize it was about a 10x speedup.

We can remove it for simplicity, but is is much much faster.

@KennethEnevoldsen

Copy link
Copy Markdown
Contributor Author

Hmm, that is good to know. I will keep it, at least for now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants