[v2] Two stage rerank#3040
Conversation
There was a problem hiding this comment.
Hmm thanks for making the PR @Samoed.
I really don't like the save_retrieval_results=True argument as it clearly a task specific argument.
I do however like that it is possible to create a reranking from a retrieval task. How about
import mteb
model = mteb.get_model("minishlab/potion-base-2M")
retrieval_task = mteb.get_task("NanoArguAnaRetrieval")
model = SaveRetrievaPredictionsWrapper(model)
res = mteb.evaluate(model, retrieval_task)
reranking_task = retrieval_task.convert_to_reranking(model.predictions, top_k=100)
model = mteb.get_model("jinaai/jina-reranker-v2-base-multilingual")
mteb.evaluate(model, reranking_task)Alternatively, we could add a more general save_predictions flag, that can be used for all tasks and just add a NotImplementedError() for all tasks but retrieval. I could see that as being an option as well. Besides the many not-implemented errors this would cause, it would also lead to a requirement to handle new arguments, such as where to save predictions, etc.
The Wrapper isolate the concern and leave the evalautor only focused on evaluation
| assert result.returncode == 0, "Command failed" | ||
|
|
||
|
|
||
| def test_save_predictions(): |
There was a problem hiding this comment.
isn't this still used by MTEB()?
There was a problem hiding this comment.
Yes, this still used by MTEB, but I don't want to support it, because otherwise it would required to support a lot of extra parameters in AbsRetrieval which we don't want I think
There was a problem hiding this comment.
So this will break MTEB()?
There was a problem hiding this comment.
I've chedked a bit futher. It will break only cli for some part, becuase we don't support this flag for now, but we need to refactor CLI a bit to use mteb.evaluate instead of MTEB. We can add support of this flag to save backward compatibility. But this flag is not supported by MTEB directly, it's just passed thought kwargs and not documented
Lines 254 to 266 in cf01bc8
There was a problem hiding this comment.
Let is keep it like this, but remove it from the CLI args/docs
There was a problem hiding this comment.
If we implement an additional method for save results, I think we can leave it as is
I defenetly agree, but I don't see another solution to the problem. And this is because I didn't want to add it implement firstly.
I don't think that this would solve the problems, because models can save only embedings, but not predictions for the retrieval task. I don't know how resuse results from the wrapper then.
I can make implementation for it, because this would solve problems for the tasks |
Why can't they save predicitons? doesn't the search interface return RetrievalOutputType?
Let us decide on a good solutions before spending the implementation time |
Oh, I forgot that we have search interface. Yes, wrapper should work then |
orionw
left a comment
There was a problem hiding this comment.
Sorry for the late review. In general this looks fine to me.
As a meta-comment though, I will say I'm nervous we're adding too many wrappers. Do they stack? Can we save predictions and cache embeddings and ...?
Maybe a personal preference but I think arguments to functions are a much nicer UI from a users perspective than doing a bunch of wrappers. However, we have so many it's probably too late to change all of these at this point. Just worth noting for future extensions.
I thought about this, too. This will be like
In new Lines 162 to 171 in cf01bc8 |
| def __init__( | ||
| self, | ||
| model: SearchProtocol, | ||
| results_path: str | Path, |
There was a problem hiding this comment.
So based on @orionw and @Samoed comment, I might be better to add the following arguments:
save_retrieval_predictions: bool
prediction_save_path: Path
And then in the future we can convert raise a depreciationwarning on save_retrieval_predictions and instead refer to save_predictions.
let me know what you think?
(sorry @Samoed i understand that this is a digression from before)
There was a problem hiding this comment.
We might still want to keep the wrapper internally (without exposing it to the user)
There was a problem hiding this comment.
And then in the future we can convert raise a depreciationwarning on save_retrieval_predictions and instead refer to save_predictions.
Where do you want to add it? To mteb.evaluate?
There was a problem hiding this comment.
Yea that would be the idea
There was a problem hiding this comment.
Should we add method save_predictions to all tasks then? And we can use only prediction_save_path: Path then
There was a problem hiding this comment.
Hmm but that required a lot of implementation across the different tasks?
There was a problem hiding this comment.
Yes, but I don't like that we're passing task-specific args. Also, for some users this can be helpful
There was a problem hiding this comment.
Sure, it was simply that we don't have it implemented yet so I am afraid it would give a poor user experience
There was a problem hiding this comment.
I will implement it for retrieval in this pr and will add to other tasks later
There was a problem hiding this comment.
Alright let us do it this way (sorry for backtracking)
KennethEnevoldsen
left a comment
There was a problem hiding this comment.
can you redo the example in the top. I think this format looks pretty good though. Few updates on naming things
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
|
Updated example in description |
|
Looks great @Samoed - I think this is good to merge - we need to update some docs away from using MTEB() for two-stage retrieval, but I can do that in a separate PR |
I've created 2 stage reranking. Here is example how to run it:
I've made saving information about run results to result file for better reproduction.