Integrate Birco V2 #2022
Conversation
|
@orionw Can you please check this? |
|
I've changed your base branch to v2, because your history contained a lot of PRs from there. |
Thanks, I was just about to do that. Feel free to also give it a look. It is quite different from the previous version because v2 already has many of the implementations. |
|
Are this pull request and #1947 adding the same tasks? If so, maybe the old pull request should be closed? |
Samoed
left a comment
Can you run the tasks to make sure that their scores match the authors' implementation?
|
Hi @Samoed I've refactored the datasets. I preprocessed them externally, and now they look like the retrieval/reranking datasets on HF. Hence, I've deleted all the transforming/loading. I also fixed formatting like citations and other points you mentioned. They should work now. For some reason, I cannot test them locally. The first time I ran the tests, they failed because they expected corpus, queries, and qrels as configs instead of default, so I did that. When I try to run tests now, it says it expects default as a config, so I am not sure. |
|
I think the issue is with the eval splits. In TaskMetadata you've annotated that the tasks use the "test" split, but on Hugging Face you've uploaded "train". |
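A minimal sketch of the mismatch described above: it compares the splits a task declares for evaluation with the splits actually uploaded. The helper name `missing_eval_splits` is hypothetical and is not part of mteb; in practice the uploaded split names could be fetched with `datasets.get_dataset_split_names`, which is left out here so the example runs offline.

```python
def missing_eval_splits(declared, uploaded):
    """Return the declared eval splits that are absent from the uploaded dataset.

    Hypothetical helper for illustration only; the order of `declared` is kept.
    """
    available = set(uploaded)
    return [split for split in declared if split not in available]


# Situation from the comment above: the metadata declares "test",
# but only "train" was uploaded.
print(missing_eval_splits(["test"], ["train"]))
```

An empty list means every declared split is available; anything else names the splits that still need to be uploaded or removed from the metadata.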
Samoed
left a comment
Can you please restore the .vscode folder and upload the dev and test splits for the datasets? It is currently not clear what "train" is, since it was not in the original dataset.
|
I added the vscode folder. There is only a test split for the dataset as far as I am aware because we are just evaluating. Can you please run it from your side and tell me if it works? |
|
Yes, I'll try to run it. Can you resolve the merge conflict? |
|
Ah, yes. You should also calculate |
|
Do I add them in the classes for each task? And under which function? |
|
You should call this function for each of the tasks. This function will produce json in |
|
I am getting this error when I try to calculate metrics: Isn't it expected to have corpus, queries, qrels as configs? |
|
I tried changing qrels to default because apparently some datasets have it this way, but it is giving me this: Can you please tell me what structure is the correct one? |
|
I can download |
| date=("2024-01-01", "2024-12-31"), | ||
| domains=["Academic"], # Valid combination | ||
| task_subtypes=["Scientific Reranking"], # MTEB-approved subtype | ||
| license="https://creativecommons.org/licenses/by/4.0/", # Full URL |
| license="https://creativecommons.org/licenses/by/4.0/", # Full URL | |
| license="cc-by-4.0", |
I still haven't committed that revision. I've only made the change on HF. This revision has default as a config instead of qrels. Is this how it's supposed to be? |
|
Yes, that dataset has the right split format. |
|
Fixed License and now all datasets have default, queries, and corpus as configs. |
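As a rough illustration of the layout settled on in this thread, the sketch below checks a list of config names against the three names mentioned above (default, queries, corpus). The function `check_configs` is hypothetical and not an mteb API; the config names of a real Hub dataset could be obtained with `datasets.get_dataset_config_names`, which is omitted here so the example needs no network access.

```python
# Config names taken from the comment above; treat them as an assumption
# about what the retrieval/reranking loader expects.
EXPECTED_CONFIGS = ("default", "queries", "corpus")


def check_configs(found):
    """Return (missing, unexpected) config names relative to EXPECTED_CONFIGS."""
    found_set = set(found)
    missing = [name for name in EXPECTED_CONFIGS if name not in found_set]
    unexpected = sorted(found_set - set(EXPECTED_CONFIGS))
    return missing, unexpected


# The earlier layout discussed in the thread used "qrels" instead of "default".
print(check_configs(["corpus", "queries", "qrels"]))
```

Both lists come back empty when the dataset exposes exactly the expected configs.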
|
Hi @Samoed Another dataset, "Namaa", seems to be causing the tests to fail. Can you please confirm? |
|
Yes, but that's a bit weird, because I can find it on HF
Closes #818