Adding support for multiple mask tokens (#14716)
- Original implementation: huggingface#10222 Co-authored-by: njafer <naveen.jafer@oracle.com>
We add type information to the tasks, so that for tasks where we know for sure, we can specify whether the tokenizer and/or feature_extractor is needed.
```python
NO_FEATURE_EXTRACTOR_TASKS = set()
NO_TOKENIZER_TASKS = set()
for task, values in SUPPORTED_TASKS.items():
    if values["type"] == "text":
        NO_FEATURE_EXTRACTOR_TASKS.add(task)
    elif values["type"] in {"audio", "image"}:
        NO_TOKENIZER_TASKS.add(task)
    elif values["type"] != "multimodal":
        raise ValueError(f"SUPPORTED_TASK {task} contains invalid type {values['type']}")
```
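To see how this partition behaves, here is a self-contained sketch using a toy stand-in for the task registry (the task entries below are illustrative, not the real `SUPPORTED_TASKS` dict from `transformers`):

```python
# Toy stand-in for the pipeline task registry; the real one maps many more
# tasks to models and preprocessor classes.
SUPPORTED_TASKS = {
    "fill-mask": {"type": "text"},
    "image-classification": {"type": "image"},
    "automatic-speech-recognition": {"type": "audio"},
    "visual-question-answering": {"type": "multimodal"},
}

NO_FEATURE_EXTRACTOR_TASKS = set()
NO_TOKENIZER_TASKS = set()
for task, values in SUPPORTED_TASKS.items():
    if values["type"] == "text":
        # Pure text tasks never need a feature extractor.
        NO_FEATURE_EXTRACTOR_TASKS.add(task)
    elif values["type"] in {"audio", "image"}:
        # Pure audio/image tasks never need a tokenizer.
        NO_TOKENIZER_TASKS.add(task)
    elif values["type"] != "multimodal":
        raise ValueError(f"SUPPORTED_TASK {task} contains invalid type {values['type']}")

print(sorted(NO_FEATURE_EXTRACTOR_TASKS))
print(sorted(NO_TOKENIZER_TASKS))
```

Multimodal tasks end up in neither set, so the pipeline keeps loading both preprocessors for them.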
I like this approach, it should make the pipelines more robust to models with different capabilities in terms of preprocessors.
Looks good to me as well!
```python
        # than them
        self.assertEqual(len(outputs), 3)

    def fill_mask_with_multiple_masks(self, model, tokenizer):
```
Can we perhaps add a test for Perceiver (similar to the image classification models)?
Or is this not required here?
I think this PR is pretty orthogonal to Perceiver.
We could add a slow test for sure, but it doesn't have to be Perceiver specific.
In fact, I'll add something on the random model (it just needs to be consistent; the actual values are less important).
LysandreJik left a comment:
OK, this looks good to me! Looking forward to the additional test; feel free to merge whenever.
What does this PR do?
When presented with multiple masks, it's impossible to retrieve the joint probabilities. Instead of trying to work around that (see discussions in the previous PR), this PR just outputs the raw `top_k` propositions at each mask locus, since it gets tricky to find a good proxy for "joint probabilities". Instead of trying to solve this impossible problem, we simply show exactly what the model outputs.
@naveenjafer is mentioned as co-author since much of this PR was pulled from there.
This PR was resurrected partly because Perceiver (a byte-level model) needs this type of masking to be useful.
Fixes # (issue)
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.