Fix type notation of merges in BPE Python binding#1685
Fix type notation of merges in BPE Python binding#1685Coqueue wants to merge 1 commit intohuggingface:mainfrom
Conversation
|
Why do you think the annotations are incorrect ? The 2 things are unlinked and the signature here is correct. |
|
Hi Nicolas, thanks for taking a look.
I wonder if this is true. if vocab is not None and merges is not None:
tokenizer = Tokenizer(BPE(vocab, merges, dropout=dropout, unk_token=unk_token, fuse_unk=fuse_unk))where the
class BPE(Model):
"""
An implementation of the BPE (Byte-Pair Encoding) algorithm
Args:
vocab (:obj:`Dict[str, int]`, `optional`):
A dictionary of string keys and their ids :obj:`{"am": 0,...}`
merges (:obj:`List[Tuple[str, str]]`, `optional`):
A list of pairs of tokens (:obj:`Tuple[str, str]`) :obj:`[("a", "b"),...]`At least this comment gives a different type annotation for
I encountered an issue initializing The type annotation of As you closed the PR as a repo collaborator, I'll provide the above context -- I'll be more than happy to learn more if I was wrong. |
|
@Narsil Hi Nicolas, appreciate it if you could take a look at the above reply. |
Heya, this change is to correct type notations of
mergesin Python binding of BPE as the counterpart in Rust implementation expects a file name string orVec<(String, String)>instead of a mapping from a tuple of integers to another.