Add a Sequence object to the decoders #872

@SaulLu

I wonder if it would be useful to have a sequence object for the decoders too.

For example, it seems to me that if we build a tokenizer with a BPE model that defines an end_of_word_suffix, we need the BPEDecoder decoder to replace the end_of_word_suffix. If we also use a ByteLevel pre-tokenization, we additionally need the ByteLevel decoder to realign the codes.

At the moment, it seems to me that we have no way to choose a suitable decoder for such a tokenizer, since only a single decoder can be set.
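To make the idea concrete, here is a minimal pure-Python sketch of what chaining decoders could look like. It does not use the tokenizers API: `drop_suffix`, `byte_level`, and `sequence` are hypothetical names invented for illustration, each step is assumed to map a list of token strings to a new list, and the suffix step simply drops the suffix (the library's BPEDecoder may behave differently). The byte table assumes the GPT-2 style byte-to-character mapping.

```python
from typing import Callable, Dict, List

# Hypothetical step type: a list of token strings in, a list of token strings out.
DecodeStep = Callable[[List[str]], List[str]]


def drop_suffix(suffix: str) -> DecodeStep:
    """Illustrative stand-in for a BPE decoder: strip the end-of-word suffix."""
    def step(tokens: List[str]) -> List[str]:
        return [
            t[: -len(suffix)] if suffix and t.endswith(suffix) else t
            for t in tokens
        ]
    return step


def _byte_level_table() -> Dict[str, int]:
    """Build the character -> byte table (GPT-2 style mapping, assumed here)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {chr(b): b for b in printable}
    n = 0
    for b in range(256):
        if b not in printable:
            # Remaining bytes are shifted to code points starting at 256.
            table[chr(256 + n)] = b
            n += 1
    return table


def byte_level() -> DecodeStep:
    """Illustrative stand-in for a byte-level decoder: map characters back to bytes."""
    table = _byte_level_table()

    def step(tokens: List[str]) -> List[str]:
        raw = bytes(table[ch] for ch in "".join(tokens))
        return [raw.decode("utf-8", errors="replace")]
    return step


def sequence(steps: List[DecodeStep]) -> Callable[[List[str]], str]:
    """Apply each step in order, then join whatever tokens remain."""
    def decode(tokens: List[str]) -> str:
        for step in steps:
            tokens = step(tokens)
        return "".join(tokens)
    return decode


decode = sequence([drop_suffix("</w>"), byte_level()])
text = decode(["Hello</w>", "Ġwor", "ld</w>"])
```

The order of the steps matters in this sketch: the suffix has to be removed while token boundaries still exist, before the byte-level step merges everything into one string.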

What do you think? 😄
