Add CFG to vLLM serving #517
Conversation
```python
# Sets default for the model (`facebook/opt-125m`)
engine = AsyncLLMEngine.from_engine_args(engine_args)
_adapt_tokenizer(engine.engine.tokenizer)
```
Why are you calling this function here? The result is not used.
The tokenizer is modified in place inside the function anyway. I've now assigned the return value back to the tokenizer, though.
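A tiny self-contained sketch of the point being made here; `_adapt` and `Tokenizer` are hypothetical stand-ins, not the PR's actual code. Because the helper mutates the object it receives, the caller's reference is updated either way; returning the object just lets the call site reassign explicitly for readability:

```python
class Tokenizer:
    """Minimal stand-in (hypothetical) for vLLM's tokenizer object."""
    pass

def _adapt(tokenizer):
    """Hypothetical stand-in for `_adapt_tokenizer`."""
    # In-place mutation: the caller's object is changed regardless of
    # whether the return value is used.
    tokenizer.adapted = True
    # Returning it additionally allows `tok = _adapt(tok)` at call sites.
    return tokenizer

tok = Tokenizer()
returned = _adapt(tok)
assert returned is tok and tok.adapted  # same object, mutated in place
```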
Ah, that makes sense. It's not needed, however, as vLLM handles tokenisation on its end during encoding/decoding.
It's needed because here, https://github.com/outlines-dev/outlines/blob/fde61a80a58de0401fdecdee7408db53e17ca4f4/outlines/fsm/fsm.py#L345, outlines expects the tokenizer to return a list, but vLLM tokenizers return a string.
Also, I just realized that this introduces a breaking change to the library. If that's fine with the project maintainers, it's fine; if not, we may need a different approach. For example, we could call `_adapt_tokenizer` inside the `__init__` functions of the logits processors.
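A minimal sketch of the adaptation being discussed. The function name mirrors the PR's `_adapt_tokenizer`, but the body and `DummyTokenizer` are assumptions for illustration only: the idea is to wrap `decode` so it returns a list of strings (one per token id), which is what outlines expects, instead of the single concatenated string that vLLM's HF-style tokenizers produce.

```python
def _adapt_tokenizer(tokenizer):
    """Sketch (assumed behaviour): make `decode` return a list of
    strings, one per token id, instead of one joined string."""
    original_decode = tokenizer.decode

    def decode(token_ids):
        # Decode each id on its own so callers receive a list back.
        return [original_decode([tid]) for tid in token_ids]

    tokenizer.decode = decode
    return tokenizer  # returned so call sites can also reassign explicitly

# Hypothetical stand-in for vLLM's tokenizer, whose `decode` returns
# one concatenated string for a list of ids.
class DummyTokenizer:
    vocab = ["a", "b", "c"]

    def decode(self, token_ids):
        return "".join(self.vocab[i] for i in token_ids)

tok = _adapt_tokenizer(DummyTokenizer())
assert tok.decode([0, 2]) == ["a", "c"]  # a list, rather than "ac"
```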
Force-pushed from 012a0e9 to 281c0af
Thank you for your contributions! I added some documentation before merging.
Hi,
This pull request adds support for CFG in vLLM serving.