Conversation
LGTM
btw, I have one small question: why switch max_num_tokens from 1 to 0?
Did you mean switch from 0 to 1? This is because our default logic for generation models assumes the model will generate 1 token after prefill. Although embedding models do not generate tokens, I added one dummy token to satisfy the shared part of the code.
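As a rough sketch of the dummy-token workaround described above (all names here are hypothetical and simplified, not the project's actual code):

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt_ids: list[int]
    max_num_tokens: int = 1      # shared code assumes >= 1 generated token
    output_ids: list[int] = field(default_factory=list)

def run_prefill(req: Request, is_embedding: bool) -> None:
    # ... model forward pass over prompt_ids would happen here ...
    if is_embedding:
        # Embedding models generate nothing, but appending one dummy token
        # lets the shared finish-check below work unchanged.
        req.output_ids.append(0)
    else:
        req.output_ids.append(42)  # stand-in for the first sampled token

def is_finished(req: Request) -> bool:
    # Shared logic: a request finishes once max_num_tokens are produced.
    return len(req.output_ids) >= req.max_num_tokens

req = Request(prompt_ids=[1, 2, 3])
run_prefill(req, is_embedding=True)
assert is_finished(req)  # the dummy token lets shared logic mark it done
```

With max_num_tokens=0 the finish-check would pass before prefill even ran, which is why keeping it at 1 with a dummy token is simpler than special-casing embedding requests everywhere.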
I see, thanks for answering
This is a follow-up PR for
For the embedding API, input as a list will be covered in the next PR.