Hi, is there an interface to specify logits processors as in vLLM? If possible, could you specify how we can customize the sampling behavior during generation?