Checklist
Motivation
Add NVFP4 support for GPUs without native FP4 hardware through a Marlin fallback, as already exists in vLLM. This would allow NVFP4 quantization, which is more accurate than AWQ, to be used on 3090-generation GPUs. Currently SGLang only reports that NVFP4 is not supported and then crashes. I am forced to use vLLM to run MiniMax M2.5 NVFP4, although I prefer SGLang.
Related resources
No response