[Feature] NVFP4 support for non-FP4 GPUs via Marlin fallback #19491

@ciprianveg

Description

Motivation

Add NVFP4 support for GPUs without native FP4 hardware through a Marlin fallback, as already exists in vLLM. This would allow NVFP4 quantization, which is more accurate than AWQ, to be used on 3090-generation GPUs. Currently SGLang only reports that NVFP4 is not supported and then crashes. I am forced to use vLLM to run MiniMax M2.5 NVFP4, although I prefer SGLang.

Related resources

No response
