Name and Version
build b8304; I have also tried script versions from commits 7229ade and 9ff11f5:
-rw-r--r-- 1 username username 592218 Mar 13 14:00 convert_hf_to_gguf__b8304.py
-rw-r--r-- 1 username username 578627 Mar 13 14:04 convert_hf_to_gguf__c7229ade.py
-rw-r--r-- 1 username username 595668 Mar 13 13:59 convert_hf_to_gguf__c9ff11f5.py
All three versions produce the same error message.
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
Python/Bash scripts
Command line
$ python convert_hf_to_gguf.py --outfile NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.gguf --outtype auto --verbose --dry-run /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4/
Problem description & steps to reproduce
Hello, I'm getting the error `Quant method is not yet supported: 'modelopt'` when trying to convert NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 (https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4/) to GGUF.
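For context, the failure can be reproduced without the full model: the converter reads `quant_method` from the checkpoint's `quantization_config` in config.json and raises when it doesn't recognize the value. The sketch below mimics that dispatch; the function name `check_quant_method` and the `SUPPORTED_QUANT_METHODS` set are hypothetical stand-ins, not the actual names or the actual supported set in convert_hf_to_gguf.py.

```python
# Minimal sketch of the converter's quant-method dispatch (hypothetical
# names; the real check lives in convert_hf_to_gguf.py's dequant_model()).
SUPPORTED_QUANT_METHODS = {"gptq", "bitnet"}  # illustrative, not the real set

def check_quant_method(config: dict) -> str:
    """Return the checkpoint's quant_method, or raise if unsupported."""
    quant_config = config.get("quantization_config", {})
    quant_method = quant_config.get("quant_method")
    if quant_method not in SUPPORTED_QUANT_METHODS:
        raise NotImplementedError(f"Quant method is not yet supported: {quant_method!r}")
    return quant_method

# A config.json fragment like the NVFP4 checkpoint's triggers the error:
config = {"quantization_config": {"quant_method": "modelopt"}}
try:
    check_quant_method(config)
except NotImplementedError as e:
    print(e)  # Quant method is not yet supported: 'modelopt'
```

In other words, the conversion fails before any tensors are dequantized; supporting this checkpoint would require the converter to learn how to decode ModelOpt's NVFP4 weight layout, not just accept the `quant_method` string.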
First Bad Commit
No response
Relevant log output
Logs
$ python convert_hf_to_gguf__c7229ade.py --outfile /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.gguf --outtype f16 --verbose --dry-run /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4/
INFO:hf-to-gguf:Loading model: NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4
WARNING:hf-to-gguf:Failed to load model config from /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4: The repository /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 contains custom code which must be executed to correctly load the model. You can inspect the repository content at /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 .
You can inspect the repository content at https://hf.co//mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: NemotronHForCausalLM
WARNING:hf-to-gguf:Failed to load model config from /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4: The repository /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 contains custom code which must be executed to correctly load the model. You can inspect the repository content at /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 .
You can inspect the repository content at https://hf.co//mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: indexing model part 'model-00001-of-00017.safetensors'
...
INFO:hf-to-gguf:gguf: indexing model part 'model-00017-of-00017.safetensors'
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
WARNING:hf-to-gguf:Failed to load model config from /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4: The repository /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 contains custom code which must be executed to correctly load the model. You can inspect the repository content at /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 .
You can inspect the repository content at https://hf.co//mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Exporting model...
Traceback (most recent call last):
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 12163, in <module>
main()
~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 12157, in main
model_instance.write()
~~~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 715, in write
self.prepare_tensors()
~~~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 9884, in prepare_tensors
super().prepare_tensors()
~~~~~~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 2745, in prepare_tensors
super().prepare_tensors()
~~~~~~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 555, in prepare_tensors
self.dequant_model()
~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 474, in dequant_model
raise NotImplementedError(f"Quant method is not yet supported: {quant_method!r}")
NotImplementedError: Quant method is not yet supported: 'modelopt'