Name and Version
build b8304; I have also tried script versions from commits 7229ade and 9ff11f5:
-rw-r--r-- 1 username username 592218 Mar 13 14:00 convert_hf_to_gguf__b8304.py
-rw-r--r-- 1 username username 578627 Mar 13 14:04 convert_hf_to_gguf__c7229ade.py
-rw-r--r-- 1 username username 595668 Mar 13 13:59 convert_hf_to_gguf__c9ff11f5.py
All three versions produce the same error message.
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
Python/Bash scripts
Command line
$ python convert_hf_to_gguf.py --outfile NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.gguf --outtype auto --verbose --dry-run /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4/
Problem description & steps to reproduce
Hello, I'm getting the error `Quant method is not yet supported: 'modelopt'` when trying to convert NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 (https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4/) to GGUF.
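For context, the failure can be reproduced without the full model: the converter reads `quant_method` from the checkpoint's `quantization_config` in config.json and raises when it doesn't recognize the value. The sketch below mimics that dispatch; the function name `check_quant_method` and the `SUPPORTED_QUANT_METHODS` set are hypothetical stand-ins, not the actual names or the actual supported set in convert_hf_to_gguf.py.

```python
# Minimal sketch of the converter's quant-method dispatch (hypothetical
# names; the real check lives in convert_hf_to_gguf.py's dequant_model()).
SUPPORTED_QUANT_METHODS = {"gptq", "bitnet"}  # illustrative, not the real set

def check_quant_method(config: dict) -> str:
    """Return the checkpoint's quant_method, or raise if unsupported."""
    quant_config = config.get("quantization_config", {})
    quant_method = quant_config.get("quant_method")
    if quant_method not in SUPPORTED_QUANT_METHODS:
        raise NotImplementedError(f"Quant method is not yet supported: {quant_method!r}")
    return quant_method

# A config.json fragment like the NVFP4 checkpoint's triggers the error:
config = {"quantization_config": {"quant_method": "modelopt"}}
try:
    check_quant_method(config)
except NotImplementedError as e:
    print(e)  # Quant method is not yet supported: 'modelopt'
```

In other words, the conversion fails before any tensors are dequantized; supporting this checkpoint would require the converter to learn how to decode ModelOpt's NVFP4 weight layout, not just accept the `quant_method` string.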
First Bad Commit
No response
Relevant log output
Logs
$ python convert_hf_to_gguf__c7229ade.py --outfile /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.gguf --outtype f16 --verbose --dry-run /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4/
INFO:hf-to-gguf:Loading model: NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4
WARNING:hf-to-gguf:Failed to load model config from /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4: The repository /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 contains custom code which must be executed to correctly load the model. You can inspect the repository content at /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 .
You can inspect the repository content at https://hf.co//mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Model architecture: NemotronHForCausalLM
WARNING:hf-to-gguf:Failed to load model config from /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4: The repository /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 contains custom code which must be executed to correctly load the model. You can inspect the repository content at /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 .
You can inspect the repository content at https://hf.co//mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:gguf: loading model weight map from 'model.safetensors.index.json'
INFO:hf-to-gguf:gguf: indexing model part 'model-00001-of-00017.safetensors'
...
INFO:hf-to-gguf:gguf: indexing model part 'model-00017-of-00017.safetensors'
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
WARNING:hf-to-gguf:Failed to load model config from /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4: The repository /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 contains custom code which must be executed to correctly load the model. You can inspect the repository content at /mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4 .
You can inspect the repository content at https://hf.co//mnt/LLM/nemotron/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4.
Please pass the argument `trust_remote_code=True` to allow custom code to be run.
WARNING:hf-to-gguf:Trying to load config.json instead
INFO:hf-to-gguf:Exporting model...
Traceback (most recent call last):
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 12163, in <module>
main()
~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 12157, in main
model_instance.write()
~~~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 715, in write
self.prepare_tensors()
~~~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 9884, in prepare_tensors
super().prepare_tensors()
~~~~~~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 2745, in prepare_tensors
super().prepare_tensors()
~~~~~~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 555, in prepare_tensors
self.dequant_model()
~~~~~~~~~~~~~~~~~~^^
File "/home/username/venv/convert_hf_to_gguf__c7229ade.py", line 474, in dequant_model
raise NotImplementedError(f"Quant method is not yet supported: {quant_method!r}")
NotImplementedError: Quant method is not yet supported: 'modelopt'