Fix head_size in NeMo to HF checkpoint converters for width pruned model support#11230
Conversation
Force-pushed from 318009a to c6975a1
You need to sign off the commit:
FYI @kevalmorabia97, I put the manually modified config file to
Force-pushed from 1e9c039 to fac2270
Thanks @kevalmorabia97
Signed-off-by: Joosung <joosungy@nvidia.com>
Force-pushed from fac2270 to 5695859
@cuichenx, can you please comment on the tokenizer issue posted above? Does it require some other parameters to be updated in the tokenizer loading logic?
Can you try setting

@cuichenx
@cuichenx, it seems like there is the same issue with the tokenizer even with a different output path. Can you suggest how to fix it?
I'm not sure. @tdene added support for the tokenizer. Do you have any idea?
Discussed offline. My comments on how to fix the tokenizer-related issue are as follows:
- Change the script explanation to
- Change L248 through 257 to
- Change L263 through 266 to
Signed-off-by: Joosung <joosungy@nvidia.com>
…del support (NVIDIA-NeMo#11230)

* update attn head_size to kv_channels for width pruning support

Signed-off-by: Joosung <joosungy@nvidia.com>

* Update llama ckpt converter usage about tokenizer args

Signed-off-by: Joosung <joosungy@nvidia.com>

---------

Signed-off-by: Joosung <joosungy@nvidia.com>
Co-authored-by: Joosung <joosungy@nvidia.com>
…del support (NVIDIA-NeMo#11230)

* update attn head_size to kv_channels for width pruning support

Signed-off-by: Joosung <joosungy@nvidia.com>

* Update llama ckpt converter usage about tokenizer args

Signed-off-by: Joosung <joosungy@nvidia.com>

---------

Signed-off-by: Joosung <joosungy@nvidia.com>
Co-authored-by: Joosung <joosungy@nvidia.com>
Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>
What does this PR do?
Changelog
`head_size = model.cfg.get("kv_channels") or (hidden_size // head_num)`

Usage
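The changelog expression can be sketched as a small helper. This is a minimal illustration, not the actual converter code; the config dicts and the `get_head_size` name are assumptions, but the fallback logic mirrors the one-liner above: prefer an explicit `kv_channels` (which a width-pruned model keeps) over the usual `hidden_size // head_num` derivation.

```python
def get_head_size(cfg: dict) -> int:
    """Return the attention head size, preferring an explicit kv_channels.

    Hypothetical sketch of the converter's logic: width pruning can shrink
    hidden_size without changing the per-head dimension, so deriving
    head_size as hidden_size // head_num would be wrong for pruned models.
    """
    hidden_size = cfg["hidden_size"]
    head_num = cfg["num_attention_heads"]
    # Fall back to the standard derivation only when kv_channels is unset.
    return cfg.get("kv_channels") or (hidden_size // head_num)

# Unpruned model: kv_channels absent, head size derived from hidden size.
print(get_head_size({"hidden_size": 4096, "num_attention_heads": 32}))  # 128

# Width-pruned model: hidden_size shrank to 3072, but kv_channels stays 128.
# Without the fix, the converter would compute 3072 // 32 = 96 instead.
print(get_head_size({"hidden_size": 3072, "num_attention_heads": 32,
                     "kv_channels": 128}))  # 128
```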
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
Who can review?
@kevalmorabia97
Additional Information
The pruned values that correspond to entries in HuggingFace's config.json should be updated. I added `head_dim` (HF) to HF's config.json with the same value as `kv_channels` (NeMo).
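The manual config.json edit described above can be sketched as follows. This is a hypothetical helper, not part of the PR; the function name and the assumption that the exported HF config lives in a plain `config.json` are illustrative, but the key names (`head_dim` on the HF side, `kv_channels` on the NeMo side) come from the discussion above.

```python
import json

def set_hf_head_dim(hf_config_path: str, kv_channels: int) -> None:
    """Mirror NeMo's kv_channels into head_dim in an exported HF config.json.

    Hypothetical sketch: for a width-pruned checkpoint, head_dim cannot be
    recomputed from hidden_size, so it must be written explicitly.
    """
    with open(hf_config_path) as f:
        config = json.load(f)
    # head_dim on the HF side should equal kv_channels on the NeMo side.
    config["head_dim"] = kv_channels
    with open(hf_config_path, "w") as f:
        json.dump(config, f, indent=2)
```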