llama : add support for GPT2, Bloom and CodeShell tied word embeddings by CISC · Pull Request #12456 · ggml-org/llama.cpp

CISC · 2025-03-18T19:48:30Z

Also remove weight duplication from said models on conversion.

I've converted and tested the following models, confirming that they do not initially have output weights (except for CodeShell, see below) but rely on word embeddings and output weights being tied together at runtime:

openai-community/gpt2
bigscience/bloomz-560m
WisdomShell/CodeShell-7B-Chat
WisdomShell/Shell-7B-Chat
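
As a rough illustration of what "tied at runtime" means here, the following minimal sketch falls back to the word embeddings when a model file carries no separate output weights. It is not the loader code from this PR: the function name `resolve_output_weights` is hypothetical, and the tensor name `token_embd.weight` is an assumption (only `output.weight` appears on this page, in the linked issue title).

```python
# Minimal sketch of tied word embeddings, assuming tensors are kept in a
# plain dict keyed by name. Names other than "output.weight" are assumptions.

EMBED_KEY = "token_embd.weight"   # assumed name of the word embedding tensor
OUTPUT_KEY = "output.weight"      # name as it appears in the linked issue title


def resolve_output_weights(tensors: dict) -> object:
    """Return the output projection, reusing the word embeddings when tied."""
    output = tensors.get(OUTPUT_KEY)
    if output is None:
        # No dedicated output tensor in the file: share the embedding matrix
        # instead of expecting a duplicated copy.
        output = tensors[EMBED_KEY]
    return output
```

With this fallback in place, a converted file does not need to store a second copy of the same matrix, which is the duplication the conversion change removes.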

For some reason CodeShell has inverted ties; output weights are provided in the bin/safetensors, but not word embeddings, even though our conversion code seems to imply otherwise.

Added a workaround for transformer.wte.weight being in the CodeShell weight map even though it's not in the tensor file(s), causing a conversion error unless you edit the .index.json file.

It appears transformer.wte.weight is in the weight map even though the weights are not there; remove it if output weights are encountered first.
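
The workaround described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the converter code from this PR: the function name `drop_phantom_embedding` is hypothetical, `lm_head.weight` is an assumed name for the output weights, and the weight map is modelled as a plain name-to-file dict like the one found in a .index.json file.

```python
# Sketch: drop a weight-map entry for embeddings that are not actually
# stored, once the output weights show up before them.

EMBED_NAME = "transformer.wte.weight"  # name taken from the PR description
OUTPUT_NAME = "lm_head.weight"         # assumed name of the output weights


def drop_phantom_embedding(weight_map: dict[str, str],
                           loaded_names: list[str]) -> dict[str, str]:
    """Return a copy of weight_map without EMBED_NAME if OUTPUT_NAME comes first.

    loaded_names is the order in which tensors are actually read from the
    tensor file(s).
    """
    pruned = dict(weight_map)
    for name in loaded_names:
        if name == EMBED_NAME:
            # The embeddings really exist in the file; keep the entry.
            break
        if name == OUTPUT_NAME:
            # Output weights seen first: treat the embedding entry as stale.
            pruned.pop(EMBED_NAME, None)
            break
    return pruned
```

Returning a copy rather than mutating the input keeps the original map available for error reporting; that is a design choice of this sketch only.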

CISC added 2 commits March 18, 2025 19:54

Add support for GPT2, Bloom and CodeShell tied word embeddings

7761258

Deduplicate tied word embeddings weights

faea5ff

CISC requested a review from ngxson March 18, 2025 19:48

github-actions bot added the python (python script changes) label Mar 18, 2025

CISC added 3 commits March 18, 2025 23:44

Workaround for incorrect weight map

dc36338

It appears transformer.wte.weight is in the weight map even though the weights are not there; remove it if output weights are encountered first.

check++

0cc8cb5

fatfingers--

1b2a53a

ngxson approved these changes Mar 19, 2025

ngxson merged commit 108e53c into ggml-org:master Mar 19, 2025
50 checks passed

CISC deleted the tied-word-embeddings branch March 19, 2025 08:09

CISC mentioned this pull request Mar 25, 2025

GPT2: llama_model_load: error loading model: missing tensor 'output.weight' #12567

Closed

compilade mentioned this pull request Jul 22, 2025

convert : handle pre-quantized models #14810

Merged

2 tasks

