Description
SafeTensors models using non-LLaMA architectures fail the Check gate because the tensor name mapping only recognizes LLaMA-style weight names.
Affected Models (all 10/11 or worse)
- BERT/encoder: BAAI/bge-small-en-v1.5, sentence-transformers/all-MiniLM-L6-v2
- GPT-2: openai-community/gpt2, openai-community/gpt2-medium (missing config.json)
- GPT-Neo/GPT-NeoX: EleutherAI/gpt-neo-125m, EleutherAI/pythia-410m-deduped
- OPT: facebook/galactica-125m
- PhiForCausalLM: microsoft/phi-1.5
- StarCoder: bigcode/tiny-starcoder_py, bigcode/starcoder2-3b
Error Patterns
Tensor not found with names: 'model.embed_tokens.weight', 'token_embd.weight', or 'embed_tokens.weight'
config.json not found (required for SafeTensors inference)
config.json missing num_attention_heads
Root Cause
The SafeTensors loader in realizar only maps LLaMA-style tensor names. Each architecture family uses different naming:
- GPT-2: transformer.h.N.attn.c_attn.weight
- GPT-NeoX: gpt_neox.layers.N.attention.query_key_value.weight
- OPT: model.decoder.layers.N.self_attn.k_proj.weight
- BERT: encoder.layer.N.attention.self.query.weight
- StarCoder: model.layers.N.self_attn.o_proj.bias
- Phi: model.layers.N.self_attn.dense.weight
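The prefixes listed above are distinctive enough to guess a family from the tensor names alone. The sketch below illustrates that idea; `ArchFamily` and `detect_family` are hypothetical names, not part of realizar's API. It also assumes that some GPT-2 checkpoints store names without the leading `transformer.` prefix. Phi and StarCoder share the `model.layers.` prefix with LLaMA, so they land in one bucket here, and telling them apart would likely need config.json.

```rust
/// Families distinguishable by tensor-name prefix (hypothetical enum, not realizar's API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArchFamily {
    /// `model.layers.N...`: LLaMA, and also the Phi and StarCoder names listed in this issue.
    LlamaLike,
    Gpt2,
    GptNeoX,
    Opt,
    Bert,
    Unknown,
}

/// Guess the family from the first tensor name that carries a known layer prefix.
pub fn detect_family<'a, I>(tensor_names: I) -> ArchFamily
where
    I: IntoIterator<Item = &'a str>,
{
    for name in tensor_names {
        if name.contains("gpt_neox.layers.") {
            return ArchFamily::GptNeoX;
        }
        if name.contains("model.decoder.layers.") {
            return ArchFamily::Opt;
        }
        if name.contains("encoder.layer.") {
            return ArchFamily::Bert;
        }
        // Assumption: some GPT-2 checkpoints drop the `transformer.` prefix.
        if name.starts_with("transformer.h.") || name.starts_with("h.") {
            return ArchFamily::Gpt2;
        }
        if name.contains("model.layers.") {
            return ArchFamily::LlamaLike;
        }
        // Embedding and head tensors match nothing above; keep scanning.
    }
    ArchFamily::Unknown
}
```

Name-based detection is only a fallback heuristic. When config.json is present, its architecture fields are probably the more reliable signal.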
Expected Behavior
The SafeTensors loader should detect the model architecture and apply a weight-name mapping for each major model family, so that non-LLaMA models pass the Check gate.
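A weight-name mapping could be table-driven, as in the minimal sketch below. The `RENAMES` table and `to_llama_style` function are hypothetical and cover only three of the names quoted in this issue. The LLaMA-style target names are assumed to be what the loader already understands.

```rust
/// Hypothetical rename table: (source layer prefix, source suffix, LLaMA-style suffix).
const RENAMES: &[(&str, &str, &str)] = &[
    // OPT: only the layer prefix differs.
    ("model.decoder.layers.", "self_attn.k_proj.weight", "self_attn.k_proj.weight"),
    // BERT: query projection.
    ("encoder.layer.", "attention.self.query.weight", "self_attn.q_proj.weight"),
    // Phi: `dense` is the attention output projection.
    ("model.layers.", "self_attn.dense.weight", "self_attn.o_proj.weight"),
];

/// Rewrite a per-family tensor name to a LLaMA-style name, or return None if unmapped.
pub fn to_llama_style(name: &str) -> Option<String> {
    for (prefix, src_suffix, dst_suffix) in RENAMES {
        // Search anywhere in the name so wrapper prefixes such as `bert.` are tolerated.
        let Some(start) = name.find(*prefix) else { continue };
        let rest = &name[start + prefix.len()..];
        // `rest` should look like "N.<suffix>", where N is the layer index.
        let Some((layer, suffix)) = rest.split_once('.') else { continue };
        if layer.parse::<usize>().is_ok() && suffix == *src_suffix {
            return Some(format!("model.layers.{layer}.{dst_suffix}"));
        }
    }
    None
}
```

Renaming alone does not cover every family. GPT-2's `c_attn` and GPT-NeoX's `query_key_value` pack the query, key, and value projections into a single tensor, so those would need to be split as well as renamed.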