Zen4-Pro-Max

language

en

zh

ja

ko

fr

de

es

pt

it

ru

ar

nl

pl

sv

da

fi

nb

tr

cs

sk

ro

hu

el

uk

he

license

apache-2.0

Zen4-Pro-Max

Zen4-Pro-Max is an 80B Mixture-of-Experts (MoE) language model with approximately 3B parameters active per forward pass. It is the flagship model of the Zen4 family, designed for large-scale reasoning, complex multi-step tasks, and demanding professional workloads where maximum capability is required.

Model Specs

Property	Value
Parameters	80B total / ~3B active (MoE)
Architecture	Sparse MoE transformer (causal LM)
Context window	32,768 tokens
Format	SafeTensors (BF16), MLX 4-bit
License	Apache 2.0

Zen4 Family

This model is part of the Zen4 model family:

Model	Params	Architecture	Context	Use case
zen4-mini	4B	Dense	40,960	Edge, mobile, low-resource
zen4-pro	14B	Dense	32,768	Professional, complex reasoning
zen4-max	30B MoE	MoE (3B act.)	32,768	High-capability, efficient
zen4-coder	80B MoE	MoE	32,768	Advanced code, 100+ languages
zen4-pro-max	80B MoE	MoE (3B act.)	32,768	Large-scale reasoning

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zenlm/zen4-pro-max"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {
        "role": "user",
        "content": (
            "Derive the time complexity of Dijkstra's algorithm using a Fibonacci heap, "
            "then compare it with a binary heap implementation across sparse and dense graphs."
        ),
    }
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=2048)
generated = output_ids[0][len(inputs.input_ids[0]):]
print(tokenizer.decode(generated, skip_special_tokens=True))

MLX (Apple Silicon)

from mlx_lm import load, generate

model, tokenizer = load("zenlm/zen4-pro-max-mlx")

response = generate(
    model,
    tokenizer,
    prompt="Explain the formal semantics of linear type systems.",
    max_tokens=1024,
    verbose=True,
)

Hardware Requirements

Setup	Minimum VRAM	Recommended
BF16 full precision	160 GB	2× H100 80 GB (NVLink)
4-bit quantized	48 GB	80 GB
MLX 4-bit	48 GB RAM	Apple M2 Ultra / M3 Ultra
CPU inference	160 GB RAM	256 GB RAM

The MoE architecture activates only ~3B parameters per token, meaning compute cost scales with active parameters while memory scales with total parameters.

Formats Available

Format	Location
SafeTensors BF16	`zenlm/zen4-pro-max`
MLX 4-bit	`zenlm/zen4-pro-max-mlx`
GGUF	`zenlm/zen4-pro-max-gguf` (various quants)

Abliteration

Zen4-Pro-Max weights are derived from an abliterated base model. Abliteration removes refusal behavior by identifying and nullifying the "refusal direction" in the model's residual stream.

Method: Directional ablation on the residual stream across all layers Implementation: hanzoai/remove-refusals — Hanzo's production abliteration toolkit Technique: Refusal in LLMs is mediated by a single direction — Arditi et al. Effect: Removes refusal behaviors while preserving all other capabilities Identity layer: Zen identity added via system prompt — full LoRA fine-tuning planned

Abliteration is a feature, not a limitation. It enables unrestricted research, security testing, and applications where safety guardrails are managed at the application layer rather than baked into model weights.

Citation

@misc{zen4promax2025,
  title        = {Zen4-Pro-Max: 80B MoE Flagship Language Model for Large-Scale Reasoning},
  author       = {{Hanzo AI} and {Zoo Labs Foundation}},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/zenlm/zen4-pro-max}},
  note         = {Apache 2.0 License}
}

Part of the Zen model family by Hanzo AI (Techstars '17) and Zoo Labs Foundation.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.gitignore		.gitignore
LLM.md		LLM.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Zen4-Pro-Max

Model Specs

Zen4 Family

Quick Start

MLX (Apple Silicon)

Hardware Requirements

Formats Available

Abliteration

Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Zen4-Pro-Max

Model Specs

Zen4 Family

Quick Start

MLX (Apple Silicon)

Hardware Requirements

Formats Available

Abliteration

Citation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages