Zyphra (@ZyphraAI) / X

Zyphra

267 posts

Full stack open superintelligence

San Francisco, CA

Joined March 2021

Zyphra
@ZyphraAI
16h
Replying to @ZyphraAI
@ZyphraAI is an open superintelligence research and product company based in San Francisco, CA on a mission to build human-aligned AI that helps individuals and organizations reach their fullest potential. Apply to join us!
jobs.ashbyhq.com
Zyphra Jobs
391
Zyphra
@ZyphraAI
16h
Replying to @ZyphraAI
Zamba2-VL is released under Apache 2.0 at all three scales. Blog: zyphra.com/our-work/zamba… Technical report: arxiv.org/abs/2606.00390 Weights: huggingface.co/collections/Zy… Code: github.com/Zyphra/transfo…
Zyphra
From zyphra.com
514
Zyphra
@ZyphraAI
16h
Replying to @ZyphraAI
See the model in action with these examples below.
243
Zyphra
@ZyphraAI
16h
Replying to @ZyphraAI
Zamba2-VL is competitive with the leading open Transformer vision-language models of comparable scale, including Qwen3-VL, InternVL3.5, Molmo2, and PerceptionLM, across image understanding, reasoning, OCR, grounding, and counting benchmarks.
1K
Zyphra
@ZyphraAI
16h
Replying to @ZyphraAI
Hybrid SSM-Transformer models combine SSM layers for speed and efficiency with a few attention layers for precise recall. Zamba2-VL comes in 1.2B, 2.7B, and 7B sizes and is the first open family of vision-language models built natively on this hybrid architecture.
1K
Zyphra
@ZyphraAI
16h
Zyphra Research continues to explore architecture innovations beyond standard transformers. Today we’re releasing Zamba2-VL, extending our prior Zamba2 hybrid SSM-Transformer work into vision-language modeling. 🧵
24K
Zyphra
@ZyphraAI
Jun 2
MiMo-V2.5-Pro now available on Zyphra Cloud! Huge context, super fast, optimized and served on @AMD MI355X. Full context at $1/M input, $3/M output, $0.2/M cached. Try now at cloud.zyphra.com
56K
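A rough cost sketch from the per-token prices quoted above ($1/M input, $3/M output, $0.2/M cached). The function name and the assumption that cached tokens are billed separately from uncached input tokens are illustrative, not taken from Zyphra Cloud documentation.

```python
# Prices quoted in the post, in USD per million tokens.
PRICE_PER_M = {"input": 1.00, "output": 3.00, "cached": 0.20}


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Return an estimated cost in USD.

    Assumption (not confirmed by the post): cached tokens are counted
    separately from input_tokens and billed at the cached rate.
    """
    total = (
        input_tokens * PRICE_PER_M["input"]
        + output_tokens * PRICE_PER_M["output"]
        + cached_tokens * PRICE_PER_M["cached"]
    )
    return total / 1_000_000


# Example: 200k fresh input tokens, 800k cached tokens, 10k output tokens.
cost = estimate_cost(input_tokens=200_000, output_tokens=10_000, cached_tokens=800_000)
print(f"${cost:.2f}")
```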
Zyphra
@ZyphraAI
May 29
Replying to @ZyphraAI
@ZyphraAI is an open superintelligence research and product company based in San Francisco, CA on a mission to build human-aligned AI that helps individuals and organizations reach their fullest potential. Apply to join us!
jobs.ashbyhq.com
Zyphra Jobs
523
Zyphra
@ZyphraAI
May 29
Replying to @ZyphraAI
Stay tuned as we extend this to larger dense and MoE models, the backward pass for training, and validation within production serving environments. Zyphra will continue pushing performance across new hardware ecosystems. Technical details on the blog:
Zyphra
From zyphra.com
626
Zyphra
@ZyphraAI
May 29
Replying to @ZyphraAI
The optimization philosophy we use on other stacks (topology-aware parallelism, custom kernels, communication scheduling) applies to Trainium/Inferentia, demonstrating our heterogeneous silicon capabilities while showing one way Neuron can be improved for the wider ecosystem.
153
Zyphra
@ZyphraAI
May 29
Replying to @ZyphraAI
The more accelerators we split the model across, the more communication there is to overlap, so the gains grow with scale. Throughput increased while TTFT and TPOT decreased, with the largest gains at 24 NeuronCores.
144
Zyphra
@ZyphraAI
May 29
Replying to @ZyphraAI
We built on existing NKI kernels in AWS's Neuron stack and added a Domino-style schedule that overlaps compute with chip-to-chip communication. Each transformer block keeps the compute engines busy while data moves between accelerators instead of stopping to wait.
769
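A toy timing model of what overlapping compute with communication buys per transformer block. This is an idealized sketch, not Zyphra's schedule or the NKI API: it assumes communication can be hidden completely behind compute, and the millisecond figures are hypothetical.

```python
def block_time_ms(compute_ms: float, comm_ms: float, overlap: bool) -> float:
    """Time for one block under an idealized model.

    Without overlap the block computes, then waits for communication.
    With perfect overlap the slower of the two sets the block time.
    """
    if overlap:
        return max(compute_ms, comm_ms)
    return compute_ms + comm_ms


def speedup(compute_ms: float, comm_ms: float) -> float:
    """Ratio of non-overlapped to overlapped block time."""
    sequential = block_time_ms(compute_ms, comm_ms, overlap=False)
    overlapped = block_time_ms(compute_ms, comm_ms, overlap=True)
    return sequential / overlapped


# Hypothetical numbers: communication grows as the model is split wider.
for comm in (2.0, 5.0, 8.0):
    print(f"comm={comm} ms -> speedup {speedup(10.0, comm):.2f}x")
```

In this model the benefit grows with the share of time spent communicating, as long as communication stays shorter than compute; past that point communication itself becomes the bottleneck.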
Zyphra
@ZyphraAI
May 29
Replying to @ZyphraAI
Trainium/Inferentia runs communication on dedicated cores in parallel with its tensor, vector, and scalar engines. Combined with large HBM capacity and a fast scale-up fabric, it suits workloads limited more by data movement than by compute, such as decode, MoE, and long-context attention.
902