We’re taking the biggest bet in AI: a chip that can only run transformers, but does so orders of magnitude faster than GPUs. Maybe attention *is* all you need…
Meet Sohu, the fastest AI chip of all time.
With a throughput of over 500,000 tokens per second running Llama 70B, Sohu lets you build products that are impossible on GPUs. One 8xSohu server replaces 160 H100s.
Sohu is the first specialized chip (ASIC) for transformer models. By specializing,















