Pinned
Imagine a breakthrough in inference that’s model-agnostic and infra-agnostic across GPUs and ASICs
Same workloads
No loss in performance
Roughly half the cost
At 2x the speed
I guess it’s a fantasy but just imagine if it wasn’t!
How cool would that be huh



