At p50, the first token now arrives 87% faster and full responses finish 32% faster
in Amp's deep and rush modes.
How? Mostly by using WebSockets to communicate with OpenAI, and partly because last month we rebuilt Amp to be much faster.
These gains matter most on long-horizon tasks, where we're seeing up to a 40% end-to-end speedup from user prompt submission to completion.
You can see the difference: