This is more a blog entry than a LinkedIn post, but I don’t have a blog, so … Even when applying a robust CES hype filter, this Alpamayo announcement from #Nvidia is really interesting: https://lnkd.in/gsaKstny
It’s interesting because many knowledgeable folks would say that currently, Waymo is winning in the AV space because it chose the best path to solving the problem of autonomous driving, and executed on it better than others. But what if, after well over a decade of development, it wasn’t the best path?
A little background for those in my network who are not in the AV space …
At the risk of angering my engineering friends, I’d describe the last decade+ of AV development as a grand test between two schools of thought: what I’ll call the “modular” approach, characterized by specialized AI models, many sensors, and detailed maps (Waymo, Cruise), and the end-to-end (E2E) approach, characterized by a “large driving model” akin to a large language model, simple sensors, and less map dependence (Tesla). Some might quibble with these descriptions, but I am going to stick with them for simplicity.
Many would say the modular approach is “winning” because even though Waymo is relying on a hardware- and map-heavy solution, it already has driverless technology that is very safe and reliable and that people will pay for in ridehail form today. Which means Waymo is now getting paid to accumulate miles as it continues working on pulling cost out of its solution to enable production of AVs most people could afford.
An E2E solution, on the other hand, allows for an AV that most private car owners could afford today (Tesla). But the E2E solution currently lacks the safety and reliability that Waymo already has. Tesla can sell its cars to make money as it improves its E2E model, but in the areas where Waymo chooses to operate, Waymo’s stack outperforms Tesla’s (sorry Tesla fans, it’s really not a close call).
Eventually, these two approaches will merge. Modular approaches, having already achieved high levels of reliability and safety, will become cheaper to buy and less capital intensive to operate as they become less hardware- and map-dependent. E2E solutions will achieve higher levels of safety and reliability. The multi-billion dollar question is whether one approach will prove to be meaningfully better over time.
When bets play out over many years, as is happening in the AV space, it can be difficult to say who is “winning.” The inherent risk in the modular approach was always that at some point, an E2E driving model would hit a point of rapid improvement (like we have seen recently with language models) and leapfrog the early wins extracted from a modular solution, unlocking quicker scaling with lower unit cost and capital intensity.
Did this just happen with Alpamayo? Open-sourcing and exposed reasoning are certainly signs of confidence from a company that has not been prone to missteps of late.