CUDA (Compute Unified Device Architecture) provides a powerful parallel programming model AI engineers can use to tap the massive processing power of NVIDIA GPUs. This guide shows you how to work within the CUDA ecosystem, from your first kernel to implementing advanced LLM features like Flash Attention.