WHAT IS AN NPU?
New chips put AI in your hands. Darien Graham-Smith looks at the latest in neural processing
We’re all familiar with the CPU and GPU. They sit at the heart of almost every PC made in the past 40 years (although the GPU has sometimes been built into the CPU). Lately, they’ve been joined by a friend: the neural processing unit, or NPU.
The first implementations appeared in mobile chipsets in 2017. But as AI workloads have exploded onto the scene, NPUs have become more important. Today, it could be argued that the rise of this new type of processor is the most significant development in systems architecture in 50 years.
While the NPU is simpler and more specialized than either the CPU or GPU, it opens up a whole new dimension of computing capabilities, enabling the sort of complex on-device AI processing that a regular CPU would struggle with.
Microsoft has decreed 40 TOPS (trillions of operations per second) to be the minimum NPU performance for Copilot+ PCs.
This doesn’t mean that you can run a complete ChatGPT or Midjourney engine on your personal laptop, but it does mean that almost any application can now take advantage of the sort of AI processing functions that power those platforms—and it provides a standard hardware model for improving these capabilities. When you next buy any sort of consumer electronic device, from a high-end laptop to a smart TV or a home security gadget, there’s a good chance that it will include an NPU.
WHAT DOES AN NPU DO?
NPUs are designed for AI operations, which in practice means working with data structures called tensors. Entire books have been written about them, but the simple way to think about one is as a matrix of values with any number of dimensions.
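To make the idea of "any number of dimensions" concrete, here is a minimal sketch in plain Python that represents tensors as nested lists. The `shape` helper and the sample values are invented for illustration and assume every nested list is regular (no ragged rows); real AI frameworks use dedicated tensor types rather than lists.

```python
def shape(tensor):
    """Return the size of each dimension of a regular nested-list tensor."""
    dims = []
    while isinstance(tensor, list):
        dims.append(len(tensor))
        # Descend into the first element to measure the next dimension.
        tensor = tensor[0] if tensor else None
    return tuple(dims)


scalar = 7.0                       # 0 dimensions: a single value
vector = [1.0, 2.0, 3.0]           # 1 dimension: a row of values
matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]         # 2 dimensions: 2 rows x 3 columns
# 3 dimensions: a tiny 2 x 2 image with 3 RGB channel values per pixel
image = [[[255, 0, 0], [0, 255, 0]],
         [[0, 0, 255], [255, 255, 255]]]

for name, tensor in [("scalar", scalar), ("vector", vector),
                     ("matrix", matrix), ("image", image)]:
    print(name, shape(tensor))
```

The number of entries in each shape is the number of dimensions: none for the scalar, one for the vector, two for the matrix, and three for the image.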
While that may seem like quite an abstract concept, it happens to neatly mirror the way the neurons in our brains connect to store and process information, hence the description of these processors as ‘neural’. Rather than stepping sequentially through a big list of operations, as a CPU might, our brains and NPUs both work on large sets of information at once, identifying relationships between values and applying previously learnt connections to generate new outputs.
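As a rough sketch of what "applying previously learnt connections" can look like in arithmetic terms, the example below computes one layer of a simple neural network: each output is a weighted sum over every input value. The function name `dense_layer` and all the numbers are made up for illustration. Plain Python still steps through these sums one at a time; the point of an NPU is that its hardware can carry out many such multiply-and-add operations at once.

```python
def dense_layer(weights, inputs):
    """Compute one weighted sum per row of weights (one per output 'neuron')."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Hypothetical "learnt" connection strengths: 2 outputs, each linked to 3 inputs.
weights = [[0.5, -1.0, 2.0],
           [1.0, 0.0, -0.5]]
inputs = [2.0, 1.0, 4.0]

outputs = dense_layer(weights, inputs)
print(outputs)  # [8.0, 0.0]
```

Here the weights form a two-dimensional tensor and the inputs a one-dimensional one, so the whole layer amounts to a single matrix-by-vector multiplication.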