The future of AI is… analog? Upstart bagged $100M to push GPU brains on less power

AI chip startup EnCharge claims its analog artificial intelligence accelerators could rival desktop graphics cards while using only a fraction of their power. Impressive, at least on paper. The hard part is proving it in real life.

The outfit boasts that it has developed a novel in-memory computing architecture for AI inferencing, one that replaces traditional transistors with analog capacitors to achieve a claimed 20x performance-per-watt advantage over digital accelerators like GPUs.

EnCharge CEO Naveen Verma claims the company's inference chip delivers 150 TOPS of AI compute at 8-bit precision on just one watt. Scale it up to 4.5 watts, he says, and it can match a desktop GPU – at roughly 1/100th of the power consumption. At least, that's the pitch.

This isn't just theoretical. EnCharge was spun out of Verma's Princeton lab, where the technology was developed in conjunction with the US Defense Advanced Research Projects Agency (DARPA) and Taiwanese foundry giant TSMC. Verma told us the business has taped out several test chips to prove the architecture works.

"The products we're building are actually based on a fundamental technology that came out of my research lab," he said.

EnCharge, now flush with $100 million in fresh Series-B funding from Tiger Global, RTX, and others, plans to tape out its first production chips for PCs, mobile devices, and workstations this year.

According to Verma, the real difference lies in how and where the chip performs its computation. The bulk of generative-AI computation today is done using vast numbers of multiply-accumulate (MAC) units.

Traditional architectures are built from billions of transistor gates that operate on discrete values, a consequence of binary representation. Verma argues this approach can be made more efficient and precise by computing on continuous values instead. EnCharge's MACs use analog capacitors, which can represent a continuous range of values via their stored charge. And because a capacitor is essentially just two conductors separated by a dielectric, Verma explained, they can be etched into silicon with existing CMOS processes.
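To make the distinction concrete, here's a toy Python sketch contrasting the two approaches. This is an illustrative model only, not EnCharge's circuit; the noise term is our own assumption standing in for real analog non-idealities:

```python
# Illustrative sketch only: a digital MAC on quantized integers versus
# an idealized "analog" MAC that accumulates continuous charge values.
# This is a conceptual model, not EnCharge's actual circuit.

def digital_mac(weights, activations):
    """Multiply-accumulate on discrete (e.g. 8-bit integer) values."""
    acc = 0
    for w, a in zip(weights, activations):
        acc += w * a
    return acc

def analog_mac(weights, activations, noise_per_product=0.0):
    """Idealized analog MAC: each product becomes a continuous charge
    on a capacitor, and the output is the total accumulated charge.
    Real analog circuits add noise; a constant per-product offset
    stands in for that here."""
    return sum(w * a + noise_per_product
               for w, a in zip(weights, activations))
```

With zero noise the two agree exactly; the engineering challenge in analog compute is keeping the real-world noise small enough to preserve 8-bit accuracy.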

EnCharge’s second design element is that analog computations are handled in memory.

The concept of in-memory computing is not new – several companies have been developing AI accelerators based on it for years. The idea is to embed compute, often in the form of a bunch of math circuits, directly into the memory, so matrices can be processed in place without shuttling data back and forth.

EnCharge's design performs these calculations with its analog capacitors, summing their charges to produce the result, Verma said.
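Here's a minimal Python model of that "compute in place" idea, where each output is read off as the summed charge of one row of stored weights – a conceptual sketch, not EnCharge's implementation:

```python
# Conceptual model of in-memory analog matrix-vector multiplication.
# Weights stay where they are stored; each output line reads the sum
# of per-cell contributions (charge), so no data shuttles to a
# separate ALU. Hypothetical illustration, not EnCharge's design.

def in_memory_matvec(weight_rows, activations):
    """Compute W @ x by summing per-cell products 'in place'."""
    return [sum(w * a for w, a in zip(row, activations))
            for row in weight_rows]
```

`in_memory_matvec([[1, 2], [3, 4]], [5, 6])` returns `[17, 39]` – the same matrix-vector product a GPU would compute, minus the round trips between memory and compute units.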

As for programmability, EnCharge says its chip supports a wide range of AI workloads, from convolutional neural networks to the transformer architectures behind large language models and diffusion models.

The design of an inference chip varies depending on its target workload. For some workloads, factors like memory capacity and bandwidth have a greater impact on performance than raw compute.

Large language models, for example, tend to be heavily memory-bound, with memory capacity and bandwidth mattering more to perceived performance than how many TOPS the chip can churn out. Verma says an EnCharge part aimed at these workloads might dedicate less die area to compute to make room for a wider memory bus.
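A back-of-the-envelope way to see why: compare the time it takes to stream a model's weights out of memory against the time it takes to do the math on them. The figures below are our own illustrative assumptions, not EnCharge's specs:

```python
# Rough roofline-style check: is a workload memory-bound or
# compute-bound? All numbers here are illustrative assumptions.

def bottleneck(weight_bytes, bandwidth_gb_s, tops, ops_per_byte=2):
    """Compare weight-streaming time against compute time per pass."""
    memory_time = weight_bytes / (bandwidth_gb_s * 1e9)           # seconds
    compute_time = (weight_bytes * ops_per_byte) / (tops * 1e12)  # seconds
    return "memory-bound" if memory_time > compute_time else "compute-bound"

# A 7B-parameter model at 8 bits (7 GB of weights), 100 GB/s of
# bandwidth, 150 TOPS: streaming the weights takes ~70 ms per pass,
# while the math takes well under 1 ms -- decisively memory-bound.
print(bottleneck(7e9, 100, 150))   # -> memory-bound
```

On those assumed numbers, piling on more TOPS does nothing; only more bandwidth moves the needle – which is the trade-off Verma describes.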

Diffusion models, by contrast, aren't as memory-bound, so generating images faster calls for more compute. For now, EnCharge is sticking to M.2 and PCIe add-in cards for their ease of integration – we've seen lower-power accelerators in these form factors before, such as Google's Coral and Hailo's NPUs. Verma said the technology can scale to larger, higher-wattage devices down the line: "Fundamentally, the ability to grow to 75 watt PCIe cards and so on is all there."

EnCharge's first batch of production chips is slated to tape out by the end of the year. Even so, it may be a while before they're widely adopted, as the startup is still working to integrate its chips into customers' designs and build out its software pipeline. ®

