Neural networks programmed directly into computer chip hardware can identify images faster, and with far less energy, than the traditional neural networks that underpin many modern AI systems, according to research presented last week at a leading machine-learning conference in Vancouver.
Neural networks, including systems like GPT-4 and Stable Diffusion, are built from perceptrons: highly simplified simulations of neurons. Perceptrons are very powerful, but they also consume enormous amounts of energy; Microsoft has signed a deal to reopen the Three Mile Island nuclear plant to power its AI advances.
Part of the trouble is that perceptrons are software abstractions: running a perceptron network on a GPU requires translating that network into the language of hardware, which takes time and energy. Building a network directly out of hardware components does away with a lot of those costs. One day, such networks could even be built directly into the chips of smartphones and other devices, drastically reducing the need to send data to and from servers.
Felix Petersen, who did this work as a postdoctoral research fellow at Stanford University, has a plan to make that happen. He designed networks made up of logic gates, the basic building blocks of computer chips. Composed of a handful of transistors each, logic gates take two bits, 1s or 0s, as inputs and, according to the specific pattern of their transistors, output a single bit. Like perceptrons, logic gates can be chained together into networks, and logic-gate networks are cheap, quick, and easy to run. In his presentation at the Neural Information Processing Systems conference (NeurIPS), Petersen said that they consume hundreds of thousands of times less energy than perceptron networks.
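To make the idea concrete, here is a minimal Python sketch of what a small, already-trained chain of logic gates looks like. The gate choices and wiring are invented for illustration; they are not taken from Petersen's actual design.

```python
# A toy, already-trained logic-gate network: each node reads two earlier bits
# and applies one fixed two-input gate. Gate choices and wiring are invented
# for illustration, not Petersen's architecture.

GATES = {
    "AND":  lambda a, b: a & b,
    "OR":   lambda a, b: a | b,
    "XOR":  lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
    # ...there are 16 possible two-input gates in total
}

def gate_layer(bits, wiring):
    """One layer: each node picks two input bits and applies its gate."""
    return [GATES[gate](bits[i], bits[j]) for gate, i, j in wiring]

layer1 = [("AND", 0, 1), ("XOR", 1, 2), ("OR", 2, 3)]   # 4 bits in, 3 bits out
layer2 = [("NAND", 0, 1), ("XOR", 1, 2)]                # 3 bits in, 2 bits out

x = [1, 0, 1, 1]
print(gate_layer(gate_layer(x, layer1), layer2))  # [1, 0]
```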
So far, logic-gate networks perform less well than traditional neural networks at tasks such as image labeling. But the approach's speed and efficiency make it promising, says Zhiru Zhang, a professor of electrical and computer engineering at Cornell University. “If we close the gap, this could potentially open up many possibilities for edge machine learning,” says Zhang.
Petersen didn't set out to find energy-efficient ways to build AI networks. He became interested in logic gates while looking for strategies to transform certain mathematical problems into a form that calculus could solve. “It started as a mathematical curiosity and methodological curiosity,” he says.
That approach was not a natural fit for backpropagation, the algorithm that powered the deep-learning revolution. Backpropagation is built on calculus, so it can't be used directly to train logic-gate networks: logic gates deal only in 0s and 1s, while calculus demands answers that include all the fractions in between. To get around this, Petersen “relaxed” the logic-gate networks, creating functions that behave like logic gates but can take in and put out values between 0 and 1, which allows backpropagation. He trained simulated networks built from these relaxed gates and then converted the trained, relaxed logic-gate network back into something he could implement in computer hardware.
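The snippet below is a rough illustration of the relaxation trick: real-valued stand-ins for a few gates that reproduce each gate's truth table when the inputs are exactly 0 or 1 but vary smoothly in between, so gradients can flow through them. These particular formulas are a common choice of relaxation and are meant only as a sketch, not a reproduction of Petersen's exact implementation.

```python
# "Relaxed" logic gates: differentiable functions that match the Boolean gates
# at the corners (inputs exactly 0 or 1) and take fractional values in between.

def and_relaxed(a, b):   # equals a AND b when a, b are 0 or 1
    return a * b

def or_relaxed(a, b):    # equals a OR b when a, b are 0 or 1
    return a + b - a * b

def xor_relaxed(a, b):   # equals a XOR b when a, b are 0 or 1
    return a + b - 2 * a * b

# At the corners the relaxed gates reproduce the truth tables exactly...
assert (and_relaxed(1, 0), or_relaxed(1, 0), xor_relaxed(1, 1)) == (0, 1, 0)
# ...and in between they give smooth fractional outputs that calculus can handle.
print(and_relaxed(0.9, 0.7), or_relaxed(0.9, 0.7))  # 0.63 0.97
```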
The disadvantage of this approach is that the relaxed networks are difficult to train. Each node in the network could end up as any one of 16 different types of logic gate, and the 16 probabilities associated with those gate types must all be tracked and continually adjusted. Petersen said that training his networks takes hundreds of thousands of times more GPU time than training conventional neural networks, an amount that is hard to come by at universities, which can't afford hundreds of thousands of GPUs. Petersen, who developed the networks in collaboration with colleagues at Stanford University and the University of Konstanz, says that this made the research difficult.
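To see why that bookkeeping is expensive, here is a hedged sketch of a single node during training: it carries one learnable score per candidate gate, and its output is the probability-weighted mixture of all the relaxed gates (abbreviated to four here). The parameter names and the exact mixture are illustrative assumptions, not Petersen's code.

```python
import math

# One trainable node: learnable logits (one per candidate gate) are turned into
# probabilities, and the node outputs the weighted mix of every relaxed gate.
# Gate list abbreviated to 4 of the 16; details are illustrative assumptions.

RELAXED_GATES = [
    lambda a, b: a * b,              # AND
    lambda a, b: a + b - a * b,      # OR
    lambda a, b: a + b - 2 * a * b,  # XOR
    lambda a, b: 1 - a * b,          # NAND
]

def node_forward(a, b, logits):
    exps = [math.exp(l) for l in logits]
    probs = [e / sum(exps) for e in exps]             # softmax over gate types
    return sum(p * g(a, b) for p, g in zip(probs, RELAXED_GATES))

def harden(logits):
    """After training, keep only the most probable gate; that is what goes on the chip."""
    return RELAXED_GATES[max(range(len(logits)), key=lambda i: logits[i])]

print(node_forward(0.9, 0.2, [0.5, -1.0, 2.0, 0.0]))  # a fractional, trainable output
print(harden([0.5, -1.0, 2.0, 0.0])(1, 0))            # hardened gate is XOR -> 1
```

Every one of those per-node probabilities has to be nudged by backpropagation at each training step, across every node in the network, which is where the extra GPU time goes.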
Once the network is trained, however, things get much cheaper. Petersen compared his logic-gate networks with other ultra-efficient approaches, such as binary neural networks, which use simplified perceptrons that can process only binary values. The logic-gate networks performed just as well as those methods at classifying images from the CIFAR-10 dataset, which includes ten categories of low-resolution pictures ranging from “frog” to “truck,” and did so in less than a tenth of the time, using a fraction of the logic gates the other methods require. Petersen tested the networks on FPGAs, programmable computer chips that can be used to emulate many different possible patterns of logic gates. Implementing them in non-programmable ASIC chips would reduce costs even further, because programmable chips need extra components to achieve their flexibility.
Farinaz Koushanfar, a professor of electrical and computer engineering at the University of California, San Diego, isn't convinced that logic-gate networks will perform well when faced with more realistic problems. “It's a cute idea, but I'm not sure how well it scales,” she says. She notes that the logic-gate networks can only be trained approximately, via the relaxation strategy, and approximations can fail. That hasn't caused big problems yet, Koushanfar says, but it could become more of an issue as the networks grow.
Petersen, however, is ambitious. He wants to keep pushing the capabilities of his logic-gate networks, and he hopes, eventually, to create what he calls a “hardware foundation model.” A powerful, general-purpose logic-gate network could be mass-produced directly on computer chips, and those chips could be built into devices like computers and smartphones. That could have enormous energy benefits, Petersen says. If such networks could reconstruct photos and videos from low-resolution data, for example, then far less data would need to be sent between personal devices and servers.
Petersen concedes that logic-gate networks will never compete with traditional neural networks on performance, but that isn't the goal. Making something that works and is as efficient as possible should be enough. “It will not be the best model,” he says. “But it should be the cheapest.”