Sunday, December 22, 2024

2024 physics Nobel for work on artificial neural networks | Explained


Representative illustration. Photo credit: Growtika/Unsplash

The story so far: On October 8, John Hopfield and Geoffrey Hinton won the 2024 Nobel Prize for physics “for foundational discoveries and inventions that enable machine learning with artificial neural networks”. Their work lies at the roots of a large tree of work, the newest branches of which we see today as artificially intelligent (AI) apps like ChatGPT.

What is AI?

An accessible AI today is likely to be an implementation of an artificial neural network (ANN) — a collection of nodes designed to operate like networks of neurons in animal brains. Each node is a site where some input data is processed according to fixed rules to produce an output. A connection between nodes allows them to transfer input and output signals to each other. Stacking multiple layers of nodes, with each layer performing a specific task with great attention to detail, creates a machine capable of deep learning.

The popular imagination of AI today is in terms of computing: AI represents what computers like those in smartphones can do today that they weren’t able to yesterday. These abilities are also beginning to surpass what humans are capable of. So it is a pleasant irony that the foundations of contemporary AI, for which Hopfield and Hinton received this year’s physics Nobel Prize, are in machines that started off doing things humans were better at — pattern recognition — and based on ideas in statistical physics, neurobiology, and cognitive psychology.

What is the Hopfield network?

In 1949, Canadian psychologist Donald Hebb introduced a neuropsychological theory of learning to explain the ability of connections between neurons to strengthen or weaken. Hebb posited that a connection, or synapse, between two neurons becomes more efficient if the neurons constantly talk to each other. In 1982, Hopfield developed an ANN whose nodes used Hebb’s postulate to learn by association. For example, if such a network is exposed to many texts, one set in English and the other its Tamil translation, it could use Hebbian learning to conclude “hand” and “kai” are synonymous because they appear together most often.

Another distinguishing feature of a Hopfield network is information storage. When the network is ‘taught’ an image, it stores the visual in a ‘low-energy state’ created by adjusting the strengths of the nodes’ connections. When the network encounters a noisy version of the image, it produces the denoised version by progressively moving it to the same low-energy state. The use of ‘energy’ here is an echo of the fact that the Hopfield network is similar in form and function to models researchers have used to understand materials called spin glasses. A low-energy state of a Hopfield network — which corresponds to its output — could map to the low-energy state of a spin glass modelled by the same rules.
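The storage-and-recall idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Hopfield's original formulation: the function names (train_hebbian, energy, recall), the eight-value pattern standing in for an 'image', and the choice of +1/-1 node values are all hypothetical choices made for this example.

```python
def train_hebbian(patterns):
    """Build connection strengths from +1/-1 patterns using a Hebbian rule:
    nodes that take the same value together get a stronger connection."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no node is connected to itself
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def energy(weights, state):
    """Energy of a network state; stored patterns sit in low-energy states."""
    n = len(state)
    return -0.5 * sum(
        weights[i][j] * state[i] * state[j] for i in range(n) for j in range(n)
    )


def recall(weights, state, max_sweeps=10):
    """Update one node at a time until nothing changes.
    With symmetric weights, each update can only keep or lower the energy."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(weights[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# A toy 'image' of eight pixels (assumed data, for illustration only).
stored = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hebbian([stored])

# Corrupt two pixels, then let the network settle.
noisy = list(stored)
noisy[0] = -noisy[0]
noisy[3] = -noisy[3]
restored = recall(weights, noisy)
```

In this toy case the noisy input starts at a higher energy than the stored pattern, and the node-by-node updates walk it back down to the stored pattern.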

Hopfield’s mapping was a considerable feat because it allowed researchers to translate ideas from statistical physics, neuropsychology, and biology to a form of cognition.

What is a Boltzmann machine?

Hinton’s share of the Nobel Prize is due to his hand in developing the first deep-learning machines. But as with Hopfield standing on Hebb’s shoulders, Hinton stood on those of Ludwig Boltzmann, the Austrian physicist who developed statistical mechanics. In 1872, Boltzmann published an equation to predict, say, the possible behaviours of a tub of fluid with one end hotter than the other. Whereas a naive first guess would be that all the possible states this system can take are equally probable, Boltzmann’s equation predicts that some states are more probable than others because the system’s energy prefers them.
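The idea that energy makes some states more probable than others can be illustrated with the Boltzmann distribution, the standard statistical-mechanics weighting after which Boltzmann machines are named. Note that this is a swapped-in illustration and not the 1872 equation itself. The function name, the three example energies, and the convention of measuring temperature in units where Boltzmann's constant equals 1 are assumptions made for this sketch.

```python
import math


def boltzmann_probabilities(energies, temperature=1.0):
    """Probability of each state, proportional to exp(-energy / temperature).
    Lower-energy states receive a larger share of the probability."""
    factors = [math.exp(-e / temperature) for e in energies]
    total = sum(factors)  # normalising constant (the 'partition function')
    return [f / total for f in factors]


# Three hypothetical states with increasing energy.
energies = [0.0, 1.0, 2.0]
probs = boltzmann_probabilities(energies)

# Raising the temperature flattens the distribution towards 'equally probable'.
hot_probs = boltzmann_probabilities(energies, temperature=100.0)
```

At a low temperature the lowest-energy state dominates. At a very high temperature the probabilities approach the equal-odds guess mentioned above.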

In the mid-1980s, Hinton and his peers, notably Terry Sejnowski, developed an ANN with a tendency to move towards some outcomes over others by using Boltzmann’s equation to process its inputs. Their network had a set of visible nodes, which could input and output information, and a set of hidden nodes that only interacted with other nodes. The visible nodes worked like a Hopfield network whereas the hidden nodes modelled new possibilities using Boltzmann’s equation. This was the dawn of generative AI.

In another breakthrough in the 2000s, Hinton & co. devised a form of the Boltzmann machine where the hidden nodes were connected only to visible nodes, and vice versa. These restricted Boltzmann machines (RBMs) could learn more efficiently, especially using the contrastive divergence algorithm Hinton et al. developed. Hinton, Simon Osindero, and Yee-Whye Teh also found that ‘layers’ of ANNs could be trained using RBMs and then stacked to create a deep learning model.
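A rough sketch of how a restricted Boltzmann machine might be trained with one step of contrastive divergence (often written CD-1) follows. It is a simplified, assumption-laden illustration and not the published implementation: the function names, the six-pixel training examples, the two hidden nodes, the learning rate, and the number of passes are all hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample(probabilities, rng):
    """Turn each probability into a 0/1 value by a biased coin toss."""
    return [1 if rng.random() < p else 0 for p in probabilities]


def hidden_probs(weights, hidden_bias, visible):
    # 'Restricted': each hidden node sees only visible nodes, never other hidden nodes.
    return [
        sigmoid(hidden_bias[j] + sum(visible[i] * weights[i][j] for i in range(len(visible))))
        for j in range(len(hidden_bias))
    ]


def visible_probs(weights, visible_bias, hidden):
    return [
        sigmoid(visible_bias[i] + sum(hidden[j] * weights[i][j] for j in range(len(hidden))))
        for i in range(len(visible_bias))
    ]


def cd1_step(weights, visible_bias, hidden_bias, data, rng, learning_rate=0.1):
    """One CD-1 update: compare statistics driven by the data with those
    driven by the network's own one-step reconstruction of the data."""
    h0_probs = hidden_probs(weights, hidden_bias, data)          # data-driven phase
    h0 = sample(h0_probs, rng)
    v1 = sample(visible_probs(weights, visible_bias, h0), rng)   # reconstruction
    h1_probs = hidden_probs(weights, hidden_bias, v1)            # model-driven phase
    for i in range(len(data)):
        for j in range(len(hidden_bias)):
            weights[i][j] += learning_rate * (data[i] * h0_probs[j] - v1[i] * h1_probs[j])
        visible_bias[i] += learning_rate * (data[i] - v1[i])
    for j in range(len(hidden_bias)):
        hidden_bias[j] += learning_rate * (h0_probs[j] - h1_probs[j])


rng = random.Random(0)  # fixed seed so the run is repeatable
n_visible, n_hidden = 6, 2
weights = [[rng.gauss(0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
visible_bias = [0.0] * n_visible
hidden_bias = [0.0] * n_hidden

# Two toy patterns (assumed data): 'left half on' and 'right half on'.
training_data = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
for _ in range(200):
    for example in training_data:
        cd1_step(weights, visible_bias, hidden_bias, example, rng)

# The network's probabilistic reconstruction of the first pattern.
reconstruction = visible_probs(
    weights, visible_bias, hidden_probs(weights, hidden_bias, training_data[0])
)
```

The weight table has one entry per visible-hidden pair and nothing else, which is the restriction that makes each update cheap. Stacking several such trained layers is the step the paragraph above credits to Hinton, Osindero, and Teh. That stacking is not shown here.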

Where are ANNs today?

Technologies evolve through successive levels of abstraction. The individual computer of the late 1980s is today part of the cloud, a distributed network of computing sites linked by different types of data networks and managed using both software and hardware controls. ANNs are the product of a similar abstraction, which Hopfield and Hinton helped achieve, and have in turn been further transformed. Thus they are within the reach of millions of people but also bear less resemblance to their ancestors.

Advances in this area have benefited from the work of multiple teams and ideas, so much so that drawing a straight line from Hopfield’s and Hinton’s work to ChatGPT is impossible. One notable new form of ANN is the transformer, which in its original form is a two-part neural network that encodes and then decodes information. It underpins large language models like ChatGPT and also has valuable applications in object (including facial) detection and recognition. Other important developments include backpropagation, a technique that allows ANNs to adjust the strengths of their connections based on the errors in their outputs, and the long short-term memory, which enables ANNs to ‘remember’ some information across many steps.

ANNs are also on our minds. Hinton has said he’s “worried the overall consequence … might be systems more intelligent than us that eventually take control.” He left Google in 2023 to spread awareness of AI’s risks. Hopfield has expressed similar sentiments. Why do it then? Presumably because the tree is big and it’s impossible to see the branches sitting at the roots. In an essay published in the journal Physical Biology exactly a decade before Hopfield won his Nobel, he recalled that line from Casablanca: “‘Of all the gin joints, in all the towns, in all the world, she walks into mine.’ Own the right gin joint.” Hopfield did and that was that.
