Can an artificial intelligence teach itself to become a "scientist" even more brilliant than Einstein? Some say that overcoming the limitations of AI requires building "a bridge between computer science and biology."
The Technology Review website recently published an article on the development of deep learning and its limitations. Deep learning is all the rage and has produced impressive results, but industry insiders point out that today's deep learning resembles "engineering before physics existed." The following is a summary of the original article:
Nearly every advance in the field of artificial intelligence (AI) today rests on a breakthrough made 30 years ago. To keep up the pace of AI progress, the field will need to get past some major limitations.
Einstein in the AI field

The Vector Institute, located in downtown Toronto, Canada, will open this fall and aims to become the global center stage for AI. American and Canadian companies such as Google, Uber, and Nvidia will sponsor efforts to commercialize AI technology at the institute.
Money has poured in faster than the center's co-founder, Jordan Jacobs, expected. Two of the center's other co-founders surveyed companies in the Toronto area and found that the region's demand for AI experts is 10 times the number of experts Canada produces each year. The world is riding a wave of deep learning, and the institute hopes to stand at the center of that wave: to focus on the technology, teach it, refine it, and apply it. Data centers are being built, startups are moving in, and students are flocking to the field.
Geoffrey Hinton, the father of deep learning, also lives in Toronto. Jacobs said: "Thirty years from now, when we look back, we will say that Hinton is the Einstein of AI and of deep learning."
Hinton's former students run the AI labs at Apple, Facebook, and OpenAI, and Hinton himself is the chief scientist of the Google Brain AI team. In fact, almost every achievement of AI over the past decade, in translation, speech recognition, image recognition, and game playing, traces back to foundations Hinton laid.
The main idea of deep learning was actually put forward 30 years ago. In 1986, Hinton and his colleagues David Rumelhart and Ronald Williams published a groundbreaking paper describing a technique called "backpropagation." In the words of Jon Cohen, a computational psychologist at Princeton University, this technique is "the basis of all deep learning."
That mid-1980s paper described how to train multilayer neural networks, and it laid the foundation for the field's development and progress over the past decade.
Deep learning is backpropagation

In a sense, AI today is deep learning, and deep learning is backpropagation. It may seem incredible that a technique could lie dormant for so long and then suddenly take off explosively. One view is this: perhaps we are not at the beginning of a revolution but at the end of one.
Hinton is from the UK. He worked at Carnegie Mellon University in Pittsburgh and moved to Toronto in the 1980s; he likes the atmosphere of the city.
Hinton said that he recently made a major breakthrough on a project: "I found a very good junior engineer to work with me." The engineer, Sara Sabour, is Iranian; her application for a US work visa was rejected, and Google took her on at its Toronto office.
In the 1980s, Hinton was already an expert on neural networks, which are greatly simplified models of the brain's networks of neurons and synapses. The earliest neural network, the "perceptron," was developed in the 1950s and hailed as a first step toward machine intelligence. By the 1980s, however, the field was convinced that neural networks were a dead end for AI research.
In 1969, Marvin Minsky and Seymour Papert of the Massachusetts Institute of Technology, in a book called Perceptrons, proved mathematically that such networks could perform only the most basic functions. Those networks had just two layers of neurons, an input layer and an output layer. A network with more layers between the input and output neurons could, in theory, solve many more problems, but no one knew how to train one, so in practice such networks were useless. Apart from a few holdouts like Hinton, Perceptrons led most people to give up on neural networks entirely.
In 1986, Hinton made a breakthrough, showing that backpropagation could train deep neural networks (networks with more than two or three layers). But it took another 26 years for computing power to reach the point where this breakthrough could be fully exploited. A 2012 paper by Hinton and two of his students showed that deep neural networks trained with backpropagation beat the most advanced systems at image recognition. "Deep learning" has been a craze ever since. To the outside world, AI seemed to blossom overnight. For Hinton, it was a long-delayed payoff.
The principle of neural networks

Neural networks are often described as a multi-layered sandwich, with layers stacked one on top of another. The layers contain artificial neurons: tiny computational units that can become excited (the way a real neuron fires) and pass that excitement on to the other neurons they are connected to. A neuron's excitement is represented by a number, such as 0.13 or 32.39. There is also a key number attached to the connection between every pair of neurons, which determines how much excitement can pass from one to the other. That number models the strength of a synapse between neurons in the brain: the higher it is, the stronger the connection, and the more excitement flows across it. A minimal sketch of a single artificial neuron appears below.
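The following toy example, a sketch in Python with NumPy, shows how one artificial neuron combines the excitement of upstream neurons through connection weights; the specific values and the ReLU activation are illustrative assumptions, not details from the article.

```python
import numpy as np

# Excitement of three upstream neurons (illustrative values).
upstream = np.array([0.13, 32.39, 1.50])

# Connection weights: how much excitement each connection passes along.
weights = np.array([0.80, 0.05, -0.30])

# The neuron sums the weighted excitement it receives...
total_input = np.dot(weights, upstream)

# ...and applies a simple activation (a ReLU here, one common choice)
# to decide how strongly it fires in turn.
excitement = max(0.0, total_input)
print(excitement)  # roughly 1.27
```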
One of the most successful applications of deep neural networks is image recognition. Consider a program that determines whether there is a hot dog in an image, something that was impossible ten years ago. The first step in building such a program is to get a photo. For simplicity, use a black-and-white image 100 pixels wide and 100 pixels high. You feed this image into the neural network by setting the excitement of each neuron in the input layer to match the brightness of the corresponding pixel. The bottom layer of the multi-layered sandwich therefore has 10,000 neurons (100 x 100), one for the brightness of each pixel in the image.
You then connect this layer to another layer of neurons above it (say a few thousand of them), that layer to yet another layer (another few thousand neurons), and so on. Finally, the top layer of the sandwich, the output layer, has just two neurons: one for "hot dog" and one for "no hot dog." The idea is to teach the network to excite the "hot dog" neuron only when the picture contains a hot dog, and the "no hot dog" neuron only when it does not. Backpropagation is the technique that makes this possible; a sketch of the layered architecture itself appears below.
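Here is a minimal sketch of that stack of layers in Python with NumPy. The hidden-layer sizes, the random weight initialization, and the ReLU activation are illustrative assumptions; the article only specifies the 10,000 input neurons and the two output neurons.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes: 10,000 input pixels (100 x 100), two hidden layers of
# arbitrary size, and 2 output neurons ("hot dog" / "no hot dog").
sizes = [10_000, 1_000, 100, 2]

# One weight matrix per pair of adjacent layers, initialized randomly,
# like synapses that have not been tuned yet.
weights = [rng.normal(0.0, 0.01, (m, n)) for n, m in zip(sizes[:-1], sizes[1:])]

def forward(pixels):
    """Let the excitement filter up through the sandwich of layers."""
    activation = pixels
    for W in weights:
        activation = np.maximum(0.0, W @ activation)  # ReLU nonlinearity
    return activation  # excitement of the "hot dog" and "no hot dog" neurons

fake_image = rng.random(10_000)  # stand-in for a flattened 100x100 photo
print(forward(fake_image))       # two numbers; untrained, so meaningless
```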
How backpropagation is used

Backpropagation itself is quite simple, although it works best when there are huge amounts of data available. That is why big data matters so much in AI, and why Facebook and Google are so hungry for it.
To train the network you need millions of pictures, some containing hot dogs and some not, with the hot-dog pictures labeled as such. In the initial network, the weights of the connections between neurons (which determine how much excitement each connection passes along) might be random numbers, as if the brain's synapses had not yet been tuned. The goal of backpropagation is to adjust these weights so that the network behaves well: when you feed a hot-dog picture into the bottom layer, the "hot dog" neuron at the top ends up excited.
Suppose the first training picture you pick shows a piano. You convert the pixel intensities of this 100x100 image into 10,000 numbers, one for each of the 10,000 neurons in the bottom layer of the network. The excitement then filters up through the network along the weighted connections between adjacent layers until it reaches the top layer: the two neurons that decide whether there is a hot dog in the picture. Since the picture shows a piano, ideally the "hot dog" neuron should end up near zero and the "no hot dog" neuron should end up with a high number. But suppose the network does badly and reaches the wrong conclusion about this photo. You then use backpropagation to re-adjust the weight of every connection in the network so as to correct the error.
It works by starting with the two output neurons and working out how wrong they are: what their excitement should have been, what it actually was, and how large the gap is. You then examine every connection feeding into those neurons (and into the neurons in the layer below) and work out how much each one contributed to the error. You keep analyzing in this way, layer by layer, until you reach the first layer at the bottom of the network. At that point you know how much each individual connection contributed to the overall error, and you nudge every weight in the direction that reduces that error. The technique is called "backpropagation" because you start at the output and propagate the error analysis backward through the network. The sketch below walks through such updates on a tiny network.
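To make the procedure concrete, here is a minimal sketch of backpropagation on a tiny two-layer network in Python with NumPy. The network size, the sigmoid activation, the squared-error measure, and the learning rate are illustrative assumptions rather than details from the article.

```python
import numpy as np

rng = np.random.default_rng(1)

# A miniature version of the hot-dog network: 4 input "pixels",
# 3 hidden neurons, 2 output neurons ("hot dog", "no hot dog").
W1 = rng.normal(0.0, 0.5, (3, 4))
W2 = rng.normal(0.0, 0.5, (2, 3))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.2, 0.9, 0.4, 0.7])  # a stand-in for the piano picture
target = np.array([0.0, 1.0])       # ideal output: "no hot dog"

learning_rate = 0.5
for step in range(1000):
    # Forward pass: excitement filters up through the layers.
    h = sigmoid(W1 @ x)
    y = sigmoid(W2 @ h)

    # How wrong are the two output neurons?
    error = y - target

    # Backward pass: figure out how much each connection contributed
    # to the error, starting from the output layer...
    delta_out = error * y * (1.0 - y)
    grad_W2 = np.outer(delta_out, h)

    # ...then pass the blame down to the connections below.
    delta_hidden = (W2.T @ delta_out) * h * (1.0 - h)
    grad_W1 = np.outer(delta_hidden, x)

    # Nudge every weight in the direction that reduces the error.
    W2 -= learning_rate * grad_W2
    W1 -= learning_rate * grad_W1

# After training, the output is close to [0, 1]: "no hot dog".
print(sigmoid(W2 @ sigmoid(W1 @ x)))
```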
The magic and stupidity of neural networks

The wonderful thing is that when you run this procedure over tens of millions, even billions, of images, the network becomes very good at telling whether an image contains a hot dog. Even more remarkably, the layers of an image-recognition network begin to "see" images somewhat the way the human visual system does. The first layer may detect edges: its neurons fire when an edge is present and stay quiet when it is not. The layer above may detect combinations of edges, such as a corner; the layer above that may begin to see shapes; and a still higher layer may begin to recognize things like an open bun. In other words, without the programmer explicitly building it in, the network forms this hierarchy of features on its own.
The thing to remember is that although these deep-learning systems can sometimes seem smart, they are still stupid. If a program automatically labels a picture of a pile of donuts on a table as "a pile of donuts on a table," you might think it is intelligent. But when the same program sees a picture of a girl brushing her teeth and labels it "boy holding a baseball bat," you discover how little it understands about the world.
Neural networks are just unthinking fuzzy pattern recognizers, and you can fold them into almost any kind of software. But the intelligence they contain is limited, and they are easily fooled: change a single pixel and a deep network that recognizes images can be thrown off completely. Even as we find more ways to apply deep learning, we keep running into its limits. Self-driving cars may fail to cope with road conditions they have never seen before, and machines cannot parse sentences that require common sense to understand.
To some extent, deep learning mimics what happens in the human brain, but only at a shallow level, which may explain why its intelligence can seem so limited. Backpropagation was not discovered by probing deep into the brain and deciphering thought itself; it grew out of models of how animals learn by trial and error in conditioning experiments. Most of its great leaps forward have come not from new insights into neuroscience but from years of accumulated improvements in mathematics and engineering. Our understanding of intelligence is a drop in the ocean compared with what we do not yet know.
"Engineering before physics"David Duvenaud, an assistant professor at the University of Toronto, says that deep learning is like "engineering before physics." He explained this: "Someone wrote an article saying, 'I made this bridge!' Another person made a paper: 'I made this bridge, it fell down - then I added The pillar, it stands up. 'Then the pillar became a big hit. Some people thought of using the bridge arch, 'the bridge arch is great!' But after you have the physics, you understand how to build a bridge can not fall, why He said that until recently, the artificial intelligence community began to enter the stage of actually understanding it.
Hinton believes that overcoming AI's limitations means building a "bridge between computer science and biology." Seen this way, backpropagation was a triumph of biologically inspired computing: its original inspiration came not from engineering but from psychology. Now Hinton is exploring a new approach.
Today's neural networks are built from huge flat layers, but in the human neocortex real neurons are arranged not only in horizontal layers but also in vertical columns. Hinton believes he knows what those columns are for (for example, they may let us recognize objects even when our viewpoint changes), so he is building an analogous structure he calls "capsules" to test the theory. So far, capsules have not dramatically improved the performance of neural networks. Then again, the backpropagation he proposed 30 years ago did not show astonishing results until recently either.
"It may not work for a while," he said of the capsule theory.