o3 is here: is artificial general intelligence really within reach?
Article source: Ask nextquestion
Image source: Generated by AI
"How long will it take for machines to truly possess the cognitive capabilities of the human brain?” quot;This fundamental issue that has plagued the field of artificial intelligence for decades will once again become the focus of the global scientific and technological community at the end of 2024.
Even as artificial intelligence keeps breaking through in specific areas such as image recognition and natural language processing, a more challenging goal still lies ahead: enabling machines to generalize from a single example, reason about abstract concepts, and plan and deploy cognitive resources the way humans do.
Amid the ongoing debate about the limits of machine intelligence, OpenAI's recently released AI systems have injected new variables into this long-standing question. The San Francisco-based AI company, which gained fame for developing ChatGPT, released a new generation of large language model (LLM) system known as o1 in September. Just this month, industry reports indicated that OpenAI is developing a more powerful system codenamed o3, billed as a "prelude to artificial general intelligence (AGI)", and the project has attracted a new round of attention. Compared with previous AI models, the technical route from o1 to o3 shows an operating mechanism closer to the way human cognition works. These breakthroughs are redefining our understanding of the potential of artificial intelligence.
Once AGI is realized, it may bring unprecedented breakthroughs for humanity: from tackling climate change, to epidemic prevention and control, to conquering stubborn diseases such as cancer and Alzheimer's disease. However, such enormous power may also bring uncertainty and pose potential risks to humanity. Yoshua Bengio, a deep-learning researcher at the University of Montreal in Canada, said: "Human misuse or loss of control of AI could lead to serious consequences."
The revolutionary development of LLM in recent years has sparked speculation that AGI may be on the horizon. But some researchers say that given the way LLMs are built and trained, they are not enough on their own to achieve AGI, and “some key parts are still missing.”
There is no doubt that the question of AGI is more urgent and important today than ever before. "For most of my life, I believed that people who talked about AGI were unconventional," said Subbarao Kambhampati, a computer scientist at Arizona State University. "But now, everyone is talking about it. You can't call everyone 'unconventional.'"
Why the AGI debate has shifted
The term "artificial general intelligence" (AGI) entered the mainstream around 2007, when it appeared as the title of a book of the same name by AI researchers Ben Goertzel and Cassio Pennachin. Although the exact meaning of the term remains fuzzy, it usually refers to AI systems with human-like reasoning and generalization capabilities. For most of the history of artificial intelligence, AGI has remained an unachieved goal. The AlphaGo program developed by Google DeepMind, for example, was designed specifically for the game of Go. It has defeated top human players, but its superhuman ability is limited to Go; that is the only domain in which it excels.
The new capabilities of LLMs [1] are changing the picture. Like the human brain, LLMs have a broad range of capabilities, which has led some researchers to seriously consider that some form of AGI may be on the horizon [1], or may even already exist.
The breadth of these abilities is even more striking given that researchers only partially understand how LLMs achieve them. An LLM is a neural network loosely inspired by the human brain. It consists of layers of artificial neurons (or computing units), with the strength of the connections between layers represented by adjustable parameters. During training, powerful LLMs such as o1, Claude (developed by Anthropic) and Google's Gemini rely on a method called next-token prediction. The model is repeatedly fed text samples that have been split into chunks known as tokens. A token can be an entire word or simply a group of characters. The last token in each sequence is hidden, or "masked", and the model is asked to predict it. The training algorithm then compares the prediction with the masked token and adjusts the model's parameters so that it can make a better prediction next time.
This process is repeated over and over, typically using billions of fragments of conversation, scientific text and programming code, until the model can reliably predict the masked tokens. At this stage the model's parameters have captured the statistical structure of the training data and the knowledge it contains. The parameters are then fixed, and the model uses them to generate predictions for new queries, or "prompts", that did not necessarily appear in its training data, a process called inference.
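To make the training loop concrete, here is a minimal sketch of next-token prediction in PyTorch. The tiny vocabulary, toy data and small recurrent model are illustrative assumptions only; production LLMs use vastly larger Transformer networks trained on billions of tokens.

```python
# Minimal sketch of next-token-prediction training (illustrative only).
import torch
import torch.nn as nn

VOCAB_SIZE = 1000   # assumed toy vocabulary of token ids
EMBED_DIM = 64

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.rnn = nn.GRU(EMBED_DIM, EMBED_DIM, batch_first=True)
        self.head = nn.Linear(EMBED_DIM, VOCAB_SIZE)

    def forward(self, tokens):                # tokens: (batch, seq_len)
        h, _ = self.rnn(self.embed(tokens))   # contextual states
        return self.head(h[:, -1, :])         # logits for the next token

model = TinyLM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One training step: hide ("mask") the last token and learn to predict it.
batch = torch.randint(0, VOCAB_SIZE, (32, 16))     # toy token sequences
context, target = batch[:, :-1], batch[:, -1]      # all but last / masked token
logits = model(context)
loss = loss_fn(logits, target)                     # compare prediction with masked token
loss.backward()                                    # adjust the parameters
optimizer.step()
optimizer.zero_grad()
```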
The use of a neural network architecture called the Transformer has allowed LLMs' capabilities to significantly exceed earlier achievements. The Transformer allows a model to learn that certain tokens have a particularly strong influence on others, even when they are far apart in a text sample. This lets LLMs parse language in ways that seem to mimic how humans do, for example distinguishing the two meanings of the word "bank" in the following sentence: "When the river bank overflowed, the flood damaged the bank's ATM, making it impossible to withdraw money."
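The operation that makes this possible is self-attention. The sketch below shows the scaled dot-product form at the heart of the Transformer, in which every token can weigh every other token regardless of distance; the random toy embeddings are assumptions for illustration, not a trained model.

```python
# Scaled dot-product self-attention, the core Transformer operation.
# Each token attends to every other token, however far apart they are.
import numpy as np

def self_attention(x):
    """x: (seq_len, d) matrix of token embeddings (toy, untrained)."""
    d = x.shape[-1]
    q, k, v = x, x, x                       # real models use learned projections
    scores = q @ k.T / np.sqrt(d)           # pairwise token-to-token affinities
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the sequence
    return weights @ v                      # each output mixes all tokens

tokens = ["when", "the", "river", "bank", "overflowed", "the", "flood",
          "damaged", "the", "bank", "'s", "ATM"]
x = np.random.randn(len(tokens), 8)         # assumed toy embeddings
out = self_attention(x)
# In a trained model, the second "bank" would weight "ATM" heavily,
# while the first "bank" would weight "river" and "overflowed".
print(out.shape)   # (12, 8)
```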
This approach has produced remarkable results across a wide range of applications, such as generating computer programs that solve problems described in natural language, summarizing academic articles, and answering mathematical questions.
As LLMs grow in size, new capabilities emerge, raising the hope that, if LLMs are big enough, AGI may emerge too. One example is chain-of-thought (CoT) prompting. This involves showing the LLM how to break a complex problem into smaller steps to solve it, or simply prompting it to solve the problem step by step. However, for smaller LLMs this process yields little benefit.
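A minimal sketch of what chain-of-thought prompting looks like in practice is given below; the prompt wording and the worked example are illustrative assumptions, not OpenAI's actual prompts.

```python
# Chain-of-thought prompting: instead of asking for the answer directly,
# show the model a worked example (or ask it to reason step by step).

direct_prompt = "Q: A train travels 60 km in 45 minutes. What is its speed in km/h?\nA:"

cot_prompt = (
    "Q: A shop sells 3 pens for 2 dollars. How much do 12 pens cost?\n"
    "A: 12 pens is 4 groups of 3 pens. 4 groups x 2 dollars = 8 dollars. The answer is 8.\n\n"
    "Q: A train travels 60 km in 45 minutes. What is its speed in km/h?\n"
    "A: Let's think step by step."
)

# Sent to a sufficiently large LLM, cot_prompt tends to elicit the
# intermediate steps (60 km / 0.75 h = 80 km/h) and a more reliable answer;
# small models gain little from the same trick.
```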
LLM’s capabilities
According to OpenAI, CoT prompting has been integrated into o1's operating mechanism and is a core component of its power. Francois Chollet, a former Google AI researcher, pointed out that o1 is equipped with a CoT generator that produces a large number of candidate CoT prompts for a user query, together with a mechanism for selecting the best one.
During training, o1 learns not only how to predict the next token but also how to select the best CoT prompt for a given query. OpenAI says that, thanks to the introduction of CoT reasoning, o1-preview (an advanced version of o1) correctly solved 83% of the problems in a qualifying exam for the International Mathematical Olympiad, a world-renowned mathematics competition for high-school students. By contrast, OpenAI's previously most powerful model, GPT-4o, scored only 13% on the same exam.
However, despite the impressive complexity of o1, Kambhampati and Chollet believe that it still has obvious limitations and does not meet AGI standards.
For example, in tasks requiring multi-step planning, Kambhampati's team found that although o1 performed well on planning tasks of up to 16 steps, its performance dropped rapidly when the task complexity grew to 20 to 40 steps [2].
Chollet found similar limitations when he challenged o1-preview with his test of abstract reasoning and generalization, designed to assess progress towards AGI. The test takes the form of visual puzzles; solving them requires looking at examples to infer an abstract rule and applying it to similar new problems. Humans, the results show, do this far more easily. Chollet added: "LLMs cannot truly adapt to novelty because they basically lack the ability to dynamically recombine their knowledge to fit a new context."
Can LLMs lead to AGI?
So, are LLMs capable of eventually getting us to AGI?
It is worth noting that the underlying Transformer architecture can process not only text but also other types of information, such as images and audio, provided an appropriate way of tokenizing that data can be devised. Andrew Wilson, who studies machine learning at New York University, and his team point out that this may be related to a property shared by different types of data: such datasets have low "Kolmogorov complexity", meaning that the shortest computer program needed to generate them is short [3].
The team also found that Transformers are particularly good at learning data patterns with low Kolmogorov complexity, and that this ability grows as the model gets bigger. A Transformer's capacity to model a wide range of possibilities increases the chance that the training algorithm will find a good solution to a problem, and this "expressiveness" also grows with model size. Wilson said these are "some of the key ingredients needed for universal learning."
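Kolmogorov complexity itself is uncomputable, but compressed size is a standard practical proxy for it. The sketch below, using made-up toy data, only illustrates the intuition: structured data can be generated by a short program and therefore compresses well, whereas random data cannot.

```python
# Compressed size as a rough, computable proxy for Kolmogorov complexity:
# structured data has a short description; random noise does not.
import random
import zlib

structured = ("the cat sat on the mat. " * 400).encode()   # repetitive, low complexity
random.seed(0)
noise = bytes(random.randrange(256) for _ in range(len(structured)))  # incompressible

print(len(structured), len(zlib.compress(structured)))  # large -> much smaller
print(len(noise), len(zlib.compress(noise)))            # large -> still large
```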
Although Wilson believes AGI is still out of reach, he said that LLMs and other AI systems built on the Transformer architecture already have some of the key properties of AGI-like behavior.
However, Transformer-based LLMs also show some inherent limitations.
First, the data needed to train models is running out. Epoch AI, a San Francisco institute that studies AI trends, estimates [4] that publicly available text data for training may be exhausted between 2026 and 2032.
In addition, although LLMs keep growing in size, their performance no longer improves as much as it once did. It is unclear whether this is because there is less novelty left in the data (most of it having already been used) or for some other, unknown reason; the latter would be a bad sign for LLMs.
Raia Hadsell, vice-president of research at Google DeepMind in London, raises another issue. She points out that, however powerful Transformer-based LLMs are, their single objective of predicting the next token is too limited to deliver true AGI. She suggests that building models that generate solutions all at once, or in a more holistic manner, could bring us closer to AGI. The algorithms used to build such models are already at work in some existing non-LLM systems, such as OpenAI's DALL-E, which generates realistic, and sometimes surreal, images from natural-language descriptions. However, these systems cannot match the broad capabilities of LLMs.
Building AI’s world model
Neuroscientists offer important intuitions about the breakthroughs that may be needed to drive progress towards AGI. They believe that the root of human intelligence lies in the brain's ability to build a "world model", an internal representation of the surrounding environment. Such a model supports planning and reasoning by simulating different courses of action and predicting their consequences. It also allows skills learned in one domain to be generalized to completely new tasks by simulating multiple scenarios.
Some studies report evidence that a rudimentary world model may form inside LLMs. In one study [5], Wes Gurnee and Max Tegmark of the Massachusetts Institute of Technology found that a widely used LLM, trained on datasets containing information about many parts of the world, formed internal representations of the world it had read about. However, other researchers point out that there is currently no evidence that these LLMs use their world models to run simulations or learn causal relationships.
In another study [6], Harvard University computer scientist Kenneth Li and colleagues found that a small LLM trained on the moves made by players of the board game Othello learned to internally represent the state of the board, and used that representation to correctly predict the next legal move.
However, other research suggests that the world models built by today's AI systems may be unreliable. In one study [7], Harvard University computer scientist Keyon Vafa and his team trained a Transformer-based model on turn-by-turn data from New York City taxi trips; the model predicted the next turn in a route with close to 100% accuracy. By analyzing the turn sequences the model generated, the researchers found that it relied on an internal map to make its predictions. However, that internal map bears little resemblance to an actual map of Manhattan.
▷AI’s impossible streets. Source: [7]
“The map contains physically impossible street directions, as well as elevated roads that span other streets,” Vafa noted. When the researchers adjusted the test data to include unexpected detours that did not appear in the training data, the model was unable to predict the next turn, indicating that it was less able to adapt to new situations.
The importance of feedback
Dileep George, a member of Google DeepMind's AGI research team in Mountain View, California, points out that today's LLMs lack one key feature: internal feedback. The human brain has extensive feedback connections that allow information to flow in both directions between layers of neurons. Information from the sensory system flows up to higher levels of the brain to build a world model that reflects the environment, while information from the world model flows back down, guiding the acquisition of further sensory information. This two-way process is crucial to perception: the brain uses its world model to infer the likely causes of sensory input. The same process also supports planning, with the world model used to simulate different courses of action.
However, current LLMs can only use feedback in a bolted-on fashion. In o1, for example, the internal CoT prompting mechanism generates prompts and feeds them back to the LLM to help answer a query before the final answer is produced. But as Chollet's tests show, this mechanism does not guarantee reliable abstract reasoning.
Kambhampati and other researchers have tried attaching external modules called validators to LLMs. These modules check the answers an LLM generates in a specific context, such as verifying the feasibility of a travel plan; if the answer falls short, the validator asks the LLM to rerun the query [8]. Kambhampati's team found that LLMs paired with external validators produce significantly better travel plans than plain LLMs, but researchers have to design a dedicated validator for each task. "There is no universal validator," Kambhampati pointed out. By contrast, an AGI system might need to build its own validators to suit different situations, just as humans apply abstract rules to ensure correct reasoning on new tasks.
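A minimal sketch of this generate-then-validate loop is shown below. The `generate_plan` and `validate_plan` helpers are hypothetical stand-ins, not Kambhampati's actual code; note that the validator is hand-written for one task, which is exactly the limitation described above.

```python
# Generate-and-validate loop: an external, task-specific validator checks
# the LLM's answer and asks for a retry (with feedback) when it fails.
from typing import Tuple

def generate_plan(query: str, feedback: str = "") -> str:
    """Hypothetical call to an LLM that drafts a travel plan."""
    raise NotImplementedError

def validate_plan(plan: str) -> Tuple[bool, str]:
    """Hypothetical hand-written checker for one task (e.g. travel planning).
    Returns (is_valid, explanation of what is wrong)."""
    raise NotImplementedError

def plan_with_validator(query: str, max_rounds: int = 5) -> str:
    feedback = ""
    for _ in range(max_rounds):
        plan = generate_plan(query, feedback)
        ok, feedback = validate_plan(plan)   # task-specific check
        if ok:
            return plan                      # accepted by the validator
    return plan                              # best effort after max_rounds
```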
Research on building new AI systems around these ideas is still at an early stage. For example, Bengio is exploring how to build AI systems with architectures different from today's Transformer-based ones. He has proposed an approach called generative flow networks, which aims to let a single AI system both build world models and use them for reasoning and planning.
Another major obstacle facing LLMs is their enormous appetite for data. Karl Friston, a theoretical neuroscientist at University College London, has proposed that future AI systems could become more efficient by autonomously deciding how much data to sample from the environment, rather than simply ingesting everything available. He believes this kind of autonomy may be necessary for AGI: "Such true autonomy is not yet reflected in current large language models or generative AI. If some kind of AI can achieve a degree of autonomous choice, I think that will be a critical step towards AGI."
AI systems capable of building effective world models and integrating feedback loops may also rely far less on external data. Such systems could run internal simulations, pose counterfactuals, and use them for understanding, reasoning and planning. For example, in 2018 researchers David Ha and Jürgen Schmidhuber reported [9] that they had built a neural network that efficiently constructed a world model of an artificial environment and used it to train an AI to drive a virtual racing car.
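The sketch below illustrates the general idea of planning inside a learned world model, in the spirit of that setup but not the authors' actual code: the `WorldModel.step` dynamics are an illustrative assumption, and actions are chosen by imagining rollouts rather than by collecting more real-world data.

```python
# Planning by imagination: roll candidate actions through a learned world
# model and pick the one whose simulated outcome scores best, instead of
# trying every action in the real environment.
import random

class WorldModel:
    """Stand-in for a learned model that predicts (next_state, reward)."""
    def step(self, state, action):
        # Toy dynamics purely for illustration; a real model is learned from data.
        next_state = state + action * 0.1
        reward = -abs(next_state)            # goal: steer the state towards 0
        return next_state, reward

def plan(model, state, candidate_actions, horizon=10, rollouts=20):
    best_action, best_return = None, float("-inf")
    for action in candidate_actions:
        total = 0.0
        for _ in range(rollouts):            # imagined rollouts, no real data used
            s, ret, a = state, 0.0, action
            for _ in range(horizon):
                s, r = model.step(s, a)
                ret += r
                a = random.choice(candidate_actions)   # random continuation
            total += ret
        if total > best_return:
            best_return, best_action = total, action
    return best_action

print(plan(WorldModel(), state=1.0, candidate_actions=[-1.0, 0.0, 1.0]))
```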
If you are uncomfortable with the idea of an autonomous AI system, you are not alone. Besides studying how to build AGI, Bengio actively advocates building safety into the design and regulation of AI systems. He believes research should focus on training models that can guarantee the safety of their own behavior, for example by including mechanisms that estimate the probability that the model will violate a given safety constraint and refuse to act when that probability is too high. In addition, governments need to ensure that AI is used safely. "We need a democratic process to ensure that individuals, companies and even the military use and develop AI in ways that are safe for the public."
So, is AGI even possible? Computer scientists see no reason to think otherwise. "There are no theoretical obstacles," George said. Melanie Mitchell, a computer scientist at the Santa Fe Institute, agrees: "Humans and some other animals show that it is possible. In principle, I don't think there is anything special about biological systems, compared with systems made of other materials, that would prevent non-biological systems from becoming intelligent."
Even so, there is little consensus among researchers on when AGI will arrive: predictions range from within a few years to at least a decade away. George points out that if an AGI system is created, we will recognize it by its behavior. Chollet suspects its arrival will be understated: "When AGI arrives, it may not be as obvious or as dramatic as you think. The full potential of AGI will take time to emerge; it will be invented first, then scaled up and applied, before it ultimately transforms the world."