The Evolution of AI: From Early Days to Future Trends

The Rise of AI

The emergence of ChatGPT feels like it was just yesterday. At the end of 2022, that simple chat window gave ordinary people around the world their first direct feel for the contours of “artificial intelligence.” Many still recall the shockwaves when AlphaGo defeated Go master Lee Sedol in 2016, but this time AI, built on a new large-model architecture, has arrived in an unprecedentedly “personalized” manner: it can converse, create, and act almost like a true partner.

The resulting AI boom continues today. Phenomenal products and astonishing “ChatGPT moments” keep emerging, with “already smart AI” being iterated upon by even smarter AI at a dizzying pace, infiltrating our daily lives and mental worlds. The consensus is clear: “AI will change everything.”

The sentiment in the industry is more complex. Engineers are ambitious, predicting that AI will come to perform work at a human level. However, the computing power, talent, and time required look like a bottomless pit. In the United States, the hottest investment market, AI investment is projected to reach about $670 billion this year, or 2.1% of GDP. Although this share is lower than the 7% reached during the historical British railway bubble, the growth rate is still astonishing.

Amid this surging wave, it is worth looking back at where AI came from and how a single winding path grew into today’s vast and complex road network, then traveling those roads with readers, taking in the busiest and most striking avenues and considering where they lead.

On May 10, at the Embodied Intelligence Exhibition and Application Promotion Center in Hangzhou, the G1 humanoid robot from Unitree Robotics (Yushu Technology) performed a dance. Source: Visual China

Seventy Years of Journey: AI’s “Two Falls and Three Risings”

The story of artificial intelligence can be traced back 70 years. Professor Wu Fei, an AI expert and Dean of the Undergraduate School at Zhejiang University, summarizes its development as “two falls and three risings”: “Whenever technology fails to solve real problems, it falls into a valley; once a new breakthrough is found, it quickly rises again.”

At the 1956 Dartmouth Conference, scholars like McCarthy and Minsky spent two months discussing the question: Can machines use language, form concepts, solve problems, and continuously improve themselves like humans? Although they did not reach a conclusion, they gave the field a definitive name: “Artificial Intelligence (AI).”

This was followed by a wave of optimistic explorations: the “Logic Theorist” that could prove mathematical theorems, the industrial robot Unimate, and the ancestor of chat programs, ELIZA. These early attempts showcased the potential of machines to process language and problems.

However, early computer memory was measured in kilobytes, and the mainstream “symbolist” approach relied on vast sets of hand-written rules. The complexity of reality far exceeded what anyone had imagined, and the rules could never be written out in full. When the expectation of achieving human-level intelligence within a few years fell through, criticism followed: this felt more like “playing house,” as AI could only operate in artificially designed “toy domains.” The 1969 book “Perceptrons” pointed directly at the underlying limitations of simple neural networks. Funding was quickly withdrawn, and AI entered its first winter in the 1970s.

In the valley, researchers turned to more pragmatic paths. In the 1980s, AI rose again with “expert systems.” These systems encoded the knowledge of human experts into knowledge bases and were applied successfully in specific fields such as medical diagnosis. They won favor with enterprises, and the specialized Lisp machines built to run such programs sold well for a time. Japan also invested hundreds of millions of dollars in its “Fifth Generation Computer” project, aiming to create AI machines capable of dialogue and reasoning.

“Unfortunately, ‘expert systems’ were fragile and difficult to maintain, and could not handle situations outside the rules,” Wu Fei said. The popularity of personal computers caused the market for expensive dedicated AI hardware to collapse suddenly in 1987, leading to AI’s second winter.

The setbacks allowed the field to settle: since passive knowledge infusion was limited, could machines learn proactively? “Connectionism,” centered on neural networks, quietly gained strength and brought a turning point in the 21st century. In 2006, Hinton overcame the training difficulties of deep neural networks, and in 2012, his team’s AlexNet gained fame in an image recognition competition—proving to the world that machines based on deep learning could grasp patterns from data on their own.

Algorithms, computing power, and big data jointly gave birth to the third wave of AI. In 2016, AlphaGo defeated a Go master, igniting global venture capital enthusiasm. However, this round of applications relied on small models tailored for specific tasks, emphasizing the creation of “specialists.” People continued to ponder the essence of AI; it should not just be a specialized tool replacing simple labor. A common metaphor in the industry is that AI should be like the “seven-league boots” in fairy tales, allowing one step to cover seven leagues (about 39 kilometers), significantly expanding the boundaries of human capability.

Thus, in 2017, the era of large models based on the Transformer architecture began: models with massive parameter counts were trained on vast amounts of data and demonstrated unprecedented understanding, reasoning, and generation capabilities. With generative AI represented by ChatGPT in particular, the world felt first-hand the truth of the saying “great power brings great miracles.” At the 2023 World Artificial Intelligence Conference, a technician recalled that when GPT-1 first appeared in 2018 it was seen as unconventional, and only the arrival of GPT-3.5 at the end of 2022 made everyone suddenly realize the power of “generalists.”

On April 9, the 2026 China International Medical Device Expo was held in Shanghai, showcasing products like intelligent medical devices. The image shows visitors viewing the medical AI from Damo Academy. Source: Visual China

Technical Paradigm: From “Dialogue” to “Doing”

The AI we speak of today is a system composed of models, data, and computing power. Large models are the product of these three elements reaching a certain scale, initially demonstrating the potential for “plug-and-play” generality.

Jiang Tianyi, head of technology at NetEase’s CodeWave, summarizes the evolution of large models as three leaps: from perceptual intelligence that can “see” and “hear,” to generative intelligence represented by GPT, and now to agentic intelligence. AI is moving from “a talking encyclopedia” to “a proactive housekeeper” that understands complex instructions, plans steps, and executes tasks.

“The emergence of phenomenal agents like ‘Lobster’ is the result of technological accumulation reaching a threshold. It relies on breakthroughs in underlying models and also requires engineering implementation; it is a product of ‘product-technology’ alignment,” Jiang Tianyi said. Developers’ focus has shifted from “how to write code” to “how to define problems clearly and in a structured way for AI.” As he put it, in 2022 the focus was on learning how to make AI generate, whereas now the research is on how to put “reins” on this “fast horse.”

All of this stems from the continuous improvement of machines’ ability to deduce patterns from vast data, but the initial path of “piling up computing power and scale” has encountered diminishing marginal returns.

In this context, China has taken a different step: shifting from “scaling up” to “densifying,” that is, pursuing lighter models, smarter architectures, and lower prices. It is like shifting one’s attention in a noisy venue from “listening to all the noise” to “capturing the key speeches.”

Industry competition has thus shifted from a “battle of hundreds of models” to a “marathon” that digs into scenarios and extracts value. The China Academy of Information and Communications Technology’s “AI Industry Development Research Report (2025)” points out that the number of foundational models continues to converge, with application effectiveness becoming the focus of attention. For instance, among the “Six Little Tigers” of large models, Baichuan AI (Baichuan Intelligence) now focuses on healthcare, while 01.AI (Zero One Wanwu) has shifted to customized solutions for enterprises.

As AI advances rapidly in the digital world, its “jagged intelligence” flaws are gradually being exposed: it can win gold at international mathematics competitions yet fail to read an analog clock. Even as it climbs toward the peak of human wisdom, it can still be tripped up by elementary school-level common sense.

In light of this limitation, physical AI is seen as the next key direction.

Compared to existing large models that excel at “talking,” physical AI excels at “doing.” This concept was first proposed in 2024. Earlier this year, NVIDIA founder Jensen Huang asserted that the “ChatGPT moment” for physical AI has arrived, showcasing a “vision-language-action” model that allows robots to understand vague instructions like “clean the table,” autonomously determining what to pick up and where to place it.

The shift is not limited to robots. The Genos genome model, jointly developed by BGI Life Sciences and Zhejiang Lab (Zhi Jiang Laboratory), learns directly from human gene sequences, pushing AI from assisting interpretation toward autonomous decision-making. It is “like a biological version of GPT,” said Chen Duoyuan, assistant director of BGI Life Sciences. “AI must understand the world, not just text.”

However, developing physical AI poses a core challenge: physical “experience” cannot be obtained directly from existing data and must be generated through real interaction with the environment. The JEPA architecture proposed by Turing Award winner Yann LeCun (Yang Likun) is based on this idea: letting AI grasp causal logic through active observation and interaction, much as a baby does.

Of course, physical AI does not replace existing large models but moves towards integration. There is a consensus in the industry: future AI will not only be the “thinker” of the digital world but also the “doer” of the physical world, and even the “explorer” of life’s mysteries.

Practical Applications: How to Solve Real Problems

Exciting technologies ultimately need to land in industries to generate real value.

Sequoia Capital, an early investor in OpenAI, has shifted more of its focus to vertical fields and the application layer since 2023. Its partner Roelof Botha candidly stated, “Our money is not for paying exorbitant training costs, but for investing in companies that ‘use models.’” The remark points to a clear trend: the competitive focus has shifted from “who can build the strongest model” to “who can achieve the critical leap from technology to productivity.”

Looking back domestically, a set of numbers outlines the development of AI in China by 2025: the cumulative global downloads of domestic open-source large models exceed 10 billion; the number of AI patents in China ranks first globally, accounting for about 60%… Behind this is a dual narrative of continuous technological breakthroughs and the deepening integration of AI with society.

“The U.S. focuses on closed source, while China leads in open source. This pattern encourages Chinese companies to enter ‘AI + industry’ at lower cost and higher speed,” Wu Fei said. Activity at the application end provides direct evidence. According to data from OpenRouter, the world’s largest AI model aggregation platform, China accounted for 36% of global large-model calls during one week in March this year. A report from the China Development Forum 2026 indicates that, as of March this year, China’s daily token call volume had exceeded 140 trillion, a more than thousandfold increase from 100 billion at the beginning of 2024. Industry predictions suggest that future token consumption will follow an “80-20 pattern”: about 80% from enterprises and 20% from individuals.
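The token figures above can be checked with a few lines of arithmetic. The sketch below uses only the two volumes quoted in the text. Applying the predicted 80-20 split to the current daily volume is purely an illustrative assumption, since the prediction concerns future consumption, and all variable names are hypothetical.

```python
# Daily token call volumes quoted in the article.
EARLY_2024_DAILY_TOKENS = 100 * 10**9   # 100 billion
MARCH_DAILY_TOKENS = 140 * 10**12       # 140 trillion

# How many times larger the March figure is than the early-2024 figure.
growth_factor = MARCH_DAILY_TOKENS // EARLY_2024_DAILY_TOKENS

# ASSUMPTION (illustration only): apply the predicted 80-20 split
# to the current daily volume to get a sense of scale.
enterprise_tokens = MARCH_DAILY_TOKENS * 80 // 100
individual_tokens = MARCH_DAILY_TOKENS - enterprise_tokens

print(f"growth factor: {growth_factor}x")
print(f"enterprise share: {enterprise_tokens:,} tokens/day")
print(f"individual share: {individual_tokens:,} tokens/day")
```

The ratio works out to 1,400, which is consistent with the “more than thousandfold” description.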

The prerequisite for widespread implementation is high-quality data. Data is like the “electricity” of the new era, needing to pass through the “labeling” conversion station to transform human knowledge into machine-readable forms.

China’s data output accounts for over 25% of the global total and possesses a complete industrial system. However, the issue of “data silos” is prominent: inconsistent standards and circulation barriers result in a large amount of data being “stored but unused.” Han Jian, director of the Information Software Institute at the China Academy of Information and Communications Technology, pointed out that data circulation faces the dilemma of “not daring to transmit, unwilling to transmit, and unable to transmit.”

The medical field is a microcosm. To enable AI-assisted diagnosis, doctors’ decades of experience must first be transformed into labeled data. In reality, however, hospitals often do not even recognize one another’s test results. This stems from differing standards and also from privacy and security risks. When developing the “CT multi-cancer early screening” system, experts at Damo Academy had to deploy separate servers for each partner hospital and embed the system into each hospital’s complex processes. But its value is immense. Take pancreatic cancer, known as the “king of cancers”: early lesions are difficult for the human eye to discern, yet AI is extremely sensitive to grayscale differences. The project has detected multiple extremely early-stage, treatable pancreatic cancer lesions, the smallest only 1 centimeter in size.

To address the standardization issue, many industry leaders are attempting to establish a universal standard similar to “USB” through open-source models and unified platforms, allowing developers to focus more on business innovation rather than reinventing the wheel.

So, how can AI integrate into various industries and become a universal productivity tool? Wu Fei summarizes the paths into two categories: first, “AI +”, where AI professionals enter and transform traditional industries; second, “+ AI”, where experts in various fields actively utilize AI tools to break boundaries, just as chemists use AI to predict protein structures and win Nobel Prizes.

Faced with intelligent agents like “Lobster” that can directly operate computers, Jiang Tianyi also sounds a note of caution: “Its capabilities are too strong, and its permissions too broad, which may bypass enterprise controls.” His team is therefore building an agent control platform called ClawHive, whose core principle is that “digital employees cannot bypass human supervisors.”

Reflecting on the history of technological development, there have been many cycles of “booms” and “winters.” But this time, the AI wave has truly overflowed its banks, permeating the fabric of society. The future is full of uncertainties, but one thing is certain: the story of AI has shifted from awe at “what it can do” to the practice of “how we coexist with it.” The ultimate answer is being written in every solid industrial landing and in every process of calibrating human wisdom with machine capabilities.

On March 16, 1967, in the UK, the new automated machine “Unimate X” demonstrated its functions. This machine can simulate human movements of the waist, shoulders, elbows, wrists, and fingers, able to pick up eggs and pour tea. Source: Visual China

Reporter’s Note

Expanding Human Possibilities

AI arrives in wave after wave, and it often feels like a return to the last industrial revolution, to the early 20th century, standing beside the assembly line created by Henry Ford.

The idea back then was simple: make cars affordable and accessible to ordinary families. Thus, the conveyor belt began to roll, and workers no longer needed to run back and forth; each person stayed in a fixed position, repeating a few simple actions. The complex assembly of a car was broken down into a series of standard steps. Machine tools and stamping machines roared to life, replacing the most tedious and time-consuming aspects of human labor. The results were astonishing—the time to assemble a Model T was significantly reduced, and the price of the car dropped dramatically.

In fact, so-called “black technology” is merely a reconfiguration of existing machines and processes. It is this combination that liberates workers from mechanical labor, allowing them to shift toward more intellectually demanding and valuable work. Efficiency improved, and people were “elevated” in the process.

The evolution of AI follows a similarly clear trajectory: from undertaking repetitive physical labor to handling standardized cognitive work; from enhancing efficiency to ultimately unleashing human creativity—the direction of technological evolution is always about expanding human capability boundaries.

This cannot be achieved simply by piling up technology. Real progress comes from technology addressing real problems and breaking through existing bottlenecks. It must undergo repeated filtering, validation, and iteration to evolve from an idea into a truly reusable method.

This is bound to be a gradual process. From the emergence of the first modern factory to the maturity of the assembly line, humanity took 144 years. Today’s AI is likewise a transformation that spans centuries.

Thus, the true competitiveness of vendors does not lie in how large their models’ parameter counts are but in their ability to convert technology into productivity: first doing well what can already be done, then exploring new avenues, and continuously refining what exists.

Machine Learning (ML): A multidisciplinary field that uses algorithms and statistical models to enable computer systems to learn and improve automatically from data, without being explicitly programmed for each task. It employs various methods (such as decision trees, support vector machines, clustering, and regression) to identify patterns in data, make predictions, or support decision-making. As a core component of artificial intelligence, machine learning is a crucial pathway to achieving computer intelligence and is widely applied in recommendation systems, speech recognition, financial risk control, autonomous driving, and more.

Artificial Neural Network (ANN): A computational model inspired by the neural networks of biological brains. It consists of many interconnected nodes (“neurons”), typically including an input layer, one or more hidden layers, and an output layer. Each neuron receives input signals, performs a weighted sum, processes it through an activation function, and passes it to the next layer. Neural networks possess adaptive learning capabilities, allowing them to automatically extract features from data and complete tasks such as classification and regression. As an important class of machine learning algorithms, they are widely used in image recognition, natural language processing, control systems, and more.
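The weighted-sum-then-activation step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a trained model: the sigmoid activation, the layer sizes, and the hand-picked weights are all assumptions chosen for the example, and no learning takes place.

```python
import math

def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """One layer: each neuron computes a weighted sum of the inputs
    plus a bias, then applies the activation function."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Pass the signal through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations

# Hypothetical, hand-picked parameters:
# 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden = ([[0.5, -0.5], [1.0, 1.0]], [0.0, -1.0])
output = ([[1.0, -1.0]], [0.0])

result = forward([1.0, 1.0], [hidden, output])
print(result)
```

In a real network the weights and biases would be adjusted from data during training rather than written by hand; that adjustment is the “adaptive learning” the definition refers to.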

Deep Learning (DL): An important branch of machine learning built on artificial neural networks, specifically referring to neural network models with multiple hidden layers (i.e., “deep” structures). Thanks to their depth, such models can automatically learn and extract high-level, abstract feature representations from vast amounts of data, achieving performance that approaches or even surpasses human levels in complex tasks such as image recognition, speech processing, and natural language understanding. Common deep learning models include convolutional neural networks (CNNs), recurrent neural networks (RNNs), and their variants. Deep learning has become a core driving force behind the advancement of artificial intelligence technology.