To reach artificial general intelligence, we must move beyond brute-force pattern matching toward systems that can reason, remember, and invent.
The Missing Pillars of Intelligence
We have reached a remarkable plateau in the development of artificial intelligence, but we are not yet at the summit. The current paradigm—large-scale pre-training, reinforcement learning from human feedback, and chain-of-thought reasoning—will undoubtedly be part of the final architecture for Artificial General Intelligence (AGI). However, we are still missing one or two fundamental breakthroughs. Specifically, the challenges of continual learning, long-term reasoning, and sophisticated memory remain unsolved. Today, we manage these gaps with what I call 'duct tape' solutions, such as shoving massive amounts of data into a context window.
In biological systems, the brain integrates new knowledge gracefully through processes like sleep-based consolidation. In contrast, our current machines rely on brute force. While a million-token context window is impressive, it is an inefficient proxy for true working memory. If you are processing live video over months of a user's life, even ten million tokens won't suffice. We need systems that don't just store everything, but intelligently index and retrieve what is relevant to the decision at hand. Until a model can learn and adapt in real time without forgetting its foundation, we are still working with static snapshots of intelligence.
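The shift from storing everything to indexing and retrieving only what matters can be pictured with a toy memory store. Everything below is a hypothetical sketch: the bag-of-words "embedding" is a stand-in for a learned one, and the class and method names are invented for illustration.

```python
from collections import Counter
import math

class MemoryStore:
    """Toy long-term memory: index entries, then recall only the top-k
    relevant ones instead of stuffing the whole history into context."""

    def __init__(self):
        self.entries = []  # list of (text, bag-of-words vector) pairs

    @staticmethod
    def _embed(text):
        # Stand-in for a learned embedding: a word-count vector.
        return Counter(text.lower().split())

    @staticmethod
    def _cosine(a, b):
        dot = sum(a[w] * b[w] for w in a if w in b)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def remember(self, text):
        self.entries.append((text, self._embed(text)))

    def recall(self, query, k=2):
        # Retrieve the k memories most relevant to the decision at hand.
        q = self._embed(query)
        ranked = sorted(self.entries,
                        key=lambda e: self._cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = MemoryStore()
store.remember("user prefers vegetarian recipes")
store.remember("user's meeting with Alice is on Tuesday")
store.remember("user is training for a marathon")
print(store.recall("suggest some vegetarian recipes", k=1))
```

A real system would use learned embeddings and an approximate nearest-neighbour index, but the principle is the same: relevance-ranked retrieval rather than exhaustive context.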
From Models to Active Agents
The transition from passive models to active agents is the most critical shift currently underway. At DeepMind, we have focused on agents since the beginning, using games like Go and StarCraft as tractable environments to prove that systems can accomplish goals and make plans autonomously. Today, we are seeing those early reinforcement learning philosophies return to the mainstream. The thinking modes and reasoning traces we see in modern foundation models are essentially scaled-up versions of the search and planning algorithms we pioneered with AlphaZero.
However, there is a visible gap between the hype surrounding agents and their current utility. We are in an experimental phase where many teams let swarms of agents run for dozens of hours without seeing a commensurate output. For an agent to be truly 'fire and forget,' it needs to understand the specific context of its environment. We haven't yet seen a hit video game designed entirely by an AI because the tools still lack the 'soul' and taste that a human creator provides. In the next six to twelve months, the value will shift from toy demonstrations to agents that deliver fundamental efficiency gains in professional workflows.
The Efficiency of the Edge
While the frontier requires the largest possible models to push the boundaries of capability, there is immense power in distillation. We are finding that smaller 'flash' models can often reach 90 to 95 percent of the performance of frontier models at a fraction of the cost and latency. This is not just a commercial necessity for serving billions of users; it is a strategic requirement for the future of privacy and robotics. If you want a robot in your home or an assistant on your glasses, you want a powerful local model that processes personal data on the edge, only delegating to the cloud for massive reasoning tasks.
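Distillation itself follows a well-established recipe: train the small model to match the large model's temperature-softened output distribution, so it inherits not just the right answers but the teacher's relative confidence in the wrong ones. The sketch below computes the standard soft-label (KL-divergence) distillation loss; the logits and class count are made-up numbers for illustration only.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens them."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The raised temperature exposes the teacher's relative probabilities
    over wrong answers, which is the signal the student learns from."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Illustrative logits for a 3-class problem (made-up numbers).
teacher = [4.0, 1.0, 0.5]
aligned_student = [3.9, 1.1, 0.4]
misaligned_student = [0.5, 4.0, 1.0]

print("aligned loss:", distillation_loss(teacher, aligned_student))
print("misaligned loss:", distillation_loss(teacher, misaligned_student))
```

In practice this loss is minimised by gradient descent over the student's weights, usually mixed with a standard cross-entropy term on the true labels.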
There is no theoretical limit yet to how much intelligence we can pack into these smaller architectures. We are heading toward a world where engineers can do a thousand times the work they could a decade ago, because the iteration speed of a fast, small model outweighs the slight edge of a slower, larger one. This democratization of power through open-weights models like Gemma ensures that the 'Western stack' remains competitive and that the tools of the future are in the hands of individual builders, not just a few centralized labs.
AI as the Ultimate Scientific Tool
My lifelong mission has been a two-step process: first, solve intelligence; second, use it to solve everything else. We are now entering the second phase. AlphaFold, which has been used by millions of researchers to predict protein structures, is the prototypical example of how AI can solve 'root node' problems—challenges that, once unlocked, open up entire new branches of discovery. We are now applying this same logic to materials science, climate modeling, and mathematics. The ideal problem for this approach is one with a massive combinatorial search space and a clear objective function.
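The shape of such problems (a huge discrete search space plus a clear score) can be shown with a deliberately tiny stand-in: greedy local search over bitstrings against a made-up objective. Real systems like AlphaFold use learned models rather than this naive loop; the sketch only illustrates why a clear objective function makes a combinatorial space tractable to search at all.

```python
import random

def local_search(objective, n_bits=20, iters=2000, seed=0):
    """Greedy hill-climbing over bitstrings: keep any single-bit flip
    that does not lower the objective, revert the rest."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(n_bits)]
    best = objective(state)
    for _ in range(iters):
        i = rng.randrange(n_bits)
        state[i] ^= 1                # one local move in the search space
        score = objective(state)
        if score >= best:
            best = score             # accept improving moves
        else:
            state[i] ^= 1            # revert worsening moves
    return state, best

# Made-up objective: reward bits that match a hidden target pattern.
target = [i % 2 for i in range(20)]
state, score = local_search(lambda s: sum(a == b for a, b in zip(s, target)))
print(score)  # reaches the maximum (20) on this easy landscape
```

Without a clear objective to rank candidate states, even this trivial loop has nothing to climb; with one, the same recipe scales, via far smarter search, to spaces with more configurations than atoms in the universe.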
The next grand challenge is the creation of a 'virtual cell'—a working simulation that can be perturbed to predict biological outcomes without the need for physical experimentation. We are likely ten years away from that goal, but we are starting with the nucleus. The bottleneck is no longer just compute, but the right kind of data. If we could image a live cell at nanometer resolution without destroying it, we could turn biology into a vision problem, which we already know how to solve. Until then, we must rely on building better learned simulators of these complex dynamical systems.
The Einstein Test and the Road to 2030
The ultimate test of AI creativity is what I call the 'Einstein Test.' If you train a system on the scientific knowledge available in 1901, can it independently derive special relativity by 1905? We are not there yet. Current systems are excellent at pattern matching and extrapolation, but they struggle with true analogical reasoning—the kind required to invent a game as beautiful as Go rather than just mastering its moves. To reach that level, we may need to intervene in the chain of thought, allowing the system to introspect and correct its own blunders before they manifest.
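One way to picture that kind of introspection is a propose-verify-revise loop, where a cheap check catches a blunder before it propagates down the chain. The solver and verifier below are hypothetical stubs on a trivially checkable arithmetic task, not a description of any real model's internals.

```python
def flawed_solver(problem, attempt):
    """Hypothetical stand-in for one chain-of-thought step: it blunders
    on the first attempt (off by one) and corrects on retry."""
    a, b = problem
    return a + b + (1 if attempt == 0 else 0)

def verify(problem, answer):
    # Cheap check the system can run on its own intermediate result.
    a, b = problem
    return answer == a + b

def solve_with_introspection(problem, max_attempts=3):
    """Propose an answer, verify it, and retry if the check fails."""
    for attempt in range(max_attempts):
        answer = flawed_solver(problem, attempt)
        if verify(problem, answer):   # catch the blunder before it manifests
            return answer
    return None

print(solve_with_introspection((17, 25)))  # → 42
```

The hard open problem, of course, is that most reasoning steps have no verifier as clean as arithmetic; building learned critics that play this role is an active research direction.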
For those building today, the timeline is the most important variable. I anticipate AGI appearing around 2030. If you are starting a ten-year deep tech journey today, you must assume that AGI will emerge in the middle of your project. This is not a threat, but an opportunity. You should be building systems that can leverage a general-purpose 'brain' as a tool-user. The most defensible companies will be those that combine AI with the world of atoms—medicine, materials, and energy—where there are no shortcuts. If you have conviction in a hard problem, the difficulty is exactly what makes the pursuit worthwhile.