The Superhuman Learner: Why Thinking Machines is Betting Against OpenAI's Scaling Strategy

Akram Chauhan

In the high-stakes race to build Artificial General Intelligence (AGI), the unofficial motto seems to be "go big or go home." Tech giants like OpenAI, Google, and Anthropic are locked in an arms race, pouring billions of dollars into a simple, powerful idea: that scaling up models, data, and computing power will eventually crack the code of true intelligence. It's a brute-force approach that has, admittedly, yielded some incredible results.

But what if they're all climbing the wrong mountain?

That's the provocative question being posed by Thinking Machines Lab, one of the industry's most valuable and secretive startups. In a recent talk, researcher Rafael Rafailov threw a wrench in the works of the prevailing AI orthodoxy. He argued that the industry has it backward. The secret to AGI isn't about training bigger models; it's about teaching them how to learn better.

Your AI Coding Assistant Has Amnesia

If you've ever worked with a sophisticated AI coding assistant, you've probably felt a strange sense of déjà vu. You spend an hour guiding it, correcting it, and helping it understand the intricacies of your codebase to implement a complex feature. It might even succeed. You log off feeling like you've just onboarded a new team member.

Then you come back the next day, ask it to build the next feature, and... nothing. It has no memory of the previous day's struggle. It makes the same mistakes, asks the same questions, and has to re-learn the entire context from scratch.

As Rafailov puts it, "for the models we have today, every day is their first day of the job." This isn't just an annoying quirk; it's a symptom of a fundamental flaw. Today's AI systems are trained, not taught. They are powerful pattern-matching machines, but they don't internalize experience. A human engineer gets better over time. They build a mental model of the project, learn the shortcuts, and anticipate problems. Our AI tools, for all their power, remain stuck in a perpetual state of Day One.

Kicking the Can: How We're Training AI to Be Lazy

The problem goes deeper than just forgetfulness. Our current training methods actively encourage AI to take shortcuts rather than solve problems robustly. Rafailov points to a classic example that programmers will instantly recognize: the dreaded try/except block.

Often, when a coding agent is unsure whether a piece of code will work, it wraps it in a broad try/except block that catches every error and discards it. In effect, this tells the program, "Try to run this code, but if it breaks, just ignore the error and move on." It's the software equivalent of putting duct tape over a warning light on your car's dashboard.
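
To make the duct tape concrete, here is a minimal Python sketch. The choice of Python, the function names, and the JSON example are all assumptions made for illustration; none of them come from the talk. The first function swallows every error, while the second catches only the failure it expects and reports it.

```python
import json


def load_config_swallowing(text):
    """The 'duct tape' pattern: any failure is silently discarded."""
    try:
        return json.loads(text)
    except Exception:
        # The error is hidden; the caller cannot tell that parsing failed.
        return {}


def load_config_strict(text):
    """A sturdier alternative: catch only the expected error and surface it."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # Re-raise with context so the root cause stays visible.
        raise ValueError(f"config is not valid JSON: {err}") from err


broken = '{"retries": 3,'  # truncated JSON, hypothetical input

print(load_config_swallowing(broken))  # an empty dict; the failure is invisible

try:
    load_config_strict(broken)
except ValueError as err:
    print("strict loader reported:", err)
```

Both functions behave identically on valid input. The difference only shows up when something goes wrong, which is exactly when it matters.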

Why does it do this? Because the AI's only goal, its entire reason for being, is to complete the immediate task you gave it. It "understands" that a particular line of code is risky, but it has a limited amount of time and interaction to get the job done. Fixing the root cause is hard. Slapping on some duct tape and moving on is easy.

This "kicking the can down the road" behavior is a direct result of how we train these systems. We reward them for one thing and one thing only: solving the task at hand. Any effort spent on general understanding, robust solutions, or building foundational knowledge is, in the eyes of the training algorithm, a "waste of computation." We're optimizing for short-term wins, not long-term intelligence.
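
A toy illustration of that incentive, and not a description of any lab's actual training setup: if the reward depends only on whether the task was solved, a careful fix and a duct-taped one score identically. The Episode fields and the function name below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    task_solved: bool
    errors_suppressed: int  # hypothetical count of swallowed exceptions


def task_only_reward(episode):
    """Reward that looks at task completion and nothing else."""
    return 1.0 if episode.task_solved else 0.0


robust_fix = Episode(task_solved=True, errors_suppressed=0)
duct_taped = Episode(task_solved=True, errors_suppressed=4)

# Both episodes earn the same reward, so the optimizer has no reason
# to prefer the robust solution over the shortcut.
print(task_only_reward(robust_fix) == task_only_reward(duct_taped))
```

Under a signal like this, the extra effort behind the robust fix is invisible to the training algorithm.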

Why Throwing Billions More at AI Won't Create AGI

This is where Rafailov delivers his most direct challenge to the industry's titans. He doesn't believe we're hitting a wall with scaling. In fact, he thinks we're on the cusp of a new paradigm where AI agents become incredibly capable at browsing the web, writing code, and interacting with the world.

In a year or two, he predicts, we'll look back at today's coding assistants the way we now look at the clunky translation models of a few years ago. They will get much, much better. But (and this is the multi-billion-dollar "but") he argues that general agency is not the same as general intelligence.

The real question, he poses, is whether one more round of scaling, one more massive dataset, and one more supercomputer cluster will finally get us to AGI. His answer is a firm "no."

He believes that under our current paradigms, no amount of scale will be enough. Our models will still lack the one core capability that defines true intelligence: the ability to learn.

Teaching AI to Learn: The Textbook Approach to True Intelligence

So, if the current method is broken, what's the alternative? Rafailov uses a brilliant analogy from education.

Think about how we train a math model today. We give it an incredibly hard problem. We reward it if it gets the right answer. Once it submits its solution, we throw away everything it discovered in the process—any new techniques, any clever abstractions, any theorems it might have derived. Then we give it a brand new problem, and it has to start from zero.
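
The cost of starting from zero can be felt even in a toy setting. The sketch below is only an analogy, not a model of how neural networks are trained: a hypothetical solver computes Fibonacci numbers and counts how many values it has to work out. In one run it begins every problem with an empty notebook; in the other it keeps what it has already derived.

```python
def solve(n, known):
    """Return (fib(n), steps). `known` maps i -> fib(i) and is updated in place."""
    steps = 0

    def fib(i):
        nonlocal steps
        if i in known:
            return known[i]  # reuse an earlier result at no cost
        steps += 1  # count each value that must be derived
        value = i if i < 2 else fib(i - 1) + fib(i - 2)
        known[i] = value
        return value

    return fib(n), steps


problems = [10, 12, 15]

# Fresh notebook for every problem: earlier discoveries are thrown away.
stateless_steps = sum(solve(n, {})[1] for n in problems)

# One notebook shared across problems: earlier discoveries carry over.
shared = {}
stateful_steps = sum(solve(n, shared)[1] for n in problems)

print(stateless_steps)  # 40
print(stateful_steps)   # 16
```

The second solver is no smarter on any single problem. It simply never has to rediscover what it already knows.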

This is, as he points out, "not how science or mathematics works." Humans build on knowledge. We develop entire fields like topology not just to solve one specific problem, but because we recognize the concepts are fundamentally important and will help us solve a whole class of future problems.

Here's the proposed solution: instead of giving a model a single, isolated problem, give it a textbook.

Imagine asking an AI to work through a graduate-level physics textbook. It would start with chapter one, work through the exercises, and build a foundation. Then it would move to chapter two, using what it learned in chapter one to tackle more complex ideas.

The objective would shift entirely. Instead of rewarding the AI for how many problems it solved, we would reward its progress. We'd reward its ability to learn, to build abstractions, and to improve its own understanding over time. This concept, known as "meta-learning" or "learning to learn," is the cornerstone of the Thinking Machines philosophy.
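
One way to picture the shift in objective is to compare two scoring rules on the same record of chapter results. This is a sketch under stated assumptions, not the objective Thinking Machines uses: the function names are hypothetical, scores are the percentage of exercises solved per chapter, and chapters are assumed to be comparable in difficulty.

```python
def solved_count_reward(chapter_scores):
    """Conventional objective: total score across all chapters."""
    return sum(chapter_scores)


def progress_reward(chapter_scores):
    """Hypothetical objective: sum of chapter-over-chapter improvements."""
    pairs = zip(chapter_scores, chapter_scores[1:])
    return sum(later - earlier for earlier, later in pairs)


flat_learner = [80, 80, 80]       # starts strong, never improves
improving_learner = [40, 60, 90]  # starts weak, keeps getting better

print(solved_count_reward(flat_learner), solved_count_reward(improving_learner))  # 240 190
print(progress_reward(flat_learner), progress_reward(improving_learner))          # 0 50
```

The conventional rule favors the learner that never improves, while the progress rule favors the one that does. A real objective would need to be much more careful, for instance about chapters that get harder, but the contrast is the point.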

The Missing Ingredient for AGI Isn't What You Think

You might assume that creating an AI that can learn requires a revolutionary new model architecture or some sci-fi neural network design. According to Rafailov, the answer is surprisingly "prosaic."

He believes the core engineering and architectural pieces are largely in place. The real bottlenecks are much simpler:

  1. We don't have the right data. We need datasets and environments designed to foster learning, not just task completion.
  2. We don't have the right objectives. We need to reward progress and self-improvement, not just correct answers.

Learning, he explains, is just another algorithm. It takes the current state of the model, processes new data, and produces a stronger, more capable model. If our current AIs can learn general reasoning algorithms from text and code, why can't the next generation learn a learning algorithm itself?
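
Read literally, that description is a function signature: current model plus new data in, better model out. Here is a minimal sketch using plain gradient descent on a one-parameter model. The technique, the names, and the data are assumptions chosen for illustration, not anything described in the talk.

```python
def loss(weight, batch):
    """Mean squared error of the model y = weight * x on a batch of (x, y) pairs."""
    return sum((weight * x - y) ** 2 for x, y in batch) / len(batch)


def learn(weight, batch, lr=0.1):
    """One learning step: take the current model and data, return an updated model."""
    grad = sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)
    return weight - lr * grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # hypothetical data following y = 2x

weight = 0.0
initial_loss = loss(weight, batch)
for _ in range(20):
    weight = learn(weight, batch)

print(round(weight, 3))                   # close to 2.0
print(loss(weight, batch) < initial_loss)  # the updated model fits better
```

Here the `learn` function is written by hand. The idea in the article is that a model could come to implement its own version of such a function, discovered through training and not designed by an engineer.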

The goal is to create training environments where adaptation, exploration, and self-improvement are necessary for success. If we can do that, Rafailov believes that general-purpose learning algorithms will naturally emerge from large-scale training. We could teach models how to learn in the same way we currently teach them how to reason.

Meet the Superhuman Learner: The Future of Superintelligence

This vision leads to a radically different picture of what the first superintelligence might look like. It won't be a single, god-like oracle that can solve any math problem instantly. It won't be a disembodied brain that's a perfect reasoner.

Instead, Rafailov believes "the first superintelligence will be a superhuman learner."

Imagine a system whose core drive is to explore, acquire information, and self-improve. Equip it with the ability to use computers, conduct research, and interact with the physical world through robotics. This entity wouldn't just answer questions; it would formulate its own theories, design experiments to test them, and relentlessly iterate on its own understanding of the universe. It would be less of an omniscient god and more of a master student with an insatiable curiosity and an infinite capacity to grow.

This is the $12 billion bet that Thinking Machines is making. It's a longer, harder, and far less certain path than the one its competitors are on. The company faces immense pressure, including a recent "full-scale raid" by Meta that poached a co-founder and other key talent. Yet it seems committed to this differentiated approach.

Rafailov is refreshingly humble about the timeline. In an industry filled with bold predictions of AGI arriving in months, he offers none. He simply states that the task is "very difficult" and will require breakthroughs, but that it's "fundamentally possible."

That quiet conviction might be the most telling detail of all. It suggests that while the rest of the world is racing to build a bigger ladder, Thinking Machines is convinced they're all leaning it against the wrong wall. The real prize isn't building the most powerful thinking machine, but the one that never, ever stops learning.

© 2026 Aicosoft. All rights reserved.