It feels like every week there’s a new "game-changing" AI model, right? The announcements come so fast it’s easy to get numb to them. But every once in a while, one drops that makes you sit up and really pay attention.
This is one of those times.
Anthropic just unveiled its latest model, Claude Opus 4.5, and it’s not just another incremental update. They’ve managed to do three things at once that are a huge deal for anyone building with or using AI: they made it dramatically cheaper, way more efficient, and… well, it just outscored every human candidate who has ever taken the company’s engineering hiring exam.
Yeah, you read that right. Let’s get into what this actually means.
The Price Just Dropped Through the Floor
First, let's talk money, because this is a massive change.
Anthropic slashed the price of its top-tier model by about two-thirds. We're talking a drop from $15 down to $5 for input tokens, and from $75 down to $25 for output tokens (per million).
If you’re not a developer, those numbers might not mean much. Think of it like this: imagine the most powerful machine you can rent suddenly costing a third of what it did last month. It changes who can afford to use it and what they can build with it. This move puts incredible pressure on rivals like OpenAI and Google and makes truly powerful AI accessible to way more startups and businesses.
As Alex Albert, Anthropic's head of developer relations, put it, "We want to make sure this really works for people who want to work with these models." It’s a clear signal they’re not just chasing benchmark scores; they’re trying to get this technology into more hands.
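If you want to see what the price cut does to an actual bill, here's a quick back-of-the-envelope sketch. The per-million prices are the ones quoted above; the workload (2 million input tokens, 500,000 output tokens) is made up purely for illustration, and `job_cost` is just a helper name I picked.

```python
# Prices in USD per million tokens, taken from the figures quoted above.
OLD_PRICE = {"input": 15.00, "output": 75.00}
NEW_PRICE = {"input": 5.00, "output": 25.00}


def job_cost(input_tokens, output_tokens, price):
    """Return the USD cost of a job given token counts and per-million prices."""
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
old = job_cost(2_000_000, 500_000, OLD_PRICE)
new = job_cost(2_000_000, 500_000, NEW_PRICE)

print(f"old: ${old:.2f}, new: ${new:.2f}, saved: {1 - new / old:.0%}")
# prints "old: $67.50, new: $22.50, saved: 67%"
```

Because both the input and output prices fell by the same fraction, the saving works out to two-thirds no matter how your traffic splits between the two.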
So, It Beat Every Human on a Coding Test?
Okay, let's get to the headline-grabber. Anthropic has a notoriously difficult take-home exam they give to prospective performance engineers. It's a two-hour, high-pressure test designed to separate the good from the great.
Opus 4.5 didn't just pass. It scored higher than any human candidate in the company's history.
Now, before we all panic, let's be clear. The company is quick to point out that this test doesn't measure everything. It doesn't test for collaboration, communication, or the kind of gut instinct you get from years of experience. But it does measure pure technical skill and judgment under pressure.
And the AI aced it.
This is a pretty wild milestone. As Albert said, "I think this is kind of a sign, maybe, of what's to come... it's a really important signal to pay attention to." It’s one thing to hear about AI passing a bar exam; it’s another to see it outperform top-tier specialists in their own highly technical field.
And it’s not just on their internal tests. On a public benchmark for real-world software engineering tasks (called SWE-bench Verified), Opus 4.5 hit 80.9% accuracy, pulling ahead of OpenAI’s latest and Google’s Gemini 3 Pro.
It's Not Just Smarter, It Has Better 'Judgment'
Benchmarks are one thing, but what does it feel like to use? This is where things get really interesting.
People inside Anthropic who have been testing it say the model has taken a qualitative leap forward. It’s not just about spitting out correct answers; it’s about having a better sense of what actually matters in a real-world context.
Albert described it perfectly: "The model just kind of gets it."
He shared a personal example. Before, he’d use AI to gather information, but he’d never trust it to synthesize or prioritize that info. Now, with Opus 4.5, he’s connecting it to his Slack and internal documents and letting it produce entire summaries, trusting that it understands his priorities. That’s a huge shift from "AI as a search engine" to "AI as a genuine assistant."
Doing More with Way, Way Less
On top of being smarter and cheaper, Opus 4.5 is also incredibly efficient. It uses far fewer "tokens"—the little pieces of text AI models process—to get the job done.
Here’s a crazy stat: at a medium effort level, Opus 4.5 can match the performance of the previous model while using 76% fewer output tokens. That’s like making the drive from New York to L.A. on roughly a quarter of the gas.
This efficiency has a real-world impact. Michele Catasta, president of the coding platform Replit, said, "Opus 4.5 beats... competition on our internal benchmarks, using fewer tokens to solve the same problems. At scale, that efficiency compounds."
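Here's a rough sketch of what "76% fewer output tokens" looks like on a bill. The 76% figure and the $25 output price come from the article; the monthly volume of 10 million output tokens is a hypothetical number, and this assumes the reduction applies evenly across your whole workload, which real traffic may not do.

```python
OUTPUT_PRICE_PER_MILLION = 25.00  # USD, the new output price quoted earlier
TOKEN_REDUCTION = 0.76            # "76% fewer output tokens" at medium effort


def output_cost(tokens, price_per_million=OUTPUT_PRICE_PER_MILLION):
    """Return the USD cost of generating the given number of output tokens."""
    return tokens * price_per_million / 1_000_000


baseline_tokens = 10_000_000  # hypothetical monthly output volume
efficient_tokens = round(baseline_tokens * (1 - TOKEN_REDUCTION))

print(f"tokens: {baseline_tokens:,} -> {efficient_tokens:,}")
print(f"cost: ${output_cost(baseline_tokens):.2f} -> ${output_cost(efficient_tokens):.2f}")
```

The token savings stack on top of the lower per-token price, which is why the efficiency number matters as much as the price cut.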
They've even introduced a new "effort parameter," which lets developers decide how much computational power to throw at a problem. You can crank it up for a complex task or dial it down for something simple, giving you more control over the balance between cost, speed, and performance.
An AI That Can Teach Itself?
This might be the most mind-bending part of the announcement. Early customers are seeing what Anthropic calls "self-improving agents."
Let me explain. The AI isn't fundamentally rewriting its own brain. Instead, it’s learning from experience on a task. The Japanese e-commerce giant Rakuten saw this firsthand. They put Opus 4.5 to work on automating office tasks, and their AI agents were able to "autonomously refine their own capabilities," hitting peak performance in just four tries. Other models couldn't get there even after ten.
Basically, the model tries something, sees the result, and then adjusts its approach for the next attempt. It's iteratively getting better at a skill, which is a huge step toward more autonomous AI systems that can actually learn on the job.
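To make that try, check, adjust loop concrete, here's a toy sketch. To be clear, this is not how Rakuten's agents or Opus 4.5 work internally; it only shows the general shape of an agent that carries feedback from one attempt into the next. The names `refine`, `toy_attempt`, and `toy_evaluate` are all made up, and the "task" is a simple number-guessing game standing in for real work.

```python
from typing import Callable, List, Tuple


def refine(attempt: Callable[[List[str]], str],
           evaluate: Callable[[str], Tuple[float, str]],
           target: float,
           max_tries: int = 10) -> Tuple[str, float, int]:
    """Repeat attempt -> evaluate -> take notes until the score reaches target."""
    notes: List[str] = []  # feedback carried from one try to the next
    best, best_score = "", float("-inf")
    for tries in range(1, max_tries + 1):
        result = attempt(notes)
        score, feedback = evaluate(result)
        if score > best_score:
            best, best_score = result, score
        if score >= target:
            return best, best_score, tries
        notes.append(feedback)
    return best, best_score, max_tries


def toy_attempt(notes: List[str]) -> str:
    """Stand-in for the agent: narrow a guess using earlier feedback."""
    lo, hi = 0, 15
    for note in notes:
        guess, hint = note.split(":")
        if hint == "higher":
            lo = int(guess) + 1
        else:
            hi = int(guess) - 1
    return str((lo + hi) // 2)


def toy_evaluate(result: str) -> Tuple[float, str]:
    """Stand-in for checking the work: score the guess and say which way to go."""
    guess = int(result)
    if guess == 11:
        return 1.0, f"{guess}:done"
    return 0.0, f"{guess}:{'higher' if guess < 11 else 'lower'}"


answer, score, tries = refine(toy_attempt, toy_evaluate, target=1.0)
print(answer, score, tries)  # prints "11 1.0 2"
```

The interesting part is the `notes` list: nothing about the agent itself changes between tries, only what it knows about its earlier results.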
New Tools for the Rest of Us
Alongside the new model, Anthropic rolled out a bunch of practical updates:
- Claude for Excel: Now widely available, it can help you create pivot tables, charts, and more, right inside your spreadsheet.
- Infinite Chats: This is a big one. You no longer have to worry about a conversation getting too long and the AI "forgetting" what you talked about earlier. It automatically summarizes the chat as you go, creating a seemingly infinite memory.
- Better Developer Tools: They’ve also added features that let the AI call code directly and improved the coding environment to let you run multiple AI agents at once.
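For the curious, here's a minimal sketch of the idea behind the Infinite Chats item: keep the most recent messages word for word and fold everything older into a summary. Anthropic hasn't published the details in this announcement, so treat this as an illustration only. The `compact` function, the message-count thresholds, and `toy_summarize` (a placeholder where a real system would presumably call a model) are all my own assumptions.

```python
from typing import Callable, List


def compact(history: List[str],
            summarize: Callable[[List[str]], str],
            max_messages: int = 6,
            keep_recent: int = 3) -> List[str]:
    """Fold older messages into one summary line once history gets too long."""
    if len(history) <= max_messages:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent


def toy_summarize(messages: List[str]) -> str:
    """Placeholder summary; a real system would condense the actual content."""
    return f"[summary of {len(messages)} earlier messages]"


history: List[str] = []
for i in range(1, 9):
    history.append(f"message {i}")
    history = compact(history, toy_summarize)

print(history)
# ['[summary of 4 earlier messages]', 'message 5', 'message 6', 'message 7', 'message 8']
```

The trade-off is that a summary keeps the gist but drops detail, so the memory feels endless rather than being a perfect transcript.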
The AI Race Just Got a Lot More Interesting
Let’s zoom out for a second. This release doesn't happen in a vacuum. OpenAI is pushing out new GPT-5 variants, and Google just shipped Gemini 3. The competition is fierce.
Anthropic is moving at a breakneck pace, and they admit they’re using Claude to help them build the next version of Claude, which is accelerating everything.
For you and me, this AI arms race is mostly a good thing. It means more powerful models, falling prices, and a constant stream of new capabilities. But that result from the engineering test... that’s something else. It’s a clear sign that the impact of AI on professional, white-collar work is no longer a theoretical, "down the road" conversation.
It's happening right now. And as Anthropic’s own team suggests, it’s a really important signal to pay attention to.




