Have you ever tried to work in a kitchen with every single pot, pan, and utensil you own cluttered on the countertop? It’s a nightmare. You can’t find the one spatula you need, and the sheer chaos makes it hard to even think about cooking.
Well, our AI agents are facing the exact same problem.
We're building these incredibly powerful agents and connecting them to all sorts of tools—GitHub, Jira, Slack, you name it. But here's the catch: every time we talk to the agent, it has to look at the "instruction manual" (the JSON schema) for every single tool we've given it. Even if it only needs one.
This is causing a massive, and expensive, traffic jam in the AI's "brain," or what we call the context window. And it’s not just a minor inconvenience; it's a huge bottleneck that's costing us money and making our agents less accurate.
But the team at Nous Research just shipped a clever solution in their open-source Hermes Agent called Tool Search, and honestly, it’s a bigger deal than it sounds. Let's break down what it is and why it's making such a difference.
The "MCP Tools Tax": Why Your AI Agent is Overwhelmed
When you hook up multiple tool servers (using the Model Context Protocol, or MCP) to an agent, you’re essentially dumping that entire cluttered kitchen countertop into its workspace for every single task.
Think about it. A real-world Hermes setup with just five servers and about 34 tools can easily rack up prompt sizes of 45,000 tokens. And the shocking part? About half of that—around 22,000 tokens—is just the overhead from the tool instructions alone. Anthropic’s own data shows that tool definitions can eat up a staggering 134,000 tokens before you even try to optimize.
This "MCP Tools Tax," as some are calling it, creates two major headaches:
- It gets expensive, fast. Every time the agent has to process all that extra information, especially at the start of a session, it can cost you anywhere from $0.07 to $0.10 per turn. That adds up quickly.
- It makes the AI dumber. This is the really crucial part. When a model sees hundreds of tools, most of which are irrelevant to the current task, it gets confused. It’s called "decision paralysis." The model struggles to pick the right tool from the sea of options, leading to mistakes and lower accuracy.
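To get a feel for where a figure like $0.07 per turn can come from, here is a back-of-the-envelope sketch. The price per million input tokens below is an assumed, illustrative number, not a quote from any provider, and it ignores prompt caching.

```python
# Rough cost of re-sending tool schemas on one uncached turn.
# PRICE_PER_MILLION_USD is an ASSUMED figure for illustration only.
SCHEMA_TOKENS = 22_000          # tool-definition overhead from the setup described above
PRICE_PER_MILLION_USD = 3.00    # hypothetical input-token price


def schema_cost_per_turn(schema_tokens: int, price_per_million: float) -> float:
    """Dollar cost of the schema tokens alone for a single turn."""
    return schema_tokens / 1_000_000 * price_per_million


cost = schema_cost_per_turn(SCHEMA_TOKENS, PRICE_PER_MILLION_USD)
print(f"${cost:.3f} per turn")
```

Swap in your own model's input price to see what the overhead costs in your setup.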
So, how do we clean up the countertop without throwing away the tools?
Meet Tool Search: Giving the AI What It Needs, When It Needs It
Tool Search is Hermes Agent’s elegant answer to this mess. It’s an opt-in feature that acts as a smart librarian for your AI's tools.
Instead of shoving the entire library of tool manuals at the model all at once, Tool Search hides them away. It replaces the giant list of tools with just three simple "bridge" tools:
- tool_search(query, limit?) - Lets the model search for a tool it needs.
- tool_describe(name) - Lets the model request the specific instruction manual for one tool.
- tool_call(name, arguments) - Lets the model actually use the tool it found.
The whole interaction becomes a clean, logical sequence of three calls. Imagine you ask the agent to create a new issue on GitHub. Here’s how it plays out with Tool Search:
- The Model Asks: First, the model uses tool_search("create a github issue"). It's essentially asking the librarian, "Hey, I need something for making a GitHub issue. What do you have?"
- The Librarian Responds: The system returns a match, like { name: "mcp_github_create_issue" }.
- The Model Reads the Manual: Now that it knows the exact tool, it uses tool_describe("mcp_github_create_issue") to pull up the specific instructions (the JSON schema) for only that tool.
- The Model Gets to Work: Finally, it uses tool_call("mcp_github_create_issue", { title: "...", body: "..." }) to execute the task.
See how clean that is? The model only ever looks at the one tool it actually needs for the job. The rest of the clutter stays off the counter. And don't worry, all the important safety features like guardrails and approval prompts still work perfectly on the real tool underneath.
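The exchange above can be sketched in a few lines of Python. This is a toy illustration, not Hermes code: the CATALOG dictionary, the stand-in handlers, and the word-overlap search are all assumptions made for the example. Only the three bridge tool names and the mcp_github_create_issue tool name come from the walkthrough.

```python
from typing import Any


def _create_issue(title: str, body: str = "") -> dict[str, Any]:
    """Stand-in for the real GitHub tool; it only echoes its input."""
    return {"status": "created", "title": title, "body": body}


def _post_message(channel: str, text: str) -> dict[str, Any]:
    """Stand-in for a Slack tool (hypothetical)."""
    return {"status": "sent", "channel": channel}


# Hypothetical catalog: tool name -> description, JSON schema, and handler.
CATALOG: dict[str, dict[str, Any]] = {
    "mcp_github_create_issue": {
        "description": "Create a new issue in a GitHub repository",
        "schema": {
            "type": "object",
            "properties": {"title": {"type": "string"}, "body": {"type": "string"}},
            "required": ["title"],
        },
        "handler": _create_issue,
    },
    "mcp_slack_post_message": {
        "description": "Post a message to a Slack channel",
        "schema": {
            "type": "object",
            "properties": {"channel": {"type": "string"}, "text": {"type": "string"}},
            "required": ["channel", "text"],
        },
        "handler": _post_message,
    },
}


def tool_search(query: str, limit: int = 5) -> list[dict[str, str]]:
    """Rank tools by how many query words appear in their name or description."""
    words = set(query.lower().split())
    hits = []
    for name, entry in CATALOG.items():
        text = f"{name.replace('_', ' ')} {entry['description']}".lower()
        overlap = len(words & set(text.split()))
        if overlap:
            hits.append((overlap, name))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [{"name": name} for _, name in hits[:limit]]


def tool_describe(name: str) -> dict[str, Any]:
    """Return the JSON schema for exactly one tool."""
    return CATALOG[name]["schema"]


def tool_call(name: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Dispatch to the real tool underneath."""
    return CATALOG[name]["handler"](**arguments)


# The model's three calls, in order.
match = tool_search("create a github issue", limit=1)[0]["name"]
schema = tool_describe(match)
result = tool_call(match, {"title": "Fix login bug", "body": "Steps to reproduce..."})
print(match, schema["required"], result["status"])
```

Note that the model only ever receives one schema here, no matter how many entries the catalog holds.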
The Results? A Huge Leap in Accuracy
This isn't just a neat trick to save a few tokens. The impact on performance is genuinely stunning. By removing all that noise, the model can think more clearly.
Anthropic ran internal evaluations of its own tool search feature, which follows the same approach, and the numbers speak for themselves:
- Claude Opus 4: Accuracy jumped from a mediocre 49% all the way up to 74% with Tool Search enabled. That's a massive improvement.
- Claude Opus 4.5: The more advanced model saw its accuracy climb from an already solid 79.5% to an impressive 88.1%.
These results suggest that "decision paralysis" is a very real problem. When you take away the hundreds of irrelevant options, the model is far less likely to make a mistake. On top of that, Anthropic saw an 85% reduction in the number of tokens used for tool definitions. You get a smarter and cheaper agent.
How Does the Search Actually Work?
So, how does the agent's "librarian" find the right book so effectively?
Under the hood, Hermes uses a classic and reliable information retrieval algorithm called BM25. It takes the model’s search query (like "create a github issue") and intelligently matches it against the names, descriptions, and even the parameter names of all the available tools.
But what if the search term is too common? For instance, what if you search for "github" and almost every tool has "github" in its name? BM25 down-weights terms that appear in nearly every entry, so a query like that can score close to zero across the board. To handle these edge cases, the system has a fallback: if BM25 doesn’t find any good matches, it does a plain substring search for the query within the tool names. It’s a simple but effective safety net.
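Here is a minimal sketch of that idea: textbook BM25 over each tool's name, description, and parameter names, with a substring fallback on tool names. It is not the Hermes implementation. The K1 and B values, the clamping of negative IDF to zero, the tokenizer, and the sample tools are all assumptions for illustration.

```python
import math
import re
from collections import Counter

K1, B = 1.5, 0.75  # common BM25 defaults; the values Hermes uses are not stated here


def tokenize(text: str) -> list[str]:
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: dict[str, str]) -> dict[str, float]:
    """Score every document against the query with classic BM25."""
    if not docs:
        return {}
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs
    doc_freq = Counter(term for t in tokens.values() for term in set(t))
    query_terms = set(tokenize(query))
    scores = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            n = doc_freq[term]
            # Classic IDF goes negative for very common terms; clamp to zero (assumption).
            idf = max(0.0, math.log((n_docs - n + 0.5) / (n + 0.5)))
            norm = tf[term] + K1 * (1 - B + B * len(toks) / avg_len)
            score += idf * tf[term] * (K1 + 1) / norm
        scores[name] = score
    return scores


def search(query: str, docs: dict[str, str], limit: int = 5) -> list[str]:
    """BM25 first; if nothing scores above zero, fall back to substring match on names."""
    scores = bm25_scores(query, docs)
    ranked = sorted((n for n, s in scores.items() if s > 0), key=lambda n: (-scores[n], n))
    if not ranked:
        needle = query.lower()
        ranked = sorted(n for n in docs if needle in n.lower())
    return ranked[:limit]


# Hypothetical tools: each value joins the name, description, and parameter names.
TOOLS = {
    "mcp_github_create_issue": "mcp_github_create_issue Create a new issue title body",
    "mcp_github_list_pulls": "mcp_github_list_pulls List open pull requests state",
    "mcp_github_merge_pull": "mcp_github_merge_pull Merge a pull request number",
}

print(search("create issue", TOOLS))  # BM25 path
print(search("github", TOOLS))        # every tool contains "github", so the fallback runs
```

In this toy catalog, "github" appears in all three entries, so its clamped IDF is zero and the name-based fallback returns every tool instead of an empty result.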
One other clever design choice: the tool catalog is rebuilt from scratch on every single turn. This might sound inefficient, but it’s a brilliant way to prevent bugs where the stored catalog gets out of sync with the tools that are actually available. It’s always fresh.
You Don't Even Have to Think About It
Here’s the best part. By default, Tool Search runs in auto mode.
This means Hermes Agent is constantly monitoring the situation. It will only activate Tool Search if the tool schemas are taking up more than 10% of the model's available context window. If you only have a few tools connected, it won't do anything—it just passes them through directly so you don't have any unnecessary overhead.
It’s completely self-tuning. If you start a session with a ton of tools, Tool Search will kick in. If you remove some servers mid-session, it will automatically switch back to the direct method on the next turn. You get the benefits exactly when you need them, without having to micromanage a thing.
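The auto-mode decision boils down to one comparison per turn. The function below is a hypothetical sketch of that check, not the actual Hermes code; the function name, its parameters, and the 200,000-token context window in the usage lines are assumptions.

```python
def should_use_tool_search(
    mode: str,
    schema_tokens: int,
    context_window: int,
    threshold_pct: float = 10,
) -> bool:
    """Decide, for the current turn, whether to hide tools behind the bridge tools."""
    if mode == "on":
        return True
    if mode == "off":
        return False
    if mode != "auto":
        raise ValueError(f"unknown mode: {mode!r}")
    # Auto: activate only when schemas exceed the configured share of the context window.
    return schema_tokens > context_window * threshold_pct / 100


# Re-evaluated every turn, so removing servers mid-session flips the result.
print(should_use_tool_search("auto", 22_000, 200_000))  # 22,000 > 20,000
print(should_use_tool_search("auto", 5_000, 200_000))   # 5,000 <= 20,000
```

Because the check runs on every turn with the current tool set, there is no state to reset when servers come and go.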
Setting It Up Yourself
If you want to tweak the behavior, it's super easy. Just add a few lines to your hermes.yaml file:
tools:
  tool_search:
    enabled: auto # You can set this to 'on', 'off', or 'auto' (the default)
    threshold_pct: 10 # The % of context window that triggers 'auto' mode
    search_default_limit: 5 # How many search results to show by default
    max_search_limit: 20 # The max results the model can ask for
You can even just set tool_search: true under tools as a shorthand for the default auto mode. It’s designed to be simple to implement and powerful in practice.
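One way a loader could treat the shorthand and the full form alike is to normalize both into the same dictionary. The sketch below is hypothetical: normalize_tool_search is not a Hermes function, the default values are simply copied from the example config above, and mapping false to 'off' is an assumption.

```python
from typing import Any

# Values taken from the example config above; assumed here to be the defaults.
DEFAULTS: dict[str, Any] = {
    "enabled": "auto",
    "threshold_pct": 10,
    "search_default_limit": 5,
    "max_search_limit": 20,
}


def normalize_tool_search(value: Any) -> dict[str, Any]:
    """Expand the `tool_search` setting into a full settings dictionary."""
    if value is True:
        return dict(DEFAULTS)                   # shorthand for default auto mode
    if value is False:
        return {**DEFAULTS, "enabled": "off"}   # assumption: false disables the feature
    if isinstance(value, dict):
        return {**DEFAULTS, **value}            # explicit keys override the defaults
    raise TypeError(f"unsupported tool_search value: {value!r}")


print(normalize_tool_search(True))
print(normalize_tool_search({"threshold_pct": 25}))
```

With this shape, the rest of the code only ever deals with one fully populated settings dictionary.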
So, When Should You Use This?
Tool Search is a fantastic feature, but it's not for every single situation. Here’s a quick guide:
You'll see huge benefits if:
- You have a lot of tools connected (think 15+).
- You have multiple MCP servers running.
- Your agent typically only uses a few of its available tools during any given conversation.
You might want to skip it if:
- You have a very small set of tools. The bridge tools themselves add a tiny bit of overhead (~300 tokens), which might not be worth it for a small setup.
- Your workflow requires the agent to use all of its tools on every single turn (which is pretty rare).
It’s a powerful new capability for anyone building serious, multi-functional AI agents. By tidying up the agent's workspace, Nous Research has managed to make it smarter, faster, and more cost-effective. It’s a perfect example of how a simple, clever idea can solve a complex and growing problem in the world of AI.




