Have you ever watched an AI agent use a tool, like searching the web or running a piece of code, and wondered what’s really going on? It can feel a bit like magic. But I’m here to tell you it’s not magic at all. It’s a loop—a surprisingly simple, elegant loop of logic.
Today, we’re going to pull back the curtain and build that logic ourselves. We're going to get our hands dirty with a fantastic little framework called nanobot. It’s an ultra-lightweight personal AI agent from HKUDS, packing the power of a full-fledged agent into about 4,000 lines of Python. That’s small enough to actually read and understand in an afternoon!
Instead of just installing it and running a command, we’re going to do something way more fun. We’re going to manually recreate its entire brain, piece by piece. We'll wire up tools, build a memory system, give it skills, and even teach it how to delegate tasks. By the end, you won’t just know how to use nanobot; you'll understand how AI agents think.
Let’s get started.
Step 1: Getting Our Tools Ready
First things first, we need to install nanobot and a few other helpful libraries. Nothing fancy here, just a standard pip install. This will grab the core framework, OpenAI’s library for talking to the model, and a couple of others to make things work smoothly.
import sys
import os
import subprocess
import importlib.metadata

# A little helper to make our output look nice
def section(title, emoji=""):
    width = 72
    print(f"\n{'═' * width}")
    print(f" {emoji} {title}")
    print(f"{'═' * width}\n")

print("Installing nanobot-ai and its friends...")
subprocess.check_call([
    sys.executable, "-m", "pip", "install", "-q",
    "nanobot-ai", "openai", "rich", "httpx"
])
print("✅ All set! nanobot-ai is installed.")

nanobot_version = importlib.metadata.version("nanobot-ai")
print(f" Running nanobot-ai version: {nanobot_version}")
Step 2: Handling the All-Important API Key (Securely!)
Alright, this part is crucial. To use an AI model like OpenAI's GPT, we need an API key. But we should never, ever paste our secret keys directly into our code or notebook output. That’s like leaving your house keys under the welcome mat.
Instead, we'll read the key in a safer way. If you're in Google Colab, you can use its built-in Secrets manager. Otherwise, we'll use a simple password prompt that hides your input. Either way, the key stays out of the notebook source and output. Note that Step 3 writes it to a local config file, so treat that file as a secret too.
import os
import sys
import openai

# Let's get that key securely
try:
    # First, try Google Colab's secret manager
    from google.colab import userdata
    OPENAI_API_KEY = userdata.get("OPENAI_API_KEY")
    if not OPENAI_API_KEY:
        raise ValueError("Key not set in Colab secrets")
    print("✅ Loaded API key from Colab Secrets.")
except Exception:
    # If not in Colab (or the secret is missing), ask for it securely
    import getpass
    OPENAI_API_KEY = getpass.getpass("Enter your OpenAI API key: ")
    print("✅ API key captured securely.")

os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
client = openai.OpenAI(api_key=OPENAI_API_KEY)

# Let's test the key to make sure it works
try:
    client.models.list()
    print("✅ OpenAI API key is valid. Connection successful!")
except Exception as e:
    print(f"❌ API key validation failed: {e}")
    print(" Please restart and enter a valid key.")
    sys.exit(1)
Step 3: Setting Up the Agent's "Home"
Every agent needs a place to live—a workspace to store its configuration, its memories, and the files it works on. We’ll create a hidden directory in our home folder called .nanobot and set up the basic structure.
Think of this like setting up a new desk. We’re creating folders for different things and writing down some initial instructions for our new assistant. We’ll create a few simple text files:
- AGENTS.md: The core instructions for the AI.
- SOUL.md: A file to define its personality.
- USER.md: A profile about us, the user.
- MEMORY.md: The agent's long-term memory bank.
This is a really cool concept in nanobot—the agent's core identity is just a bunch of simple text files you can edit anytime.
import json
from pathlib import Path
# Create the main directory and workspace
NANOBOT_HOME = Path.home() / ".nanobot"
WORKSPACE = NANOBOT_HOME / "workspace"
(WORKSPACE / "memory").mkdir(parents=True, exist_ok=True)
# Basic configuration (note: this stores the API key in plain text on disk)
config = {
    "providers": {"openai": {"apiKey": OPENAI_API_KEY}},
    "agents": {"defaults": {"model": "openai/gpt-4o-mini", "workspace": str(WORKSPACE)}},
    "tools": {"restrictToWorkspace": True}
}
config_path = NANOBOT_HOME / "config.json"
config_path.write_text(json.dumps(config, indent=2))
print(f"✅ Config written to {config_path}")
# Create the personality and instruction files
(WORKSPACE / "AGENTS.md").write_text("# Agent Instructions\nYou are nanobot, a helpful and concise AI assistant.")
(WORKSPACE / "SOUL.md").write_text("# Personality\n- Friendly and technically precise.")
(WORKSPACE / "USER.md").write_text("# User Profile\n- The user is learning about AI agent architecture.")
(WORKSPACE / "memory/MEMORY.md").write_text("# Long-term Memory\n\n_No memories stored yet._\n")
print("✅ Workspace bootstrap files created.")
Step 4: A Quick Look at the Architecture
Before we start building, let's peek at the blueprint. Nanobot is beautifully simple. It’s built around a central Agent Loop.
Imagine you ask the agent a question. Here’s what happens:
- Context Building: The agent reads its instructions, personality, your user profile, its long-term memory, and your conversation history. It bundles all this up into one big prompt.
- LLM Call: It sends this prompt to the AI model (like gpt-4o-mini), along with a list of tools it knows how to use.
- Tool or Talk? The model decides.
  - If it needs a tool (e.g., "get the current time"), it sends back a request to use that tool. We execute the tool, append the result to the conversation, and go back to the LLM call.
  - If it has the final answer, it sends back plain text. We show this to the user, and the loop ends.
That’s it! This "think, act, observe" cycle repeats until the task is done. Now, let’s build it.
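Before wiring in a real model, here is a minimal offline sketch of that cycle. Everything in it is a stand-in I made up for illustration: `fake_model` plays the LLM with a scripted reply, `run_tool` knows a single hypothetical `add` tool, and the tool-call dictionaries are simplified compared with what the OpenAI API actually returns.

```python
import json
from typing import Callable

def fake_model(messages: list) -> dict:
    """Stand-in for the LLM: asks for a tool once, then answers in plain text."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"role": "assistant", "content": None,
                "tool_calls": [{"id": "call_1", "name": "add",
                                "arguments": json.dumps({"a": 2, "b": 3})}]}
    return {"role": "assistant", "content": f"The sum is {tool_results[-1]['content']}."}

def run_tool(name: str, arguments: dict) -> str:
    """Hypothetical dispatcher with a single 'add' tool."""
    if name == "add":
        return str(arguments["a"] + arguments["b"])
    return f"Unknown tool: {name}"

def mini_loop(user_message: str, model: Callable[[list], dict], max_iterations: int = 5) -> str:
    # Context building: here just a fixed system prompt plus the user message
    messages = [{"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": user_message}]
    for _ in range(max_iterations):
        reply = model(messages)            # LLM call
        messages.append(reply)
        if not reply.get("tool_calls"):    # Talk: plain text ends the loop
            return reply["content"]
        for call in reply["tool_calls"]:   # Tool: execute and feed the result back
            result = run_tool(call["name"], json.loads(call["arguments"]))
            messages.append({"role": "tool", "tool_call_id": call["id"], "content": result})
    return "Max iterations reached."

print(mini_loop("What is 2 + 3?", fake_model))
```

The loop never inspects what the model "meant". It only checks whether the reply contains tool calls, which is why the same skeleton works for any set of tools.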
Step 5: The Heart of the Agent: Building the Main Loop
This is where it gets really interesting. We’re going to write a Python function that mimics nanobot’s core agent loop. We’ll define a few tools the AI can use—like getting the time, doing math, or reading/writing files—and then create the loop that lets the model call them.
import json as _json
import datetime
# Define the tools our agent can use (in OpenAI's required format)
TOOLS = [
    {"type": "function", "function": {"name": "get_current_time", "description": "Get the current date and time."}},
    {"type": "function", "function": {"name": "calculate", "description": "Evaluate a math expression.", "parameters": {"type": "object", "properties": {"expression": {"type": "string"}}, "required": ["expression"]}}},
    {"type": "function", "function": {"name": "read_file", "description": "Read a file in the workspace.", "parameters": {"type": "object", "properties": {"path": {"type": "string"}}, "required": ["path"]}}},
    {"type": "function", "function": {"name": "write_file", "description": "Write to a file in the workspace.", "parameters": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}, "required": ["path", "content"]}}},
    {"type": "function", "function": {"name": "save_memory", "description": "Save a fact to long-term memory.", "parameters": {"type": "object", "properties": {"fact": {"type": "string"}}, "required": ["fact"]}}}
]
# A function to actually *run* the tools when the LLM asks
def execute_tool(name: str, arguments: dict) -> str:
    print(f" -> Executing tool: {name}({arguments})")
    if name == "get_current_time":
        return str(datetime.datetime.now())
    elif name == "calculate":
        try:
            # Python's ^ is bitwise XOR, so treat it as exponentiation for calculator input.
            # Note: eval with empty builtins is NOT a real sandbox; acceptable for a demo only.
            expression = arguments.get("expression", "").replace("^", "**")
            return str(eval(expression, {"__builtins__": {}}))
        except Exception as e:
            return f"Error: {e}"
    elif name == "read_file":
        fpath = WORKSPACE / arguments.get("path", "")
        # is_file() also rejects directories, which exists() would let through
        return fpath.read_text() if fpath.is_file() else "Error: File not found."
    elif name == "write_file":
        fpath = WORKSPACE / arguments.get("path", "")
        fpath.write_text(arguments.get("content", ""))
        return f"Successfully wrote to {arguments.get('path')}"
    elif name == "save_memory":
        fact = arguments.get("fact", "")
        mem_file = WORKSPACE / "memory" / "MEMORY.md"
        mem_file.write_text(mem_file.read_text() + f"\n- {fact}\n")
        return f"Memory saved: {fact}"
    return f"Unknown tool: {name}"
# The main agent loop!
def agent_loop(user_message: str, max_iterations: int = 10):
    print(f"\n>> You: {user_message}")
    # 1. Build context from our workspace files
    system_parts = [(WORKSPACE / f).read_text() for f in ["AGENTS.md", "SOUL.md", "USER.md"]]
    system_parts.append(f"\n## Your Memory\n{(WORKSPACE / 'memory/MEMORY.md').read_text()}")
    system_prompt = "\n\n".join(system_parts)
    messages = [{"role": "system", "content": system_prompt}, {"role": "user", "content": user_message}]
    for i in range(max_iterations):
        print(f"\n--- Iteration {i+1} ---")
        response = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=TOOLS)
        message = response.choices[0].message
        # 2. Decide: Tool or Talk?
        if message.tool_calls:
            print(f"LLM wants to use {len(message.tool_calls)} tool(s)...")
            # Add the assistant's request; exclude_none drops unset optional fields
            messages.append(message.model_dump(exclude_none=True))
            for tc in message.tool_calls:
                result = execute_tool(tc.function.name, _json.loads(tc.function.arguments))
                messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})
        else:
            final_answer = message.content
            print(f"\n<< nanobot: {final_answer}\n")
            return final_answer
    return "Max iterations reached. The agent is stuck in a loop!"
# Let's try it out!
agent_loop("What is the current time? Also, calculate 2^10.")
agent_loop("Write a haiku about AI to 'haiku.txt', then remember that I like poetry.")
See what happened there? For the first prompt, the model asked for get_current_time and calculate (sometimes both in a single turn, sometimes one per iteration) before giving you the final answer. For the second, it used write_file and then save_memory. This request, execute, respond cycle is the core of how most tool-using AI agents work.
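One caveat about the calculate tool: it hands model-generated text to eval, which is fine for a demo but not something to expose more widely. Here is a hedged sketch of a stricter alternative that walks the expression with Python's ast module and only allows basic arithmetic. The name `safe_calculate` and the choice to treat `^` as exponentiation are my own assumptions, not something nanobot provides.

```python
import ast
import operator
from typing import Union

# Whitelisted operators; anything else is rejected
_BINARY_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.Mod: operator.mod, ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_calculate(expression: str) -> Union[int, float]:
    """Evaluate +, -, *, /, %, ** on numeric literals; '^' is treated as '**'."""
    tree = ast.parse(expression.replace("^", "**"), mode="eval")

    def _eval(node):
        # Numeric literals only (bool is excluded even though it subclasses int)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all land here
        raise ValueError(f"Unsupported expression: {expression!r}")

    return _eval(tree.body)

print(safe_calculate("2^10"))
print(safe_calculate("(1 + 2) * 3"))
```

This still does not guard against very expensive inputs such as huge exponents, so a production tool would also want limits on operand size or execution time.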
Step 6: Giving the Agent a Persistent Memory
Our save_memory tool is a good start, but real agents need a more structured memory system. Nanobot uses two types:
- MEMORY.md: For core, long-term facts that are always loaded.
- Daily Journals (e.g., 2024-10-27.md): For things that happened on a specific day.
Let's check our memory file and create a daily journal entry to see how the workspace is evolving.
# Let's see what's in our long-term memory
mem_content = (WORKSPACE / "memory" / "MEMORY.md").read_text()
print("Current MEMORY.md contents:")
print(mem_content)
# Now let's create a daily journal entry
today = datetime.datetime.now().strftime("%Y-%m-%d")
daily_file = WORKSPACE / "memory" / f"{today}.md"
daily_file.write_text(f"# Daily Log — {today}\n\n- Explored the nanobot agent loop.\n- Created a haiku about AI.")
print(f"\n✅ Daily journal created: memory/{today}.md")
Over time, nanobot can be configured to consolidate old daily journals into summaries, keeping its memory sharp without getting overloaded with information.
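To make the consolidation idea concrete, here is a rough sketch of one way it could work. This is not nanobot's actual mechanism: the function name, the ARCHIVE.md file, and the seven-day cutoff are all assumptions for illustration, and a real agent would more likely ask the LLM to summarize old entries than simply concatenate them.

```python
import datetime
import tempfile
from pathlib import Path

def consolidate_journals(memory_dir: Path, today: datetime.date, keep_days: int = 7) -> Path:
    """Move daily logs older than `keep_days` into ARCHIVE.md (hypothetical file name)."""
    archive = memory_dir / "ARCHIVE.md"
    cutoff = today - datetime.timedelta(days=keep_days)
    for journal in sorted(memory_dir.glob("????-??-??.md")):
        try:
            day = datetime.date.fromisoformat(journal.stem)
        except ValueError:
            continue  # matches the pattern but is not a valid date; leave it alone
        if day < cutoff:
            with archive.open("a", encoding="utf-8") as fh:
                fh.write(journal.read_text(encoding="utf-8").rstrip() + "\n\n")
            journal.unlink()  # the entry now lives only in the archive
    return archive

# Demo in a throwaway directory so the real workspace is untouched
with tempfile.TemporaryDirectory() as tmp:
    memory = Path(tmp)
    (memory / "2024-10-01.md").write_text("# Daily Log 2024-10-01\n- Old entry.")
    (memory / "2024-10-27.md").write_text("# Daily Log 2024-10-27\n- Recent entry.")
    archive_path = consolidate_journals(memory, today=datetime.date(2024, 10, 28))
    archived_text = archive_path.read_text()
    remaining = sorted(p.name for p in memory.glob("????-??-??.md"))

print(remaining)
```

Passing `today` in explicitly keeps the function easy to test, since the result does not depend on when you run it.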
Step 7: Adding "Skills" for Complex Tasks
Sometimes, you need to give the agent more than just a simple tool. You need to teach it a multi-step process or a specific way of thinking. In nanobot, these are called Skills, and they are just simple Markdown files.
A skill file tells the LLM how to approach a certain type of problem. Let's create two skills: one for analyzing data and another for reviewing code.
skills_dir = WORKSPACE / "skills"
skills_dir.mkdir(exist_ok=True)
# Data Analyst Skill
(skills_dir / "data_analyst.md").write_text("""
# Data Analyst Skill
## Description
Analyze data, compute statistics, and provide insights.
## Instructions
1. Identify the data type and structure.
2. Compute mean, median, and range.
3. Look for patterns and outliers.
4. Present findings clearly.
""")
# Code Reviewer Skill (let's make this one always available)
(skills_dir / "code_reviewer.md").write_text("""
# Code Reviewer Skill
## Description
Review code for bugs, security issues, and best practices.
## Instructions
1. Check for common bugs (e.g., SQL injection).
2. Identify security vulnerabilities.
3. Suggest performance improvements.
4. Rate the code quality on a 1-10 scale.
## Always Available
true
""")
print("✅ Custom skills created in the workspace/skills/ directory.")
Now, when we build our prompt, we can include the descriptions of these skills. When the agent sees a task like "review this code," it will know to activate the "Code Reviewer Skill" and follow the instructions we provided. This is a powerful way to guide the agent's behavior without writing a single line of new code.
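Here is a small sketch of what including those skills in the prompt could look like. I'm assuming the section layout used in the files above ("## Description", "## Instructions", "## Always Available"); the helper names `parse_skill` and `build_skills_prompt` are hypothetical, and nanobot's real skill loader may work differently.

```python
import tempfile
from pathlib import Path

def parse_skill(path: Path) -> dict:
    """Split a skill file into its '## ' sections, keyed by lower-cased heading."""
    sections, current = {}, None
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith("## "):
            current = line[3:].strip().lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {key: "\n".join(lines).strip() for key, lines in sections.items()}

def build_skills_prompt(skills_dir: Path) -> str:
    """List every skill's description; inline full instructions only for always-available ones."""
    parts = ["## Available Skills"]
    for path in sorted(skills_dir.glob("*.md")):
        skill = parse_skill(path)
        parts.append(f"- {path.stem}: {skill.get('description', '')}")
        if skill.get("always available", "").lower() == "true":
            parts.append(f"  Instructions for {path.stem}:\n{skill.get('instructions', '')}")
    return "\n".join(parts)

# Demo with two tiny skill files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "data_analyst.md").write_text(
        "# Data Analyst Skill\n## Description\nAnalyze data.\n## Instructions\n1. Compute mean.\n")
    (demo_dir / "code_reviewer.md").write_text(
        "# Code Reviewer Skill\n## Description\nReview code.\n## Instructions\n1. Check for bugs.\n"
        "## Always Available\ntrue\n")
    skills_prompt = build_skills_prompt(demo_dir)

print(skills_prompt)
```

Keeping only descriptions in the prompt by default saves tokens; the agent can read a skill's full instructions with read_file when a task calls for it.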
Step 8: Creating Our Own Custom Tools
The built-in tools are great, but the real power comes from adding your own. Let’s create a few fun, custom tools: one to roll dice, another to get stats about a piece of text, and a third to generate a random password.
The process is the same: define the tool's schema and then add the logic to our execute_tool function.
import random
import string
# Define our new custom tools
CUSTOM_TOOLS = [
    {"type": "function", "function": {"name": "roll_dice", "description": "Roll one or more dice.", "parameters": {"type": "object", "properties": {"num_dice": {"type": "integer", "default": 1}, "sides": {"type": "integer", "default": 6}}}}},
    {"type": "function", "function": {"name": "text_stats", "description": "Compute stats about a text.", "parameters": {"type": "object", "properties": {"text": {"type": "string"}}, "required": ["text"]}}},
    {"type": "function", "function": {"name": "generate_password", "description": "Generate a random secure password.", "parameters": {"type": "object", "properties": {"length": {"type": "integer", "default": 16}}}}}
]
import secrets  # cryptographically secure randomness for password generation

# We need to upgrade our tool executor to handle the new tools
_original_execute = execute_tool

def execute_tool_extended(name: str, arguments: dict) -> str:
    # Anything that is not a custom tool goes to the original executor,
    # which prints its own log line.
    if name not in {"roll_dice", "text_stats", "generate_password"}:
        return _original_execute(name, arguments)
    print(f" -> Executing tool: {name}({arguments})")
    if name == "roll_dice":
        n, s = arguments.get("num_dice", 1), arguments.get("sides", 6)
        rolls = [random.randint(1, s) for _ in range(n)]
        return f"Rolled {n}d{s}: {rolls} (total: {sum(rolls)})"
    elif name == "text_stats":
        text = arguments.get("text", "")
        words = len(text.split())
        return f"Stats: {words} words, {len(text)} characters."
    else:  # generate_password
        length = arguments.get("length", 16)
        chars = string.ascii_letters + string.digits + "!@#$%"
        # secrets.choice, not random.choice: the random module is not suitable for passwords
        return ''.join(secrets.choice(chars) for _ in range(length))
# Let's replace the old executor with our new, more powerful one
execute_tool = execute_tool_extended
ALL_TOOLS = TOOLS + CUSTOM_TOOLS # Combine built-in and custom tools
print("✅ Custom tools created and wired up!")
# You would now pass ALL_TOOLS to the agent_loop to make them available.
Just like that, our agent is now a dice-rolling, password-generating, text-analyzing powerhouse. Extending an agent's capabilities is really that simple.
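As the if/elif chain grows, one common alternative is a registry that keeps each tool's schema and handler together. The sketch below is my own illustration of that pattern, not nanobot's API: the `tool` decorator, `TOOL_REGISTRY`, `TOOL_SCHEMAS`, and `dispatch` are all hypothetical names.

```python
from typing import Callable, Optional

TOOL_REGISTRY: dict = {}   # tool name -> handler function
TOOL_SCHEMAS: list = []    # schemas in the format passed as `tools=` to the chat API

def tool(description: str, parameters: Optional[dict] = None) -> Callable:
    """Decorator that records a function's schema and handler in one place."""
    def register(func: Callable) -> Callable:
        TOOL_REGISTRY[func.__name__] = func
        TOOL_SCHEMAS.append({"type": "function", "function": {
            "name": func.__name__,
            "description": description,
            "parameters": parameters or {"type": "object", "properties": {}},
        }})
        return func
    return register

@tool("Count the words in a text.",
      {"type": "object", "properties": {"text": {"type": "string"}}, "required": ["text"]})
def word_count(text: str) -> str:
    return f"{len(text.split())} words"

def dispatch(name: str, arguments: dict) -> str:
    """Look up the handler by name and call it with the model-supplied arguments."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return f"Unknown tool: {name}"
    try:
        return handler(**arguments)
    except Exception as exc:  # report failures to the model instead of crashing the loop
        return f"Error: {exc}"

print(dispatch("word_count", {"text": "agents are just loops"}))
```

With this layout, adding a tool means writing one decorated function; the schema list and the dispatcher never need to be edited by hand.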
Step 9 & 10: Conversations, Delegation, and Automation
We've covered the core components, but a real framework like nanobot has a few more tricks up its sleeve.
- Session Management: To have a real conversation, the agent needs to remember what was said just a moment ago. Nanobot has a SessionManager that saves the history of each chat, so when you send a new message, it loads the previous turns. This gives it conversational context.
- Subagents: What if a task is really big, like "research the top 3 competitors for my new app"? The main agent can spawn smaller, specialized "subagents" to handle parts of the task in parallel. One subagent could research competitor A, another competitor B, and so on. They all report back to the main agent when they're done. This is how agents handle complex, multi-step research.
- Cron Scheduling: You can also set up tasks to run on a schedule, just like a cron job on a server. For example, you could have a job that runs every morning at 8 AM with the message: "Give me a summary of my calendar and the top news headlines." The scheduler triggers the agent, which then goes through its loop to complete the task automatically.
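To give a feel for the session idea, here is a minimal sketch that persists chat history as one JSON file per session. To be clear, `SessionStore` is a hypothetical helper written for this post; it does not reproduce the interface of nanobot's SessionManager, and it does not sanitize session ids.

```python
import json
import tempfile
from pathlib import Path

class SessionStore:
    """Hypothetical helper: one JSON file of chat messages per session id."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, session_id: str) -> Path:
        return self.root / f"{session_id}.json"

    def load(self, session_id: str) -> list:
        """Return the saved messages, or an empty history for a new session."""
        path = self._path(session_id)
        return json.loads(path.read_text(encoding="utf-8")) if path.exists() else []

    def append(self, session_id: str, role: str, content: str) -> None:
        """Add one message and write the whole history back to disk."""
        history = self.load(session_id)
        history.append({"role": role, "content": content})
        self._path(session_id).write_text(json.dumps(history, indent=2), encoding="utf-8")

# Demo in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    store = SessionStore(Path(tmp) / "sessions")
    store.append("demo", "user", "My name is Ada.")
    store.append("demo", "assistant", "Nice to meet you, Ada.")
    history = store.load("demo")
    other = store.load("unknown")

print(history)
```

In an agent loop, the loaded history would sit between the system prompt and the new user message, so the model sees earlier turns on every call.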
Putting It All Together: A Final, Complex Task
Let's give our agent a final challenge that uses everything we've built: file I/O, tool use, and memory.
# We'll use our upgraded agent loop that knows about ALL_TOOLS
def agent_loop_v2(user_message: str):
    # This is a simplified version of our earlier loop, but now using ALL_TOOLS
    # and the extended tool executor.
    print(f"\n>> You: {user_message}")
    system_prompt = "You are a helpful AI assistant with many tools."
    messages = [{"role": "system", "content": system_prompt}, {"role": "user", "content": user_message}]
    for i in range(10):  # Let's give it a few tries
        response = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=ALL_TOOLS)
        message = response.choices[0].message
        if message.tool_calls:
            # exclude_none drops unset optional fields before sending the message back
            messages.append(message.model_dump(exclude_none=True))
            for tc in message.tool_calls:
                result = execute_tool(tc.function.name, _json.loads(tc.function.arguments))
                messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})
        else:
            final_answer = message.content
            print(f"\n<< nanobot: {final_answer}\n")
            return final_answer
    return "Agent got stuck!"
final_prompt = (
    "Help me with a project: "
    "1. First, check the current time. "
    "2. Write a 3-point project plan to 'plan.txt' about building an AI assistant. "
    "3. Save the fact 'my current project is building a personal AI assistant' to my memory. "
    "4. Read back the 'plan.txt' file to confirm it was saved correctly. "
    "Finally, summarize everything you did."
)
agent_loop_v2(final_prompt)
And there you have it. In one request, the agent chained together four different tool calls across multiple iterations to complete a complex task, updating its memory and interacting with the file system along the way.
So, What's Next?
We've just built a microcosm of a modern AI agent. We didn't just run some commands; we wired up the engine, connected the fuel lines, and took it for a spin. The patterns we implemented here—context building, tool dispatch, memory, and delegation—are the fundamental building blocks of much larger, more complex agent systems.
The beauty of nanobot is that it makes these concepts accessible. The entire source code is small enough to explore. I highly recommend checking it out on GitHub. Now that you've built the mini-version, the real thing will make perfect sense.
You can now try running the actual nanobot command-line agent, connect it to Telegram, or even give it web search capabilities. You have the mental model to understand exactly what's happening under the hood. Go build something amazing!




