How to Build an AI Agent with Claude API (Step-by-Step)

What You'll Build

If you've searched "how to build an AI agent," you've probably hit walls — vague blog posts, outdated SDK calls, or toy examples that don't do anything useful. This tutorial fixes that.

By the end, you'll have a fully working Python AI agent powered by Claude's API that can use custom tools, reason through multi-step problems, and loop until it arrives at a final answer. We're building a research assistant agent that can search for information and perform calculations — a real pattern you can adapt for any business use case.

The full code runs in under 150 lines and uses the official Anthropic SDK. No LangChain, no extra abstraction layers — just clean Python that you actually understand.

Prerequisites

Python 3.9 or higher installed
An Anthropic API key (get one at console.anthropic.com)
Basic familiarity with Python classes and functions
pip installed and ready to go
About 20 minutes of focused time

📦 Full Source Code
The complete, working agent code is built out step by step in the sections below. Each snippet is a real piece of the final file — by Step 4, you'll have everything you need to run it. Copy as you go, or jump to the end and grab it all at once.

Step 1: Set Up Claude API and Install Dependencies

First, install the Anthropic SDK. That's the only external dependency you need for this project.

terminal

pip install anthropic

Now create your project file and set up your API key. I recommend storing it as an environment variable rather than hardcoding it — this is a habit worth building from the start.

terminal

export ANTHROPIC_API_KEY="your-api-key-here"

Here's the main agent class with Claude initialization. This is the foundation everything else plugs into.

agent.py

import anthropic
import json
import os
from typing import Any

class ClaudeAgent:
    def __init__(self):
        # Initialize the Anthropic client using the ANTHROPIC_API_KEY env variable
        self.client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
        self.model = "claude-sonnet-4-6"
        self.max_tokens = 4096
        self.conversation_history = []

    def add_message(self, role: str, content: Any):
        """Add a message to the conversation history."""
        self.conversation_history.append({"role": role, "content": content})

The conversation_history list is how Claude keeps track of what's been said and done across multiple turns. Every tool call and response gets added here so the agent doesn't lose its train of thought.

💡 Note on the model name: We're using claude-sonnet-4-6 throughout this tutorial. This is Anthropic's most capable model for agentic tasks at the time of writing — it handles tool use reliably and reasons well through multi-step problems.

Step 2: Define Your Agent's Tools and Capabilities

Tools are how your agent interacts with the world. You define them as JSON schemas, and Claude decides when and how to call them based on the user's request.

We're giving this agent two tools: a web search simulator and a calculator. In a real deployment, you'd swap the search function for an actual API call — the structure stays identical.

agent.py (add below the class definition)

    def get_tools(self) -> list:
        """Define the tools available to the agent."""
        return [
            {
                "name": "search_web",
                "description": (
                    "Search for current information on a topic. "
                    "Use this when you need facts, recent data, or background knowledge."
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "The search query to look up"
                        }
                    },
                    "required": ["query"]
                }
            },
            {
                "name": "calculate",
                "description": (
                    "Perform mathematical calculations. "
                    "Use this for any arithmetic, percentages, or numerical reasoning."
                ),
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "A valid Python math expression to evaluate, e.g. '150 * 0.08'"
                        }
                    },
                    "required": ["expression"]
                }
            }
        ]

The schema format here is JSON Schema — Claude reads this and understands exactly what each tool does, what arguments it expects, and when it makes sense to call it. Good descriptions matter more than you'd think; they're what Claude uses to decide which tool fits a situation.

Now define the actual Python functions that execute when Claude calls a tool:

agent.py (add below get_tools)

    def search_web(self, query: str) -> str:
        """
        Simulated web search. Replace this with a real search API
        like Brave Search, Serper, or Tavily in production.
        """
        mock_results = {
            "naples florida population": (
                "Naples, Florida has a population of approximately 22,000 in the city proper, "
                "with the greater Naples metro area (Collier County) reaching over 380,000 residents."
            ),
            "average home price naples florida": (
                "The median home price in Naples, FL as of early 2026 is approximately $620,000, "
                "making it one of the most expensive real estate markets in Florida."
            ),
        }
        # Return a matching result or a generic fallback
        for key in mock_results:
            if any(word in query.lower() for word in key.split()):
                return mock_results[key]
        return f"Search results for '{query}': Found relevant data indicating strong market activity in this area."

    def calculate(self, expression: str) -> str:
        """Safely evaluate a mathematical expression."""
        try:
            # Use eval with restricted builtins for safety
            allowed_names = {"__builtins__": {}}
            result = eval(expression, allowed_names)
            return f"Result: {result}"
        except Exception as e:
            return f"Calculation error: {str(e)}"

    def execute_tool(self, tool_name: str, tool_input: dict) -> str:
        """Route a tool call to the correct function."""
        if tool_name == "search_web":
            return self.search_web(tool_input["query"])
        elif tool_name == "calculate":
            return self.calculate(tool_input["expression"])
        else:
            return f"Unknown tool: {tool_name}"

⚠️ Production note: The eval() call here uses a restricted environment to block access to built-in functions. For a real production agent, consider using a safer math parser like simpleeval or numexpr instead.

Step 3: Create the Core Agent Loop with Claude

This is the heart of the whole thing. The agent loop is what separates a simple API call from an actual agent — it keeps running until Claude decides it has a final answer, handling tool calls along the way.

Here's the full agentic run loop with tool_use handling:

agent.py (add below execute_tool)

    def run(self, user_message: str) -> str:
        """
        Main agentic loop. Keeps calling Claude until it returns
        a final text response with no pending tool calls.
        """
        print(f"\n{'='*50}")
        print(f"User: {user_message}")
        print(f"{'='*50}")

        # Add the user's message to conversation history
        self.add_message("user", user_message)

        # Loop until Claude gives a final answer (stop_reason == "end_turn")
        while True:
            response = self.client.messages.create(
                model=self.model,
                max_tokens=self.max_tokens,
                tools=self.get_tools(),
                messages=self.conversation_history
            )

            print(f"\nClaude stop reason: {response.stop_reason}")

            # If Claude is done reasoning and has a final answer, return it
            if response.stop_reason == "end_turn":
                final_text = ""
                for block in response.content:
                    if hasattr(block, "text"):
                        final_text += block.text
                self.add_message("assistant", response.content)
                print(f"\nFinal Answer: {final_text}")
                return final_text

            # If Claude wants to use a tool, handle all tool calls in this turn
            if response.stop_reason == "tool_use":
                # Add Claude's response (which includes tool_use blocks) to history
                self.add_message("assistant", response.content)

                # Collect results from every tool Claude called this turn
                tool_results = []
                for block in response.content:
                    if block.type == "tool_use":
                        print(f"\nTool called: {block.name}")
                        print(f"Tool input: {json.dumps(block.input, indent=2)}")

                        # Execute the tool and capture the result
                        result = self.execute_tool(block.name, block.input)
                        print(f"Tool result: {result}")

                        tool_results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,  # Must match the tool_use block id
                            "content": result
                        })

                # Return all tool results to Claude in a single user message
                self.add_message("user", tool_results)

            else:
                # Unexpected stop reason — surface it rather than silently failing
                print(f"Unexpected stop reason: {response.stop_reason}")
                break

        return "Agent loop ended unexpectedly."

The while True loop is intentional — the agent keeps going until Claude says it's done. Each iteration either gets a final answer or processes tool calls and feeds the results back in. Claude then re-evaluates with that new information and decides what to do next.

Step 4: Implement Tool Execution and Response Handling

Now wire everything together with a test run. This is the entry point that creates the agent and sends it a real question.

agent.py (add at the bottom of the file)

def main():
    agent = ClaudeAgent()

    # Test 1: A question that requires search + calculation
    result = agent.run(
        "What is the average home price in Naples, Florida? "
        "If someone puts 20% down, how much would that be in dollars?"
    )

    print("\n" + "="*50)
    print("AGENT COMPLETE")
    print("="*50)

    # Test 2: Reset history and try a pure calculation task
    agent.conversation_history = []
    result2 = agent.run("What is 15% of $847,500?")

if __name__ == "__main__":
    main()

Here's what you actually see when you run this:

sample output

==================================================
User: What is the average home price in Naples, Florida? If someone puts 20% down, how much would that be in dollars?
==================================================

Claude stop reason: tool_use

Tool called: search_web
Tool input: {
  "query": "average home price naples florida"
}
Tool result: The median home price in Naples, FL as of early 2026 is approximately $620,000, making it one of the most expensive real estate markets in Florida.

Claude stop reason: tool_use

Tool called: calculate
Tool input: {
  "expression": "620000 * 0.20"
}
Tool result: Result: 124000.0

Claude stop reason: end_turn

Final Answer: Based on my research, the median home price in Naples, Florida as of early 2026 is approximately $620,000.

A 20% down payment on a $620,000 home would be **$124,000**.

This is consistent with Naples being one of Florida's most premium real estate markets, particularly attractive for buyers seeking luxury waterfront properties and resort-style living.

==================================================
AGENT COMPLETE
==================================================

==================================================
User: What is 15% of $847,500?
==================================================

Claude stop reason: tool_use

Tool called: calculate
Tool input: {
  "expression": "847500 * 0.15"
}
Tool result: Result: 127125.0

Claude stop reason: end_turn

Final Answer: 15% of $847,500 is **$127,125.00**.

How It Works

Let me walk through what's actually happening under the hood, in plain English.

When you call agent.run(), the agent sends your message to Claude along with the list of available tools. Claude reads the question, decides it needs information it doesn't have, and responds with a tool_use block instead of a final answer.

Your code catches that, runs the actual tool function (the search or calculator), and sends the result back to Claude as a tool_result message. Claude now has new information — so it either calls another tool or writes its final answer. That cycle repeats until stop_reason is end_turn.

The key insight is that Claude never directly calls your functions. It just says "I want to call this tool with these arguments," and your Python code does the actual work. Claude is the decision-maker; your code is the executor. That separation is what makes agents both powerful and safe to build.

Common Errors and Fixes

Error 1: AuthenticationError — Invalid API Key

anthropic.AuthenticationError: Error code: 401 - {'type': 'error', 'error': {'type': 'authentication_error', 'message': 'invalid x-api-key'}}

Fix: Your API key isn't being read correctly. Run echo $ANTHROPIC_API_KEY in your terminal to confirm it's set. If it's empty, re-run export ANTHROPIC_API_KEY="your-key-here" in the same terminal session where you run the script. Don't paste the key with quotes into the env var — the quotes get included as part of the string.

Error 2: Tool Result ID Mismatch

anthropic.BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'tool_use_id is invalid'}}

Fix: Each tool_result message must reference the exact id from the corresponding tool_use block. Make sure you're passing block.id — not a custom string — into the tool_use_id field. If you're iterating over blocks, double-check you're capturing the right block's ID for each result.

Error 3: Missing tool_result When stop_reason is tool_use

anthropic.BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'messages: roles must alternate between "user" and "assistant"'}}

Fix: After Claude returns a tool_use block, you must send back a tool_result in a user role message before calling the API again. If you call messages.create() again without adding the tool results to history first, the message roles break. Always add both the assistant's tool_use response and your tool_result before the next API call.

Next Steps

You've got a working agent. Here are four directions you can take it next:

Connect a real search API. Swap the mock search function for Brave Search, Serper, or Tavily. All three have free tiers and simple REST APIs — your agent becomes genuinely useful the moment it can pull live data.
Add memory between sessions. Right now the conversation history resets every run. Persist it to a JSON file or SQLite database and your agent starts remembering past interactions.
Build a domain-specific tool set. For a real estate agent use case, add tools that query an MLS API, pull Zillow data, or look up property tax records. The agent framework doesn't change — just the tools.
Add a system prompt. Pass a system parameter to messages.create() to give your agent a persona, constraints, or specialized knowledge. This is how you turn a generic agent into a focused business tool.

Frequently Asked Questions

How do Claude API agents differ from simple API calls?

A simple API call sends a message and gets a response — one turn, done. An agent uses a loop where Claude can call tools, receive results, and keep reasoning until it has a complete answer. The difference is that agents can take actions and react to what they learn along the way, not just generate text from what they already know.

What is the best AI agent framework for Python in 2026?

It depends on what you're building. For production agents with complex pipelines, LangGraph and LlamaIndex offer powerful orchestration. But for most use cases — especially when you want full control and fewer dependencies — building directly on the Anthropic SDK like we did here is cleaner, easier to debug, and easier to customize. Start simple, add abstraction only when you actually need it.

How many tools can a Claude agent use at once?

Claude can handle dozens of tools in a single session. In practice, performance stays sharp up to around 20-30 well-defined tools. Beyond that, descriptions start to blur and the model makes more routing mistakes. The better approach is to create focused agents with 5-10 highly relevant tools rather than one agent that tries to do everything.

Is the Anthropic SDK free to use for AI agents?

The SDK itself is free and open source. You pay for API usage — tokens in and tokens out. Agent loops use more tokens than single calls because you're sending the full conversation history on every turn. Budget for that in your cost estimates. Anthropic's pricing page has a token calculator, and Claude Sonnet is significantly cheaper than Opus if cost is a concern.

Can I run a Claude AI agent locally without the API?

Not officially — Claude models aren't available for local download. You need the API. If true offline execution is a hard requirement, open-source alternatives like Llama 3 or Mistral can run locally and support tool-use patterns similar to what we built here. For most business applications though, the API is reliable, fast, and worth the cost.

Conclusion

You just built a real, working AI agent from scratch — not a demo, not a wrapper around someone else's framework, but a clean Python agent you understand end to end. The pattern we covered here — tool definitions, the agentic loop, tool result handling — is the same foundation powering production AI systems across industries.

At Naples AI, this is exactly the kind of infrastructure we build for businesses across Southwest Florida every day. Whether it's an AI agent that automates real estate listing workflows, a chatbot that handles customer service for a Naples restaurant, or a predictive system for a local manufacturer — it all starts with a loop like this one. If you'd rather have us build it for you than build it yourself, book a free 30-minute call with Chris and we'll map out exactly what your business needs.