If you've spent any time in a Java/Spring Boot codebase, you already understand request pipelines — an incoming request flows through filters, middleware, and handlers, with each step able to modify or stop the chain. Building an AI agent is surprisingly similar. The agent loop is just a request handler with an LLM as the decision engine and tools as the downstream services.
Let's build one from scratch.
What an AI Agent Actually Is
Before we write code, let's get the definition straight.
A chatbot takes your input, sends it to a language model, and returns the output. One round trip, then done.
An AI agent does something fundamentally different:
- Receives a goal ("Find the current price of gold and convert it to INR")
- Plans which steps it needs to take
- Uses tools — functions it can call, like a web search or a currency converter
- Observes what the tools return
- Decides whether the goal is complete or if it needs more steps
- Repeats until done
The key insight: the LLM isn't just answering a question. It's deciding what to do next based on what it's learned so far. That's the agent loop — and it's what makes agents so powerful.
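Stripped of any specific SDK, that loop has a very small shape. Here is a minimal sketch, where llm_decide and run_tool are hypothetical stand-ins for the real calls we build in the steps that follow:

```python
# A minimal sketch of the agent loop. llm_decide and run_tool are
# hypothetical stand-ins; the real versions are built in the steps below.
def llm_decide(history):
    # Fake "LLM": requests one tool call, then declares the goal met.
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "search_web", "args": {"query": "gold price today"}}
    return {"answer": "Gold is trading at $2,380 per troy ounce."}

def run_tool(name, args):
    return f"result of {name}({args})"

def agent_loop(goal):
    history = [{"role": "user", "content": goal}]
    while True:
        decision = llm_decide(history)                         # plan the next step
        if "answer" in decision:                               # goal met: stop
            return decision["answer"]
        result = run_tool(decision["tool"], decision["args"])  # act
        history.append({"role": "tool", "content": result})    # observe
```

Everything else in this post is filling in those two stand-ins with a real model and real tools.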
What You Need Before Starting
- Python 3.10+ — check with python --version
- An API key from any major LLM provider (OpenAI, Mistral, Cohere, etc.)
- pip for installing dependencies
- Basic Python: functions, dictionaries, loops, and a bit of comfort with pip install
No ML background required. We're using the LLM as a black box.
Building the Agent Step by Step
Step 1: Set Up the Project
mkdir my-first-agent
cd my-first-agent
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install openai # or your provider's SDK
Create a file called agent.py.
Step 2: Define Your Tools
Tools are just Python functions. The agent will decide when to call them.
import math

def search_web(query: str) -> str:
    """Simulate a web search — replace with a real search API."""
    # In production: use SerpAPI, Tavily, or similar
    results = {
        "gold price today": "Gold is trading at $2,380 per troy ounce as of today.",
        "USD to INR exchange rate": "1 USD = 83.47 INR as of today.",
    }
    for key, value in results.items():
        if key.lower() in query.lower():
            return value
    return f"No results found for: {query}"

def calculate(expression: str) -> str:
    """Evaluate a math expression with builtins disabled (not a full sandbox)."""
    try:
        result = eval(expression, {"__builtins__": {}}, {"math": math})
        return str(result)
    except Exception as e:
        return f"Error: {e}"
Think of these like Spring @Service beans — discrete, single-responsibility units that the agent can invoke on demand.
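And because they are plain functions, you can exercise them directly before any LLM is involved. For instance, calling calculate by hand (the function is reproduced here so the snippet runs standalone):

```python
import math

def calculate(expression: str) -> str:
    """Same calculate tool as in Step 2."""
    try:
        result = eval(expression, {"__builtins__": {}}, {"math": math})
        return str(result)
    except Exception as e:
        return f"Error: {e}"

print(calculate("2380 * 83.47"))    # the USD-to-INR multiplication from our goal
print(calculate("math.sqrt(16)"))   # the math module is exposed to eval
print(calculate("import os"))       # not an expression: returns an error string
```

Note that even the failure case returns a string. That matters later: tool output goes straight back to the LLM as text, so errors should be reported, not raised.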
Step 3: Define the Tool Registry
The agent needs to know which tools exist and what they do. We describe them in a format the LLM understands:
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for current information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    }
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a mathematical expression",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "A Python-evaluable math expression"
                    }
                },
                "required": ["expression"]
            }
        }
    }
]

# Map tool names to functions
TOOL_FUNCTIONS = {
    "search_web": search_web,
    "calculate": calculate,
}
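To see how the registry gets used, here is the dispatch pattern in isolation. The model hands back a tool name and a JSON string of arguments; we decode the JSON and look the function up by name. (The tools here are trivial stand-ins, since the point is the lookup and decoding, not the tools themselves.)

```python
import json

# Stand-in tools; the real search_web/calculate are dispatched the same way.
TOOL_FUNCTIONS = {
    "search_web": lambda query: f"results for {query!r}",
    "calculate": lambda expression: str(eval(expression, {"__builtins__": {}}, {})),
}

# What the model returns: a tool name plus arguments as a JSON string.
name = "calculate"
arguments = '{"expression": "2380 * 83.47"}'

args = json.loads(arguments)           # decode the JSON argument payload
result = TOOL_FUNCTIONS[name](**args)  # look up the function by name and invoke it
print(result)
```

This name-to-function mapping is the whole bridge between the LLM's text output and your actual code.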
Step 4: Write the Agent Loop
This is the core of the agent. In Spring Boot terms, think of it as a @RequestMapping method that keeps looping until it has a final response — like a polling job that terminates when a condition is met.
import json
import openai

client = openai.OpenAI(api_key="YOUR_API_KEY")

def run_agent(goal: str) -> str:
    messages = [
        {
            "role": "system",
            "content": "You are a helpful assistant. Use tools to answer questions accurately."
        },
        {
            "role": "user",
            "content": goal
        }
    ]

    print(f"\nGoal: {goal}\n")

    while True:
        # Ask the LLM what to do next
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=TOOLS,
            tool_choice="auto"
        )
        message = response.choices[0].message

        # If the LLM wants to call a tool, execute it
        if message.tool_calls:
            messages.append(message)
            for tool_call in message.tool_calls:
                name = tool_call.function.name
                args = json.loads(tool_call.function.arguments)
                print(f"  → Tool: {name}({args})")

                result = TOOL_FUNCTIONS[name](**args)
                print(f"  ← Result: {result}")

                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": result
                })
        else:
            # LLM has enough information — return the final answer
            return message.content
Step 5: Test It
if __name__ == "__main__":
    answer = run_agent(
        "What is the current price of gold? Convert it to INR and show the calculation."
    )
    print(f"\nFinal Answer: {answer}")
Run it:
python agent.py
You'll see output like:
Goal: What is the current price of gold? Convert it to INR and show the calculation.
→ Tool: search_web({'query': 'gold price today'})
← Result: Gold is trading at $2,380 per troy ounce as of today.
→ Tool: search_web({'query': 'USD to INR exchange rate'})
← Result: 1 USD = 83.47 INR as of today.
→ Tool: calculate({'expression': '2380 * 83.47'})
← Result: 198658.6
Final Answer: Gold is currently trading at $2,380 per troy ounce.
At today's exchange rate of 1 USD = ₹83.47, that's approximately ₹1,98,659 per troy ounce.
What Makes This an Agent vs a Chatbot?
Three things:
- Tool use — the agent can call external functions. A chatbot just generates text.
- Multi-step reasoning — it decided on its own to search twice and then calculate. You didn't tell it the steps.
- Goal-driven loop — it keeps running until the goal is met, not until it generates one response.
The LLM is acting like the decision layer in a Spring Boot service — taking requests, consulting dependencies (tools), and assembling a final response.
Common Mistakes to Avoid
- No error handling on tools — if search_web throws an exception, the loop crashes. Always return a string from tools, even on error.
- Infinite loops — add a max_iterations counter. If the agent hasn't finished after 10 steps, something is wrong.
- Trusting tool output blindly — tool results go back to the LLM as text. If a tool returns garbage, the LLM will try to work with it.
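The first two fixes together look roughly like this. It's a sketch, not a drop-in replacement: call_llm stands in for the chat-completions request from Step 4, and decision is a simplified dict rather than the SDK's message object.

```python
MAX_ITERATIONS = 10

def safe_call(fn, **kwargs):
    # Wrap every tool call so failures come back as text, not exceptions.
    try:
        return fn(**kwargs)
    except Exception as e:
        return f"Tool error: {e}"

def run_agent_safely(goal, call_llm, tools):
    messages = [{"role": "user", "content": goal}]
    for _ in range(MAX_ITERATIONS):  # bounded: the loop can never run forever
        decision = call_llm(messages)
        if decision.get("final"):
            return decision["content"]
        result = safe_call(tools[decision["tool"]], **decision["args"])
        messages.append({"role": "tool", "content": result})
    return "Agent stopped: exceeded maximum iterations."
```

Note that a failing tool now feeds an error message back to the LLM, which can often recover by trying a different tool or query instead of crashing the whole run.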
Where to Go Next
You've built a working agent. Here's what comes next:
- Memory — right now the agent forgets everything between runs. Add persistence with a vector database.
- Multi-step planning — have the agent write a plan before executing.
- Multi-agent systems — multiple specialized agents coordinating on complex tasks.
- Production concerns — rate limiting, cost tracking, observability, fallbacks.
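As a first taste of memory, you don't even need a vector database: simply persisting the message history to disk between runs gets you cross-run recall. A minimal sketch (agent_memory.json is an arbitrary filename chosen for this example):

```python
import json
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")  # arbitrary filename for this sketch

def load_messages():
    """Restore the message history from the last run, if any."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

def save_messages(messages):
    """Persist the full message history for the next run."""
    MEMORY_FILE.write_text(json.dumps(messages, indent=2))

# Seed the next run with everything the agent saw last time:
messages = load_messages()
messages.append({"role": "user", "content": "What did we compute last time?"})
save_messages(messages)
```

A vector database becomes worthwhile once the history grows too large to send in full and you need to retrieve only the relevant pieces.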
If you want a structured path through all of this — with every concept mapped to Java and Spring Boot patterns you already know — the Agentic AI course on MindloomHQ covers exactly this, from Python basics through production multi-agent systems across 96 lessons.