
AI Agent Tutorial: Build Your First AI Agent from Scratch

16 min read


"I've read tons of AI Agent articles, but still don't know how to actually build one."

This is the feedback we hear most often. There's a lot of theory, but too few tutorials that actually work. Even worse, many tutorials use outdated APIs that don't run when you follow along.

This tutorial will take you from zero, step by step, to build a real working AI Agent. We'll use Python and LangChain, which is currently the most mainstream combination. When finished, you'll have an AI Agent that can search the web, query weather, and perform calculations—and you'll understand what every line of code does.

If you're not yet familiar with the basic concepts of AI Agents, we recommend first reading What is AI Agent? Complete Guide. If you want to know what tools are available, check out AI Agent Tools Comparison.

Before You Start: What You Need

Technical Requirements

  • Python 3.9+: This tutorial uses Python 3.11
  • Basic Python knowledge: Understanding of functions, classes, package installation
  • Command line operation: Ability to execute commands in terminal
  • Code editor: VS Code, PyCharm, or any editor you're comfortable with

Accounts and API Keys

  • OpenAI API Key: For calling GPT models

    • Go to platform.openai.com to register
    • Create a new Key on the API Keys page
    • A payment method must be added (new accounts sometimes include free trial credits)
  • SerpAPI Key (optional): For web search functionality

    • Go to serpapi.com to register
    • Free plan includes 100 queries per month

Estimated Time

  • Environment setup: 15 minutes
  • Basic Agent implementation: 30 minutes
  • Advanced features: 30 minutes
  • Total approximately 1.5 hours

Step 1: Environment Setup

Create Project Directory

mkdir ai-agent-tutorial
cd ai-agent-tutorial

Create Virtual Environment

Using a virtual environment is a Python development best practice to avoid package conflicts.

# Create virtual environment
python -m venv venv

# Activate virtual environment (Mac/Linux)
source venv/bin/activate

# Activate virtual environment (Windows)
.\venv\Scripts\activate

Install Required Packages

pip install langchain langchain-openai langchain-community python-dotenv

Package descriptions:

  • langchain: Core framework
  • langchain-openai: OpenAI integration
  • langchain-community: Community tool integrations
  • python-dotenv: Environment variable management

Set Up Environment Variables

Create a .env file to store API Keys (don't commit this file to version control):

# .env
OPENAI_API_KEY=your-openai-api-key-here
SERPAPI_API_KEY=your-serpapi-key-here
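To make sure the key never reaches version control, add the file to .gitignore. A minimal example (adjust to your project):

```
# .gitignore
.env
venv/
__pycache__/
```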

Verify Installation

Create test_setup.py to verify the environment is correct:

# test_setup.py
from dotenv import load_dotenv
import os

load_dotenv()

# Check environment variables
api_key = os.getenv("OPENAI_API_KEY")
if api_key:
    print("✅ OpenAI API Key is set")
else:
    print("❌ OpenAI API Key not found")

# Test LangChain import
try:
    from langchain_openai import ChatOpenAI
    print("✅ LangChain installed successfully")
except ImportError as e:
    print(f"❌ LangChain import failed: {e}")

Run the test:

python test_setup.py

Seeing two green checkmarks means environment setup is complete.

Step 2: Build the Simplest Agent

Start with Conversation

Before building a complex Agent, let's first confirm we can communicate normally with the LLM.

Create 01_simple_chat.py:

# 01_simple_chat.py
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

load_dotenv()

# Initialize LLM
llm = ChatOpenAI(
    model="gpt-4o-mini",  # Use cheaper model for testing
    temperature=0.7
)

# Send message
response = llm.invoke([
    HumanMessage(content="Explain what an AI Agent is in one sentence")
])

print(response.content)

Run:

python 01_simple_chat.py

You should see GPT's brief explanation of an AI Agent. This confirms the API connection is working.

Add Tools: Enable the Agent to "Do Things"

Regular LLMs can only answer questions. The key to an Agent is being able to use tools. Let's start with two simple tools: a calculator and a clock.

Create 02_agent_with_tool.py:

# 02_agent_with_tool.py
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate

load_dotenv()

# Define tools
@tool
def calculate(expression: str) -> str:
    """Calculate a mathematical expression. Input should be a valid Python math expression."""
    try:
        # NOTE: eval() on raw input is unsafe; we add input validation in Step 3
        result = eval(expression)
        return f"Calculation result: {result}"
    except Exception as e:
        return f"Calculation error: {str(e)}"

@tool
def get_current_time() -> str:
    """Get the current date and time."""
    from datetime import datetime
    now = datetime.now()
    return f"Current time: {now.strftime('%Y-%m-%d %H:%M:%S')}"

# Initialize LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Define tool list
tools = [calculate, get_current_time]

# Create Prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that can use tools to answer questions."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create Agent
agent = create_tool_calling_agent(llm, tools, prompt)

# Create executor
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Test
if __name__ == "__main__":
    # Test calculation
    result = agent_executor.invoke({
        "input": "Please calculate (123 + 456) * 2 for me"
    })
    print(f"\nResult: {result['output']}\n")

    # Test time
    result = agent_executor.invoke({
        "input": "What time is it now?"
    })
    print(f"\nResult: {result['output']}\n")

After running, you'll see the Agent's thinking process (because we set verbose=True):

> Entering new AgentExecutor chain...
Invoking: `calculate` with `{'expression': '(123 + 456) * 2'}`
Calculation result: 1158
The calculation result is 1158.
> Finished chain.

Result: The calculation result is 1158.

This is the basic operation of an AI Agent:

  1. Receive user input
  2. Determine which tool to use
  3. Call the tool and get results
  4. Generate response based on results
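The four steps above can be sketched in plain Python. This is a toy version only: a real Agent lets the LLM choose the tool, while here a keyword heuristic stands in for that decision step. The structure of receive, decide, call, respond is the same:

```python
# A toy version of the agent loop. A real Agent lets the LLM choose the tool;
# here a keyword heuristic stands in for that decision step.

def calculate(expression: str) -> str:
    return str(eval(expression))  # demo only, never eval untrusted input

def get_time() -> str:
    from datetime import datetime
    return datetime.now().strftime("%H:%M")

TOOLS = {"calculate": calculate, "get_time": get_time}

def decide(user_input: str):
    """Stand-in for the LLM's tool choice (step 2)."""
    if any(ch.isdigit() for ch in user_input):
        # crude heuristic: keep only math-looking characters
        expr = "".join(ch for ch in user_input if ch in "0123456789+-*/(). ")
        return "calculate", expr.strip()
    return "get_time", None

def run_agent(user_input: str) -> str:
    tool_name, arg = decide(user_input)           # 2. pick a tool
    tool = TOOLS[tool_name]
    observation = tool(arg) if arg else tool()    # 3. call it, get the result
    return f"[{tool_name}] {observation}"         # 4. respond using the result

print(run_agent("what is (123 + 456) * 2?"))  # [calculate] 1158
```

LangChain's AgentExecutor implements this same loop, with the LLM making the decision and the loop repeating until the model decides no more tool calls are needed.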

Code Analysis

Let's break down the key parts:

@tool decorator

@tool
def calculate(expression: str) -> str:
    """Calculate a mathematical expression. Input should be a valid Python math expression."""

The @tool decorator converts a regular function into a tool the Agent can use. The docstring is very important because the LLM uses this description to decide when to use this tool.
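For intuition, here is a simplified stand-in for what such a decorator does. This is an illustration, not LangChain's actual code: it registers the function together with its docstring so the model can be shown a catalog of available tools and their descriptions.

```python
# A simplified stand-in for @tool (illustration only, not LangChain's code):
# register each function together with its docstring so the model can be
# shown a catalog of available tools and their descriptions.

TOOL_REGISTRY = {}

def tool(fn):
    TOOL_REGISTRY[fn.__name__] = {
        "func": fn,
        "description": (fn.__doc__ or "").strip(),
    }
    return fn

@tool
def calculate(expression: str) -> str:
    """Calculate a mathematical expression."""
    return str(eval(expression))  # demo only

# This catalog (name + description) is what the LLM "sees" when picking a tool
for name, meta in TOOL_REGISTRY.items():
    print(f"{name}: {meta['description']}")
```

This is why a vague docstring leads to wrong tool choices: the description is the model's only view of what the tool does.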

AgentExecutor

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

AgentExecutor is the Agent's runtime environment, responsible for:

  • Managing the Agent's execution loop
  • Handling tool calls
  • Tracking execution state

Step 3: Build a Practical AI Agent

Now let's build a more practical Agent that integrates multiple tools.

Add Web Search Capability

Create 03_practical_agent.py:

# 03_practical_agent.py
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
import requests
import os

load_dotenv()

# Tool 1: Calculator
@tool
def calculator(expression: str) -> str:
    """
    Calculate a mathematical expression.

    Args:
        expression: A valid Python math expression, e.g., "2 + 2" or "100 / 4"

    Returns:
        String with calculation result
    """
    try:
        # Security check: only allow math operations
        allowed_chars = set('0123456789+-*/.() ')
        if not all(c in allowed_chars for c in expression):
            return "Error: Expression contains disallowed characters"

        result = eval(expression)
        return f"{expression} = {result}"
    except Exception as e:
        return f"Calculation error: {str(e)}"

# Tool 2: Weather query (using free API)
@tool
def get_weather(city: str) -> str:
    """
    Query the weather for a specified city.

    Args:
        city: City name, e.g., "Taipei" or "Tokyo"

    Returns:
        String with weather information
    """
    try:
        # Using wttr.in free weather API
        url = f"https://wttr.in/{city}?format=%C+%t+%h"
        response = requests.get(url, timeout=10)

        if response.status_code == 200:
            return f"Weather in {city}: {response.text.strip()}"
        else:
            return f"Unable to get weather for {city}"
    except Exception as e:
        return f"Error querying weather: {str(e)}"

# Tool 3: Web search (using SerpAPI, requires API Key)
@tool
def web_search(query: str) -> str:
    """
    Search for information on the web.

    Args:
        query: Search keywords

    Returns:
        Search result summary
    """
    api_key = os.getenv("SERPAPI_API_KEY")

    if not api_key:
        return "Search function not configured (requires SERPAPI_API_KEY)"

    try:
        url = "https://serpapi.com/search"
        params = {
            "q": query,
            "api_key": api_key,
            "num": 3  # Only get first 3 results
        }
        response = requests.get(url, params=params, timeout=15)
        data = response.json()

        if "organic_results" in data:
            results = []
            for item in data["organic_results"][:3]:
                title = item.get("title", "")
                snippet = item.get("snippet", "")
                results.append(f"- {title}: {snippet}")
            return "\n".join(results)
        else:
            return "No relevant results found"
    except Exception as e:
        return f"Search error: {str(e)}"

# Tool 4: Date and time
@tool
def get_datetime() -> str:
    """Get the current date, time, and day of week."""
    from datetime import datetime
    now = datetime.now()
    weekday = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"][now.weekday()]
    return f"It is now {now.strftime('%Y-%m-%d')} {weekday} {now.strftime('%H:%M:%S')}"

# Initialize LLM
llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0  # lower temperature makes the Agent's tool use more stable and deterministic
)

# Define tool list
tools = [calculator, get_weather, web_search, get_datetime]

# Create Prompt template
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful AI assistant that can use the following tools:
    - calculator: Perform math calculations
    - get_weather: Query weather
    - web_search: Search web information
    - get_datetime: Get current time

    Based on the user's question, determine whether to use tools and which tool to use."""),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create Agent
agent = create_tool_calling_agent(llm, tools, prompt)

# Create executor
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    max_iterations=5,  # Limit max iterations to avoid infinite loops
    handle_parsing_errors=True  # Handle parsing errors
)

def chat(user_input: str) -> str:
    """Function to chat with Agent"""
    try:
        result = agent_executor.invoke({"input": user_input})
        return result["output"]
    except Exception as e:
        return f"Error occurred: {str(e)}"

# Interactive conversation
if __name__ == "__main__":
    print("=" * 50)
    print("AI Agent started! Type 'quit' to exit")
    print("=" * 50)

    while True:
        user_input = input("\nYou: ").strip()

        if user_input.lower() in ['quit', 'exit', 'q']:
            print("Goodbye!")
            break

        if not user_input:
            continue

        print("\nAgent thinking...")
        response = chat(user_input)
        print(f"\nAgent: {response}")

Run this Agent:

python 03_practical_agent.py

Now you can have a conversation with it:

You: What's the weather like in Taipei now?
Agent: Weather in Taipei: Partly cloudy +28°C 70%

You: Calculate for me: if I have $50,000 at 3% annual interest, how much will it be in 5 years?
Agent: 50000 * (1 + 0.03) ** 5 = 57963.70...

You: What day is it today?
Agent: It is now 2025-01-15 Wednesday 14:30:25

Step 4: Add Memory Functionality

So far, our Agent is "memoryless"—each conversation is independent. Adding memory allows the Agent to remember previous conversations.

Create 04_agent_with_memory.py:

# 04_agent_with_memory.py
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

load_dotenv()

# Define tools (simplified version)
@tool
def calculator(expression: str) -> str:
    """Calculate a mathematical expression."""
    try:
        allowed_chars = set('0123456789+-*/.() ')
        if not all(c in allowed_chars for c in expression):
            return "Error: Expression contains disallowed characters"
        result = eval(expression)
        return f"{expression} = {result}"
    except Exception as e:
        return f"Calculation error: {str(e)}"

@tool
def get_datetime() -> str:
    """Get the current date and time."""
    from datetime import datetime
    now = datetime.now()
    return f"It is now {now.strftime('%Y-%m-%d %H:%M')}"

# Initialize LLM
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Tool list
tools = [calculator, get_datetime]

# Create Prompt template with memory
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful AI assistant.
    You can remember our previous conversations."""),
    MessagesPlaceholder(variable_name="chat_history"),  # Conversation history
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

# Create Agent
agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Create memory storage (using session_id to distinguish different conversations)
message_histories = {}

def get_session_history(session_id: str):
    if session_id not in message_histories:
        message_histories[session_id] = ChatMessageHistory()
    return message_histories[session_id]

# Wrap as Agent with memory
agent_with_memory = RunnableWithMessageHistory(
    agent_executor,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

def chat(user_input: str, session_id: str = "default") -> str:
    """Chat with Agent with memory"""
    try:
        result = agent_with_memory.invoke(
            {"input": user_input},
            config={"configurable": {"session_id": session_id}}
        )
        return result["output"]
    except Exception as e:
        return f"Error occurred: {str(e)}"

# Test memory function
if __name__ == "__main__":
    print("=" * 50)
    print("AI Agent with memory started!")
    print("=" * 50)

    # Test memory function
    print("\n--- Testing Memory Function ---")

    print("\nYou: My name is John")
    print(f"Agent: {chat('My name is John')}")

    print("\nYou: My favorite number is 7")
    print(f"Agent: {chat('My favorite number is 7')}")

    # Note: backslash-escaped quotes inside an f-string expression are a
    # SyntaxError before Python 3.12, so put the question in a variable first
    question = "What's my name? What's my favorite number?"
    print(f"\nYou: {question}")
    print(f"Agent: {chat(question)}")

    print("\nYou: Calculate my favorite number times itself: 7 * 7")
    print(f"Agent: {chat('Calculate my favorite number times itself: 7 * 7')}")

After running, you'll find the Agent can remember previous conversation content:

You: My name is John
Agent: Hello, John! Nice to meet you. Is there anything I can help you with?

You: What's my name?
Agent: Your name is John.


Step 5: Error Handling and Best Practices

Error Handling

Create 05_robust_agent.py with complete error handling:

# 05_robust_agent.py
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
import logging

load_dotenv()

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Tool definition with error handling
@tool
def safe_calculator(expression: str) -> str:
    """
    Safe mathematical calculator. Only accepts basic math operations.

    Args:
        expression: Math expression, e.g., "2 + 2"
    """
    logger.info(f"Calculating expression: {expression}")

    # Input validation
    if not expression or len(expression) > 100:
        return "Error: Expression invalid or too long"

    allowed_chars = set('0123456789+-*/.() ')
    if not all(c in allowed_chars for c in expression):
        return "Error: Expression contains disallowed characters. Only numbers and basic operators (+-*/) allowed"

    try:
        result = eval(expression)

        # Result validation
        if isinstance(result, (int, float)):
            return f"Calculation result: {result}"
        else:
            return "Error: Abnormal result type"

    except ZeroDivisionError:
        return "Error: Cannot divide by zero"
    except SyntaxError:
        return "Error: Expression syntax error"
    except Exception as e:
        logger.error(f"Calculation error: {e}")
        return "Unexpected error occurred during calculation"

# Create Agent with retry mechanism
def create_robust_agent():
    llm = ChatOpenAI(
        model="gpt-4o-mini",
        temperature=0,
        max_retries=3,  # API call retry count
        request_timeout=30  # Request timeout
    )

    tools = [safe_calculator]

    prompt = ChatPromptTemplate.from_messages([
        ("system", """You are a careful AI assistant.
        When tools return errors, clearly explain the problem to the user and provide suggestions."""),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ])

    agent = create_tool_calling_agent(llm, tools, prompt)

    agent_executor = AgentExecutor(
        agent=agent,
        tools=tools,
        verbose=True,
        max_iterations=3,  # Limit iteration count
        max_execution_time=60,  # Max execution time (seconds)
        handle_parsing_errors=True,  # Handle parsing errors
        return_intermediate_steps=True,  # Return intermediate steps (for debugging)
    )

    return agent_executor

def safe_chat(agent_executor, user_input: str) -> dict:
    """Safe chat function with complete error handling"""

    # Input validation
    if not user_input or len(user_input) > 1000:
        return {
            "success": False,
            "output": "Input invalid or too long",
            "error": None
        }

    try:
        result = agent_executor.invoke({"input": user_input})
        return {
            "success": True,
            "output": result["output"],
            "steps": result.get("intermediate_steps", [])
        }
    except TimeoutError:
        logger.error("Request timeout")
        return {
            "success": False,
            "output": "Request timeout, please try again later",
            "error": "timeout"
        }
    except Exception as e:
        logger.error(f"Execution error: {e}")
        return {
            "success": False,
            "output": "An error occurred, please try again later",
            "error": str(e)
        }

if __name__ == "__main__":
    agent = create_robust_agent()

    # Test various cases
    test_cases = [
        "Calculate 10 + 20",
        "Calculate 100 / 0",  # Division by zero
        "Calculate abc + 123",  # Invalid input
    ]

    for test in test_cases:
        print(f"\nTest: {test}")
        result = safe_chat(agent, test)
        print(f"Result: {result['output']}")

Best Practices Checklist

1. Tool Design

  • Write clear docstrings for each tool
  • Add input validation to prevent injection attacks
  • Handle all possible error cases
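On the input-validation point: the character whitelist used in the calculator above still feeds eval(), which remains risky. A stricter alternative (a sketch, not part of the tutorial's code) is to parse the expression with Python's ast module and evaluate only arithmetic nodes:

```python
import ast
import operator

# Allow only arithmetic operators; anything else raises an error.
OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate a math expression without eval(), by walking its AST."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("Disallowed expression")
    return walk(ast.parse(expression, mode="eval"))

print(safe_eval("(123 + 456) * 2"))  # 1158
```

With this approach, a malicious input like `__import__('os')` is rejected at the parse-walk stage instead of being executed.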

2. Performance Optimization

  • Use cheaper models for testing (like gpt-4o-mini)
  • Set reasonable timeout and retry counts
  • Limit Agent's maximum iterations

3. Security

  • Don't hardcode API Keys in code
  • Use environment variables or secret management services
  • Limit tool permission scope

4. Monitoring and Debugging

  • Enable verbose mode to observe Agent behavior
  • Log for post-analysis
  • Use LangSmith for advanced monitoring

Step 6: Deploy Your Agent

Create API Service

Use FastAPI to wrap the Agent as an API service:

# api_server.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from dotenv import load_dotenv
# Import your Agent's chat() function here (e.g., the one defined in Step 4)

load_dotenv()

app = FastAPI(title="AI Agent API")

class ChatRequest(BaseModel):
    message: str
    session_id: str = "default"

class ChatResponse(BaseModel):
    response: str
    success: bool

@app.post("/chat", response_model=ChatResponse)
async def chat_endpoint(request: ChatRequest):
    try:
        # Call your Agent
        response = chat(request.message, request.session_id)
        return ChatResponse(response=response, success=True)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/health")
async def health_check():
    return {"status": "healthy"}

Start the service:

pip install fastapi uvicorn
uvicorn api_server:app --reload --port 8000

Docker Deployment

Create Dockerfile:

FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000

CMD ["uvicorn", "api_server:app", "--host", "0.0.0.0", "--port", "8000"]

Create requirements.txt:

langchain
langchain-openai
langchain-community
python-dotenv
fastapi
uvicorn
requests

For deeper understanding of different framework technical details, refer to AI Agent Frameworks Deep Dive.

Complete Project Structure

The final project structure should look like this:

ai-agent-tutorial/
├── .env                    # Environment variables (don't commit to git)
├── .gitignore
├── requirements.txt
├── Dockerfile
├── test_setup.py           # Environment test
├── 01_simple_chat.py       # Basic conversation
├── 02_agent_with_tool.py   # Adding tools
├── 03_practical_agent.py   # Practical Agent
├── 04_agent_with_memory.py # Agent with memory
├── 05_robust_agent.py      # Error handling
└── api_server.py           # API service

Next Steps for Learning

After completing this tutorial, you have the ability to build basic AI Agents. Here are directions for advanced learning:

1. Deep Dive into LangChain

  • Learn about Chain and LCEL (LangChain Expression Language)
  • Learn LangGraph for handling complex multi-step workflows
  • Use LangSmith for monitoring and debugging

2. Explore Other Frameworks

3. No-Code Solutions

4. Enterprise Applications

Congratulations on completing your first AI Agent! Continuous practice is the best way to learn—try modifying the code, adding new tools, and building your own AI assistant.


Frequently Asked Questions

Why does my Agent keep calling the wrong tool?

This is usually because the tool's docstring description isn't clear enough. The LLM decides which tool to use based on the description, so make sure each tool's function description is clear and won't be confused with other tools. You can also add more guidance in the system prompt.

How much do API costs run?

Using the GPT-4o-mini model, typical development testing costs about $1-5 per day. After going live, it depends on usage. We recommend using cheaper models during development, then switching to better-performing models after confirming functionality.
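You can estimate your own costs by multiplying token counts by per-token prices. The rates and request volumes below are illustrative assumptions; check OpenAI's pricing page for current numbers:

```python
# Illustrative per-token prices in USD per 1M tokens; check OpenAI's
# pricing page for current numbers before relying on these.
PRICES = {"gpt-4o-mini": {"input": 0.15, "output": 0.60}}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a given token volume."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# e.g. 200 test runs a day, ~1,500 input and ~300 output tokens each
daily = estimate_cost("gpt-4o-mini", 200 * 1_500, 200 * 300)
print(f"~${daily:.3f} per day")
```

Note that Agents often make several LLM calls per user request (one per tool-use step), so multiply accordingly.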

How do I handle Agent hallucination issues?

Several methods: (1) Explicitly tell the Agent in the prompt "say you don't know if uncertain" (2) Require the Agent to answer only based on tool-returned results (3) Add fact-checking tools (4) Lower the temperature parameter for more stable output.

Can I use other LLMs?

Yes. LangChain supports multiple LLMs, including Anthropic Claude, Google Gemini, open-source models (like Llama), etc. You just need to change the corresponding package and initialization method. Different models have varying tool-calling capabilities and require testing.

How do I let the Agent access local files or databases?

Create corresponding tool functions. For example, create a tool for reading CSV files, a tool for querying SQLite databases. Pay attention to security—limit accessible paths and permissions to prevent the Agent from accessing sensitive data.
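As a sketch of that advice, here is a read-only SQLite query tool that restricts both the path and the statement type. The data/ directory and function name are assumptions for illustration; wrap the function with @tool to use it in the Agent:

```python
import sqlite3
from pathlib import Path

ALLOWED_DIR = Path("data")  # assumed sandbox directory for Agent-readable files

def query_database(db_path: str, sql: str) -> str:
    """Run a read-only SELECT against a SQLite file inside ALLOWED_DIR."""
    path = Path(db_path).resolve()
    # Path restriction: refuse anything outside the sandbox directory
    if ALLOWED_DIR.resolve() not in path.parents:
        return "Error: access outside the data directory is not allowed"
    # Permission restriction: SELECT statements only
    if not sql.lstrip().lower().startswith("select"):
        return "Error: only SELECT statements are allowed"
    try:
        # mode=ro opens the database read-only as a second line of defense
        with sqlite3.connect(f"file:{path}?mode=ro", uri=True) as conn:
            rows = conn.execute(sql).fetchmany(20)  # cap the result size
        return "\n".join(str(row) for row in rows) or "No rows found"
    except sqlite3.Error as e:
        return f"Query error: {e}"
```

The same layered pattern (path whitelist, operation whitelist, read-only connection, result cap) applies to file-reading tools as well.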
