
AI Agents

AI Agents are autonomous systems that perceive their environment, reason about goals, and take actions — repeatedly — until a task is complete. They represent the leap from language models that answer questions to systems that get things done.

- 2023: the year autonomous agents emerged
- 60%: share of enterprises projected to be using agents by 2026
- LangChain / AutoGen / CrewAI: leading frameworks

What are AI Agents?

An AI agent is a software system that uses a large language model as its reasoning engine to autonomously pursue goals. Unlike a chatbot that gives a single response, an agent operates in a loop — observing the world, deciding what to do next, executing actions, and iterating until the goal is achieved.

The defining property is agency: the ability to take consequential actions (call APIs, run code, browse the web, write files) rather than just produce text. This makes agents capable of multi-step workflows that previously required humans or complex hand-coded pipelines.

The Sense → Plan → Act Loop

Every agent architecture, regardless of framework, reduces to a core loop. The LLM acts as the brain — it interprets observations and decides which tool to invoke next:

User Query → LLM Reasoning → Tool Selection → Tool Execution → Response

The loop repeats — tool results feed back as new observations until the agent decides the task is complete.

Key distinction: A chatbot answers. An agent acts. The difference is tool use, memory, and the ability to plan across multiple steps autonomously.
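The loop above can be sketched in a few lines. This is a toy illustration, not a real framework: `llm_decide` and the single `search` tool are hypothetical stand-ins for an actual model call and a real tool registry.

```python
# Minimal sketch of the sense -> plan -> act loop.
# `llm_decide` and TOOLS are hypothetical stand-ins for a real
# LLM call and real tool integrations.

def llm_decide(goal, observations):
    """Toy 'reasoning engine': picks the next action from what it has seen."""
    if not observations:
        return ("search", goal)          # no info yet -> go gather some
    return ("finish", observations[-1])  # enough info -> produce the answer

TOOLS = {"search": lambda query: f"facts about {query}"}

def run_agent(goal, max_steps=5):
    observations = []
    for _ in range(max_steps):           # step limit guards against infinite loops
        action, arg = llm_decide(goal, observations)
        if action == "finish":
            return arg                   # the agent decides the task is complete
        observations.append(TOOLS[action](arg))  # tool result feeds back in
    return "stopped: step limit reached"

print(run_agent("population of Tokyo"))
```

Note that termination is the agent's own decision; the `max_steps` cap exists precisely because that decision cannot be fully trusted.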

Core Components

A fully capable agent is composed of five interacting subsystems. Understanding each helps you diagnose where agents fail and how to improve them.

👁️ Perception

Reads and interprets inputs from the environment — user messages, tool outputs, file contents, web pages, database results. The quality of perception sets a ceiling on everything else.

🧠 Memory

Short-term: the active context window — conversation history, recent tool results.
Long-term: a vector store (e.g. Chroma, Pinecone) that retrieves relevant memories via semantic search, enabling agents to recall information beyond the context window.
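The retrieval half of long-term memory reduces to nearest-neighbour search over embeddings. A minimal sketch, with hand-made 2-d vectors standing in for real embeddings (a production system would embed text with a model and store vectors in Chroma or Pinecone):

```python
# Toy long-term memory: recall entries by cosine similarity.
import math

class VectorMemory:
    def __init__(self):
        self.entries = []                       # (vector, text) pairs

    def add(self, vector, text):
        self.entries.append((vector, text))

    def recall(self, query_vec, k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
            return dot / norm
        ranked = sorted(self.entries,
                        key=lambda e: cosine(e[0], query_vec), reverse=True)
        return [text for _, text in ranked[:k]]

memory = VectorMemory()
memory.add([1.0, 0.0], "user prefers metric units")
memory.add([0.0, 1.0], "project deadline is Friday")
print(memory.recall([0.9, 0.1]))   # -> ['user prefers metric units']
```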

🗓️ Planning

The reasoning strategy the LLM uses to break down goals and sequence actions. Key techniques include Chain-of-Thought (CoT), ReAct (reason + act interleaved), and Tree-of-Thought (explore multiple branches before committing).
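To make the planning distinction concrete, here is a sketch of the plan-and-execute pattern: a planner decomposes the goal up front, then an executor runs each step. Both functions are hypothetical stand-ins for LLM calls:

```python
# Sketch of plan-and-execute: plan the whole task list first, then run it.
# `plan` and `execute` stand in for planner and executor LLM calls.

def plan(goal):
    # A real planner LLM would decompose the goal; hard-coded here.
    return [f"research {goal}", f"summarise findings on {goal}"]

def execute(step):
    # A real executor would call tools; here it just marks the step done.
    return f"done: {step}"

def plan_and_execute(goal):
    return [execute(step) for step in plan(goal)]

print(plan_and_execute("Acme Corp competitors"))
```

A ReAct agent would instead interleave planning and execution, deciding each step only after seeing the previous observation.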

⚡ Action

The set of tools the agent can invoke: REST API calls, Python code execution, web search, file read/write, database queries, browser control, email, calendar. The richer the action space, the more capable — and more risky — the agent.

🔄 Reflection

A self-critique loop where the agent evaluates its own outputs against the goal. If the result falls short, it revises its plan and tries again. Reflection dramatically improves accuracy on complex tasks but increases latency and cost.
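The reflection pattern is a generate/critique/revise loop. In this sketch, `generate` and `critique` are hypothetical stand-ins for LLM calls; a real critic would ask the model to score the draft against the goal:

```python
# Sketch of a reflection loop: generate -> critique -> revise, within a budget.

def generate(task, feedback=None):
    draft = f"answer to {task}"
    if feedback:
        draft += " (revised)"       # a real agent would incorporate the critique
    return draft

def critique(draft):
    # Toy critic: accepts only revised drafts. A real one is another LLM call.
    return "revised" in draft

def reflect_loop(task, max_rounds=3):
    feedback = None
    for _ in range(max_rounds):     # bounded rounds keep latency and cost in check
        draft = generate(task, feedback)
        if critique(draft):
            return draft
        feedback = "try again"
    return draft                    # return best effort once the budget runs out
```

The `max_rounds` budget is the practical answer to reflection's cost: each round adds at least two model calls.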

Agent Types

Agents are categorised by their planning strategy and how they coordinate work. Each type has distinct tradeoffs between reliability, capability, and cost.

| Type | Description | Example | Best For |
|------|-------------|---------|----------|
| ReAct Agent | Interleaves reasoning (Thought) and action (Act) steps in a single loop. The LLM decides each next action based on accumulated observations. | LangChain ReAct agent answering research questions | Tasks that need dynamic tool selection; good general-purpose baseline |
| Plan-and-Execute | Separates planning from execution — a planner LLM produces a full task list, then an executor runs each step. More deterministic than ReAct. | LangChain Plan-and-Execute; BabyAGI | Multi-step workflows where the structure is known upfront |
| Multi-Agent | Multiple specialised agents collaborate — an orchestrator delegates subtasks to worker agents, collecting and combining their outputs. | CrewAI with Researcher + Writer + Editor agents | Complex tasks that benefit from specialisation and parallel execution |
| AutoGPT-style | Fully autonomous agents given a high-level goal and left to run indefinitely. The agent manages its own memory, spawns sub-tasks, and self-evaluates. | AutoGPT, AgentGPT | Open-ended exploration tasks; research on agent capabilities |
| Tool-calling Agent | Uses the model's native function-calling / tool-use API. The LLM outputs structured JSON to invoke tools rather than parsing free text. | OpenAI Assistants API; Anthropic tool use | Production systems — more reliable and easier to parse than free-text ReAct |
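The structured-JSON point in the tool-calling row is worth seeing concretely. The sketch below uses a generic schema shape; the exact field names vary by provider, and the model reply here is simulated rather than produced by a real API call:

```python
# Sketch of a native tool-calling exchange: the tool is described with a
# JSON schema, and the model replies with structured JSON instead of free text.
import json

tool_spec = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# A model response containing a structured tool call (simulated here):
model_reply = '{"tool": "get_weather", "arguments": {"city": "Tokyo"}}'

call = json.loads(model_reply)      # no fragile free-text parsing needed
assert call["tool"] == tool_spec["name"]
print(call["arguments"]["city"])    # -> Tokyo
```

This is why tool-calling agents are preferred in production: a malformed free-text ReAct step fails silently or needs retry logic, while malformed JSON fails loudly at the parse step.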

Top Frameworks

The agent framework ecosystem matured rapidly in 2023–2025. Choose based on your use case: single-agent simplicity, multi-agent orchestration, or data-intensive retrieval workflows.

🦜 LangChain

The most popular framework. Provides chains, agents, and memory abstractions. Excellent for RAG (retrieval-augmented generation) combined with agents. Large ecosystem of integrations.

🤖 AutoGen

Microsoft Research's framework for multi-agent conversations. Agents communicate via messages, enabling complex collaborative workflows. Strong support for code-executing agents.

⚓ CrewAI

Role-based multi-agent framework. You define agents with explicit roles, goals, and backstories. A crew of agents divides tasks like a team. Intuitive abstraction for business workflows.

🦙 LlamaIndex

Data-focused agent framework built around indexing and retrieval. Best when agents need to reason over large document corpora. Deep integration with vector stores and structured data.

🔗 MCP

Model Context Protocol by Anthropic. An open standard for connecting LLMs to external tools and data sources via a uniform protocol. Enables agents to discover and call tools dynamically without custom integration code.

🪟 Semantic Kernel

Microsoft's SDK for integrating LLMs into .NET, Python, and Java applications. Strong enterprise focus with plugin architecture, memory, and planner components. Works with Azure OpenAI and Anthropic models.

Build Your First Agent

Before writing code, map out these four decisions. Getting them right saves significant debugging time:

1. Define the Goal

Write a precise task description. Vague goals lead to agents that loop forever or hallucinate completion. Good goal: "Research the top 5 competitors of Acme Corp and produce a comparison table."

2. Choose Your Tools

Start minimal — only add tools the agent actually needs. Each tool increases the action space and the chance of misuse. Common starter set: web search + code interpreter + file read/write.

3. Design the Memory

Decide what persists between runs. For short tasks, the context window is enough. For long-running agents, add a vector store and decide what gets embedded (tool results, summaries, facts).

4. Add Guardrails

Set a maximum step count to prevent infinite loops. Add confirmation checkpoints for irreversible actions. Log every step for debugging. Test on sandboxed environments before giving real-world access.
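Two of those guardrails, the step limit and the confirmation checkpoint, can be sketched as a wrapper around the action loop. Everything here is illustrative: the `DESTRUCTIVE` set, the action names, and the `confirm` callback (which in practice might be a human prompt or an approval workflow) are hypothetical:

```python
# Sketch of two guardrails: a hard step limit and a confirmation gate
# for irreversible actions. Every step is logged for debugging.

DESTRUCTIVE = {"delete_file", "send_email"}   # actions needing confirmation

def guarded_run(steps, confirm, max_steps=10):
    log = []
    for i, (action, arg) in enumerate(steps):
        if i >= max_steps:
            log.append("aborted: step limit")
            break
        if action in DESTRUCTIVE and not confirm(action, arg):
            log.append(f"blocked: {action}")
            continue
        log.append(f"ran: {action}({arg})")   # real code would execute the tool
    return log

steps = [("web_search", "acme corp"), ("delete_file", "/tmp/x")]
print(guarded_run(steps, confirm=lambda action, arg: False))
# -> ['ran: web_search(acme corp)', 'blocked: delete_file']
```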

⚠️ Safety first: Agents with real-world tool access (email, file system, APIs) can cause real damage. Always run untested agents in sandboxed environments, implement step limits, and require human confirmation for destructive actions.

Code Example — LangChain Agent with Tools

A minimal ReAct agent using LangChain with web search and a calculator. The agent autonomously decides which tool to call based on the query.

Python
# pip install langchain langchain-openai langchain-community langchainhub duckduckgo-search
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_openai import ChatOpenAI
from langchain import hub
import os

# --------------------------------------------------
# 1. Configure the LLM
# --------------------------------------------------
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
    api_key=os.environ["OPENAI_API_KEY"],
)

# --------------------------------------------------
# 2. Define tools the agent can use
# --------------------------------------------------
search = DuckDuckGoSearchRun()

def calculator(expression: str) -> str:
    """Evaluate a mathematical expression safely."""
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {e}"

tools = [
    Tool(
        name="web_search",
        func=search.run,
        description="Search the web for current information. Input should be a search query.",
    ),
    Tool(
        name="calculator",
        func=calculator,
        description="Evaluate mathematical expressions. Input: a Python math expression like '2 ** 10' or '100 / 4'.",
    ),
]

# --------------------------------------------------
# 3. Pull the standard ReAct prompt from LangChain Hub
#    and create the agent
# --------------------------------------------------
prompt = hub.pull("hwchase17/react")

agent = create_react_agent(llm=llm, tools=tools, prompt=prompt)

executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,      # prints Thought / Action / Observation steps
    max_iterations=8,   # prevent infinite loops
    handle_parsing_errors=True,
)

# --------------------------------------------------
# 4. Run the agent
# --------------------------------------------------
query = "What is the current population of Tokyo, and what is that number divided by 37?"

result = executor.invoke({"input": query})

print("\n=== Final Answer ===")
print(result["output"])

# Expected agent trace:
# Thought: I need to find Tokyo's population, then divide it by 37.
# Action: web_search
# Action Input: "Tokyo population 2025"
# Observation: Tokyo's population is approximately 13.96 million ...
# Thought: Now I can calculate 13960000 / 37
# Action: calculator
# Action Input: 13960000 / 37
# Observation: Result: 377297.297...
# Final Answer: Tokyo's population is ~13.96 million; divided by 37 ≈ 377,297.