Agent (LLM Agent) — FDE@ProdAI Blog

Definition

An LLM agent is a system where an LLM acts as the reasoning engine (the "brain") that dynamically plans actions, selects tools to use, executes those tools, observes results, and continues reasoning until a goal is achieved. Unlike workflows (fixed scripts), agents decide their own next steps based on the current state.

The Core Agent Loop

[Goal] → [LLM: think + decide action] → [Execute action/tool] → [Observe result]
                     ↑                                                 |
                     └──────────── [Continue until done] ←─────────────┘

This is called the ReAct loop (Reason + Act):

1. Reason: LLM thinks about what to do next

2. Act: Execute a tool or action

3. Observe: Receive the result

4. Repeat until the goal is reached or a termination condition is met
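The four steps above can be sketched as a small loop. This is a minimal illustration, not a production design: `fake_llm`, the `TOOLS` registry, and the `"finish"` action name are all assumptions made up for the example, and a real agent would replace `fake_llm` with an actual model call.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> callable taking one string.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda query: "Paris: 18°C, partly cloudy",  # canned result
}

def fake_llm(goal: str, history: list[tuple[str, str, str]]) -> dict:
    """Stand-in for a real model call: search once, then finish."""
    if not history:
        return {"thought": "I need to look this up.",
                "action": "search", "input": goal}
    last_observation = history[-1][2]
    return {"thought": "I have the answer.",
            "action": "finish", "input": last_observation}

def run_agent(goal: str, llm=fake_llm, max_steps: int = 5) -> str:
    history: list[tuple[str, str, str]] = []   # (action, input, observation)
    for _ in range(max_steps):
        decision = llm(goal, history)           # 1. Reason
        if decision["action"] == "finish":      # goal reached
            return decision["input"]
        tool = TOOLS[decision["action"]]
        observation = tool(decision["input"])   # 2. Act
        history.append(                         # 3. Observe
            (decision["action"], decision["input"], observation)
        )
    return "Stopped: step limit reached"        # termination condition

answer = run_agent("Paris weather today")
```

The `max_steps` cap is the termination condition from step 4: without it, a model that never emits `"finish"` would loop forever.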

Agent Components

1. The LLM (Brain)

Receives: goal + current state + available tools + history
Produces: reasoning + next action to take
Examples: GPT-4o, Claude 3.5 Sonnet, Llama 3.1 (with tool use)

2. Tools

External capabilities the agent can invoke:

| Tool Type | Examples |
|-----------|---------|
| Web search | Bing API, SerpAPI, Brave Search |
| Code execution | Python REPL, Jupyter |
| File I/O | Read/Write files, directory listing |
| Database | SQL query, vector DB search |
| APIs | Weather, stock prices, CRM |
| Browsers | Playwright, Selenium for web automation |
| Memory | Vector DB for long-term memory |
| Other LLMs | Spawning sub-agents |

3. Memory

Short-term: the current conversation / context window (active state)
Long-term: external storage (vector DB) persisted across sessions
Episodic: memory of past actions/outcomes
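One way to picture the three memory types is as three containers with different lifetimes. The sketch below is an assumption-heavy illustration: the `AgentMemory` class and its method names are hypothetical, and the long-term store uses simple word overlap as a stand-in for the embedding similarity search a vector DB would provide.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    # Short-term: only the most recent turns fit in the context window.
    short_term: deque = field(default_factory=lambda: deque(maxlen=3))
    # Long-term: stand-in for a vector DB (a plain list, searched by word overlap).
    long_term: list = field(default_factory=list)
    # Episodic: log of (action, outcome) pairs from past steps.
    episodes: list = field(default_factory=list)

    def observe(self, message: str) -> None:
        self.short_term.append(message)   # oldest turn is dropped when full

    def remember(self, fact: str) -> None:
        self.long_term.append(fact)

    def record_episode(self, action: str, outcome: str) -> None:
        self.episodes.append((action, outcome))

    def recall(self, query: str, k: int = 1) -> list:
        """Return the k stored facts sharing the most words with the query."""
        words = set(query.lower().split())
        ranked = sorted(
            self.long_term,
            key=lambda fact: len(words & set(fact.lower().split())),
            reverse=True,
        )
        return ranked[:k]

memory = AgentMemory()
for i in range(5):
    memory.observe(f"turn {i}")
memory.remember("user prefers metric units")
memory.remember("user lives in Paris")
memory.record_episode("search", "found weather data")
```

After five turns the short-term buffer holds only the last three, while the long-term facts remain available through `recall`.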

4. Planner

Some agents have an explicit planning step before acting
Plan → decompose into sub-tasks → execute sub-tasks

Agent Architectures

ReAct (Reason + Act)

Thought: I need to find the current weather in Paris.

Action: search("Paris weather today")

Observation: "Paris: 18°C, partly cloudy"

Thought: I have the answer.

Final Answer: It is currently 18°C and partly cloudy in Paris.
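When the model emits a text trace like the one above, the surrounding code has to extract the action before it can run anything. A minimal sketch, assuming the exact `Action: tool("argument")` and `Final Answer:` line formats shown in the trace (the regular expressions and `parse_step` are illustrative, not part of any particular framework):

```python
import re

# Matches lines such as: Action: search("Paris weather today")
ACTION_RE = re.compile(r'^Action:\s*(\w+)\("(.*)"\)\s*$', re.MULTILINE)
# Matches lines such as: Final Answer: It is currently 18°C ...
FINAL_RE = re.compile(r'^Final Answer:\s*(.+)$', re.MULTILINE)

def parse_step(model_output: str) -> tuple:
    """Return ("final", answer), ("action", tool, argument), or ("invalid", text)."""
    final = FINAL_RE.search(model_output)
    if final:
        return ("final", final.group(1).strip())
    action = ACTION_RE.search(model_output)
    if action:
        return ("action", action.group(1), action.group(2))
    return ("invalid", model_output)

step = parse_step(
    "Thought: I need to find the current weather in Paris.\n"
    'Action: search("Paris weather today")'
)
done = parse_step(
    "Thought: I have the answer.\n"
    "Final Answer: It is currently 18°C and partly cloudy in Paris."
)
```

Returning an explicit `"invalid"` case lets the loop feed a format error back to the model instead of crashing on malformed output.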

Plan-and-Execute

1. Plan: generate a full plan of steps upfront

2. Execute: execute each step (with another LLM call or tool)

3. Replan: if needed, update the plan based on results
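The plan, execute, replan cycle can be sketched with stub functions. Everything here is hypothetical: `make_plan`, `execute_step`, and `replan` stand in for LLM or tool calls, and the failure rule (a draft fails without an outline) is invented only to trigger a replan.

```python
def make_plan(goal: str) -> list[str]:
    """Stand-in planner: a real system would ask an LLM for these steps."""
    return [f"research {goal}", f"draft {goal}", f"review {goal}"]

def execute_step(step: str) -> tuple[bool, str]:
    """Stand-in executor: fails any 'draft' step that has no outline yet."""
    if step.startswith("draft") and "outline" not in step:
        return False, "no outline available"
    return True, f"done: {step}"

def replan(failed_step: str, remaining: list[str]) -> list[str]:
    """Stand-in replanner: add an outline step, then retry the draft."""
    return [f"outline for {failed_step}", f"{failed_step} from outline"] + remaining

def plan_and_execute(goal: str, max_replans: int = 2) -> list[str]:
    plan = make_plan(goal)                # 1. Plan
    results: list[str] = []
    replans = 0
    while plan:
        step = plan.pop(0)
        ok, result = execute_step(step)   # 2. Execute
        if ok:
            results.append(result)
        elif replans < max_replans:
            plan = replan(step, plan)     # 3. Replan
            replans += 1
        else:
            results.append(f"gave up: {step}")
    return results

results = plan_and_execute("report")
```

Capping replans with `max_replans` keeps a step that can never succeed from growing the plan indefinitely.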

Reflexion

After an attempt fails, the agent reflects in natural language on what went wrong
It stores that reflection in memory and uses it to adjust the next attempt
It improves over iterations without weight updates
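A toy version of this retry cycle is sketched below. The `attempt`, `evaluate`, and `reflect` functions are hypothetical stand-ins for an LLM actor, a test or scorer, and an LLM self-critique. The part to note is the `reflections` list, which is the only thing that changes between trials.

```python
def attempt(items: list[int], reflections: list[str]) -> list[int]:
    """Stand-in actor: a real agent would include `reflections` in its prompt."""
    if any("sort" in note for note in reflections):
        return sorted(items)
    return list(items)

def evaluate(output: list[int]) -> bool:
    """Stand-in evaluator, e.g. a unit test on the agent's output."""
    return output == sorted(output)

def reflect(output: list[int]) -> str:
    """Stand-in self-critique: a real agent would generate this with an LLM."""
    return "Output was not in ascending order; sort the items before returning."

def reflexion_loop(items: list[int], max_trials: int = 3):
    reflections: list[str] = []   # verbal memory; no model weights change
    for trial in range(1, max_trials + 1):
        output = attempt(items, reflections)
        if evaluate(output):
            return output, trial, reflections
        reflections.append(reflect(output))
    return None, max_trials, reflections

output, trials, notes = reflexion_loop([3, 1, 2])
```

Here the first trial fails, the stored reflection mentions sorting, and the second trial succeeds because the actor reads that note.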

Multi-Agent (Multi-Agent Systems)

Multiple specialized agents collaborate:

Orchestrator Agent
├── Research Agent (searches the web)
├── Code Agent (writes and runs code)
├── Review Agent (checks for errors)
└── Writer Agent (produces final output)
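The hierarchy above could be wired together roughly as follows. This is a simplified sketch: the four agent functions are placeholders that only format strings, and the orchestrator uses a fixed hand-off order, whereas an LLM-driven orchestrator would choose the next agent dynamically.

```python
from typing import Callable

# Hypothetical specialists; each would wrap its own LLM prompt and tools.
def research_agent(task: str) -> str:
    return f"notes on {task}"

def code_agent(notes: str) -> str:
    return f"script based on {notes}"

def review_agent(draft: str) -> str:
    return f"{draft} (reviewed)"

def writer_agent(material: str) -> str:
    return f"Report: {material}"

# Assumed hand-off order for this example.
PIPELINE: list[tuple[str, Callable[[str], str]]] = [
    ("research", research_agent),
    ("code", code_agent),
    ("review", review_agent),
    ("writer", writer_agent),
]

def orchestrate(task: str) -> tuple[str, list[tuple[str, str]]]:
    """Pass the task through each specialist and keep a transcript of hand-offs."""
    transcript: list[tuple[str, str]] = []
    payload = task
    for name, agent in PIPELINE:
        payload = agent(payload)
        transcript.append((name, payload))
    return payload, transcript

final_output, transcript = orchestrate("sorting algorithms")
```

Keeping a transcript of each hand-off makes it easier to see which agent introduced an error.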

Tool Calling / Function Calling

Modern LLMs support structured tool calling:

Model outputs (schematic; exact field names vary by provider):

```json
{
  "tool": "search",
  "parameters": {"query": "latest iPhone price"},
  "reasoning": "I need to find the current price"
}
```

The developer executes the tool and returns the result to the model.

APIs: OpenAI Function Calling, Anthropic Tool Use, Bedrock Tool Use
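On the developer side, handling a tool call means parsing the model's JSON, validating it, running the tool, and returning the result. The sketch below assumes the schematic `tool` / `parameters` shape used in this post rather than any provider's real wire format; `TOOL_SCHEMAS`, `TOOL_FUNCTIONS`, and the placeholder `search` function are hypothetical.

```python
import json

# Hypothetical schema registry: tool name -> {parameter name: expected type}.
TOOL_SCHEMAS = {"search": {"query": str}}

def search(query: str) -> str:
    """Placeholder tool; a real one would call a search API."""
    return f"results for {query!r}"

TOOL_FUNCTIONS = {"search": search}

def handle_tool_call(raw: str) -> dict:
    """Parse the model's JSON, validate it, run the tool, and wrap the result."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON: {exc.msg}"}
    if not isinstance(call, dict) or not isinstance(call.get("tool"), str):
        return {"ok": False, "error": "expected an object with a string 'tool' field"}
    schema = TOOL_SCHEMAS.get(call["tool"])
    if schema is None:
        return {"ok": False, "error": f"unknown tool: {call['tool']}"}
    params = call.get("parameters")
    if not isinstance(params, dict):
        return {"ok": False, "error": "'parameters' must be an object"}
    for name, expected_type in schema.items():
        if not isinstance(params.get(name), expected_type):
            return {"ok": False,
                    "error": f"parameter {name!r} must be {expected_type.__name__}"}
    # Pass only the declared parameters so stray keys cannot reach the tool.
    arguments = {name: params[name] for name in schema}
    return {"ok": True, "result": TOOL_FUNCTIONS[call["tool"]](**arguments)}

reply = handle_tool_call(
    '{"tool": "search", "parameters": {"query": "latest iPhone price"}, '
    '"reasoning": "I need to find the current price"}'
)
```

Returning errors as data rather than raising lets the agent loop hand the message back to the model so it can correct the call.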

Agent Execution Frameworks

| Framework | Description |
|-----------|-------------|
| LangChain Agents | ReAct and OpenAI Functions agents |
| LangGraph | Graph-based stateful multi-agent systems |
| AutoGen (Microsoft) | Multi-agent conversation framework |
| CrewAI | Role-based multi-agent teams |
| AutoGen Studio | Visual multi-agent builder |
| Claude Code | Terminal-based coding agent that edits files and runs commands |
| OpenAI Swarm | Lightweight, experimental multi-agent handoff framework |

Agent vs. Workflow Summary

| Dimension | Agent | Workflow |
|-----------|-------|----------|
| Control flow | Model decides | Developer defines |
| Flexibility | High | Low |
| Predictability | Lower | High |
| Failure modes | Can loop, go off-track | Fail at defined points |
| Cost | Variable (unknown # calls) | Predictable |
| Best for | Open-ended tasks | Structured, repeatable tasks |

Challenges and Failure Modes

| Challenge | Description | Mitigation |
|-----------|-------------|-----------|
| Infinite loops | Agent keeps taking actions without terminating | Max step limits |
| Tool misuse | Calls wrong tool or with wrong parameters | Strict tool schemas |
| Hallucinating tool results | Makes up what a tool returned | Require actual execution |
| Lost context | Forgets earlier observations in long tasks | Summarization, memory |
| Cost explosion | Too many LLM calls | Budget limits, step caps |
| Security risks | Agent executes dangerous code | Sandboxing, approval steps |

Agent Reliability Techniques

Verification steps: have the agent check its own work before returning a result
Human-in-the-loop: require approval for irreversible actions
Sandboxed execution: run generated code in an isolated environment
Structured outputs: enforce a JSON schema for tool calls to prevent parsing errors
Step limits: cap the loop (e.g. max_iterations = 20) to prevent runaway agents
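As an illustration of the human-in-the-loop technique, an approval gate can sit between the agent's decision and the tool call. This is a minimal sketch with made-up names: the `IRREVERSIBLE` set, `guarded_execute`, and the `approve` callback are assumptions, and in practice `approve` might prompt a person in a UI or chat channel.

```python
from typing import Callable

# Hypothetical names of actions that cannot be undone.
IRREVERSIBLE = {"delete_file", "send_email"}

def guarded_execute(
    action: str,
    args: dict,
    execute: Callable[[str, dict], str],
    approve: Callable[[str, dict], bool],
) -> str:
    """Run `execute` only if the action is reversible or a human approves it."""
    if action in IRREVERSIBLE and not approve(action, args):
        return "blocked: approval denied"
    return execute(action, args)

executed: list[str] = []

def run_tool(action: str, args: dict) -> str:
    executed.append(action)           # record what actually ran
    return f"ran {action}"

def always_deny(action: str, args: dict) -> bool:
    return False                      # simulates a reviewer rejecting the request

read_result = guarded_execute("read_file", {"path": "notes.txt"}, run_tool, always_deny)
delete_result = guarded_execute("delete_file", {"path": "notes.txt"}, run_tool, always_deny)
```

Reversible actions pass straight through, so the reviewer is only interrupted for the calls that carry real risk.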

Related Concepts

Workflow, Tool Calling, ReAct, Multi-Agent, LangChain, LangGraph, Orchestration, Memory