Agent (LLM Agent)

Definition

An LLM agent is a system where an LLM acts as the reasoning engine (the "brain") that dynamically plans actions, selects tools to use, executes those tools, observes results, and continues reasoning until a goal is achieved. Unlike workflows (fixed scripts), agents decide their own next steps based on the current state.

The Core Agent Loop

```
[Goal] → [LLM: think + decide action] → [Execute action/tool] → [Observe result]
                      ↑                                                 |
                      └────────────── [Continue until done] ←───────────┘
```

This is called the ReAct loop (Reason + Act):

1. Reason: LLM thinks about what to do next

2. Act: Execute a tool or action

3. Observe: Receive the result

4. Repeat until the goal is reached or a termination condition is met
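The four steps above can be sketched as a short loop. This is a minimal illustration, not a production agent: `scripted_llm`, `TOOLS`, and `run_agent` are hypothetical names, and the model call is replaced by a scripted stand-in so the example runs without any API.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> function taking one string argument.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda query: "Paris: 18°C, partly cloudy",  # canned result
}


def scripted_llm(history: list[str]) -> dict:
    """Stand-in for a real model call: picks the next step from the history."""
    if not any(line.startswith("Observation:") for line in history):
        return {"action": "search", "input": "Paris weather today"}
    return {"final_answer": "It is currently 18°C and partly cloudy in Paris."}


def run_agent(goal: str, llm=scripted_llm, max_steps: int = 5) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        decision = llm(history)                    # 1. Reason
        if "final_answer" in decision:             # goal reached
            return decision["final_answer"]
        tool = TOOLS[decision["action"]]
        result = tool(decision["input"])           # 2. Act
        history.append(f"Observation: {result}")   # 3. Observe
    return "Stopped: step limit reached"           # 4. Termination condition


print(run_agent("What is the weather in Paris?"))
```

The `max_steps` cap is the termination condition from step 4: without it, a model that never produces a final answer would loop forever.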

Agent Components

1. The LLM (Brain)

  • Receives: goal + current state + available tools + history
  • Produces: reasoning + next action to take
  • Examples: GPT-4o, Claude 3.5 Sonnet, Llama 3.1 (with tool use)
2. Tools

External capabilities the agent can invoke:

| Tool Type | Examples |
|-----------|----------|
| Web search | Bing API, SerpAPI, Brave Search |
| Code execution | Python REPL, Jupyter |
| File I/O | Read/Write files, directory listing |
| Database | SQL query, vector DB search |
| APIs | Weather, stock prices, CRM |
| Browsers | Playwright, Selenium for web automation |
| Memory | Vector DB for long-term memory |
| Other LLMs | Spawning sub-agents |
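One way to expose such tools to an agent is a small registry that pairs each callable with a description the model can read. The sketch below is illustrative only: `Tool`, `REGISTRY`, `describe_tools`, and `invoke` are hypothetical names, and both tools are stubs rather than real integrations.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    description: str            # shown to the model so it can choose a tool
    run: Callable[[str], str]


def calculator(expression: str) -> str:
    # Toy stand-in for a code-execution tool; only handles "a + b".
    left, right = expression.split("+")
    return str(float(left) + float(right))


def read_file_stub(path: str) -> str:
    # Stand-in for file I/O; returns canned text instead of touching disk.
    return f"<contents of {path}>"


REGISTRY = {t.name: t for t in [
    Tool("calculator", "Add two numbers, e.g. '2 + 3'", calculator),
    Tool("read_file", "Return the contents of a file path", read_file_stub),
]}


def describe_tools() -> str:
    """Text block that would be placed in the model's prompt."""
    return "\n".join(f"- {t.name}: {t.description}" for t in REGISTRY.values())


def invoke(name: str, argument: str) -> str:
    """Run a tool by name; unknown names come back as an error string."""
    if name not in REGISTRY:
        return f"Error: unknown tool '{name}'"
    return REGISTRY[name].run(argument)


print(describe_tools())
print(invoke("calculator", "2 + 3"))
```

Returning an error string for an unknown tool, rather than raising, lets the agent see the mistake as an observation and try again.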

3. Memory

  • Short-term: the current conversation / context window (active state)
  • Long-term: external storage (vector DB) persisted across sessions
  • Episodic: memory of past actions/outcomes
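The three memory types can be pictured as three containers with different lifetimes. The following is a rough sketch under simplifying assumptions: `AgentMemory` is a hypothetical class, the long-term store is a plain dictionary with keyword lookup standing in for a vector DB, and nothing is actually persisted across sessions.

```python
from collections import deque


class AgentMemory:
    def __init__(self, short_term_size: int = 3):
        # Short-term: only the most recent turns fit in the "context window".
        self.short_term = deque(maxlen=short_term_size)
        # Long-term: a dict stands in for an external vector DB.
        self.long_term: dict[str, str] = {}
        # Episodic: a log of (action, outcome) pairs.
        self.episodes: list[tuple[str, str]] = []

    def remember_turn(self, text: str) -> None:
        self.short_term.append(text)   # oldest turn is dropped when full

    def store_fact(self, key: str, value: str) -> None:
        self.long_term[key] = value

    def record_episode(self, action: str, outcome: str) -> None:
        self.episodes.append((action, outcome))

    def recall(self, query: str) -> list[str]:
        # Keyword match instead of embedding similarity, for brevity.
        return [v for k, v in self.long_term.items() if query.lower() in k.lower()]


memory = AgentMemory(short_term_size=2)
for turn in ["a", "b", "c"]:
    memory.remember_turn(turn)
memory.store_fact("user timezone", "UTC+1")
memory.record_episode("search('Paris weather')", "success")
print(list(memory.short_term), memory.recall("timezone"))
```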
4. Planner

  • Some agents have an explicit planning step before acting
  • Plan → decompose into sub-tasks → execute sub-tasks
Agent Architectures

ReAct (Reason + Act)

```
Thought: I need to find the current weather in Paris.
Action: search("Paris weather today")
Observation: "Paris: 18°C, partly cloudy"
Thought: I have the answer.
Final Answer: It is currently 18°C and partly cloudy in Paris.
```
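When the model writes its steps as free text in this format, the surrounding code has to parse each turn to find the action or the final answer. A minimal parser might look like the sketch below; `parse_step` and `ACTION_RE` are hypothetical names, and the pattern only covers the single-string-argument form shown in the trace.

```python
import re

# Matches lines such as: Action: search("Paris weather today")
ACTION_RE = re.compile(r'^Action:\s*(\w+)\("(.*)"\)\s*$')


def parse_step(model_output: str) -> dict:
    """Return the first action or final answer found in a model turn."""
    for line in model_output.splitlines():
        line = line.strip()
        if line.startswith("Final Answer:"):
            return {"type": "final", "text": line[len("Final Answer:"):].strip()}
        match = ACTION_RE.match(line)
        if match:
            return {"type": "action", "tool": match.group(1), "input": match.group(2)}
    # Nothing recognizable: let the caller decide how to recover.
    return {"type": "unparsed", "text": model_output}


turn = 'Thought: I need to find the current weather in Paris.\nAction: search("Paris weather today")'
print(parse_step(turn))
```

Text parsing of this kind is brittle, which is one reason structured tool calling is generally preferred where the model supports it.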

Plan-and-Execute

1. Plan: generate a full plan of steps upfront

2. Execute: run each step (with another LLM call or a tool)

3. Replan: if needed, update the plan based on the results
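The plan, execute, replan cycle can be sketched as follows. All names here (`plan_and_execute`, `demo_planner`, `demo_executor`) are hypothetical, and the planner and executor are hard-coded stand-ins for what would normally be LLM or tool calls.

```python
from typing import Callable


def plan_and_execute(goal: str,
                     planner: Callable[[str, list[str]], list[str]],
                     executor: Callable[[str], tuple[bool, str]],
                     max_replans: int = 2) -> list[str]:
    results: list[str] = []
    steps = planner(goal, results)             # 1. Plan upfront
    replans = 0
    while steps:
        step = steps.pop(0)
        ok, output = executor(step)            # 2. Execute one step
        results.append(f"{step} -> {output}")
        if not ok:
            if replans >= max_replans:
                break                          # give up rather than loop forever
            replans += 1
            steps = planner(goal, results)     # 3. Replan using results so far
    return results


def demo_planner(goal: str, results: list[str]) -> list[str]:
    # Switches strategy once a failure shows up in the results.
    if any("failed" in r for r in results):
        return ["use cached data", "write summary"]
    return ["fetch data", "write summary"]


def demo_executor(step: str) -> tuple[bool, str]:
    if step == "fetch data":
        return False, "failed: network unavailable"
    return True, "done"


for line in plan_and_execute("summarize sales", demo_planner, demo_executor):
    print(line)
```

In this run the first plan fails at "fetch data", so the planner is called again and the remaining steps come from the revised plan.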

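Reflexion

  • After an attempt, especially a failed one, the agent reflects in natural language on what went wrong
  • It keeps that reflection in memory and uses it to adjust its approach on the next attempt
  • Performance improves across attempts without any weight updates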
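A rough sketch of this retry-with-reflection loop is shown below. It is a simplification of the idea rather than the published method: `reflexion_loop` and the `demo_*` functions are hypothetical, and the "model" is a stub that only changes behavior once a reflection is available.

```python
def reflexion_loop(task, attempt, evaluate, reflect, max_trials=3):
    reflections = []                        # verbal memory; no weights change
    for trial in range(1, max_trials + 1):
        answer = attempt(task, reflections)
        ok, feedback = evaluate(answer)
        if ok:
            return answer, trial, reflections
        reflections.append(reflect(answer, feedback))
    return None, max_trials, reflections


def demo_attempt(task, reflections):
    numbers = [3, 1, 2]
    # The stand-in "model" only sorts descending once a reflection exists.
    return sorted(numbers, reverse=bool(reflections))


def demo_evaluate(answer):
    ok = answer == [3, 2, 1]
    return ok, "" if ok else "expected descending order"


def demo_reflect(answer, feedback):
    return f"Last attempt {answer} was wrong: {feedback}. Sort with reverse=True."


answer, trials, notes = reflexion_loop(
    "sort descending", demo_attempt, demo_evaluate, demo_reflect
)
print(answer, trials, notes)
```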
Multi-Agent Systems

Multiple specialized agents collaborate:

```
Orchestrator Agent
├── Research Agent (searches the web)
├── Code Agent (writes and runs code)
├── Review Agent (checks for errors)
└── Writer Agent (produces final output)
```
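The orchestrator pattern can be sketched with plain functions standing in for the specialized agents. Everything here is hypothetical (`AGENTS`, `next_agent`, `orchestrate`), and the routing decision is a simple rule over shared state; a real orchestrator would typically ask an LLM which agent to call next.

```python
from typing import Optional

# Each stand-in agent reads the shared state and returns the keys it adds.
AGENTS = {
    "research": lambda s: {"notes": f"notes on {s['task']}"},
    "code": lambda s: {"code": f"code using {s['notes']}"},
    "review": lambda s: {"review": "no errors found"},
    "writer": lambda s: {"report": f"Report: {s['code']} ({s['review']})"},
}

# Which state key each agent is responsible for producing.
PRODUCES = [("notes", "research"), ("code", "code"),
            ("review", "review"), ("report", "writer")]


def next_agent(state: dict) -> Optional[str]:
    """Stand-in for the orchestrator's decision: first missing artifact wins."""
    for key, agent_name in PRODUCES:
        if key not in state:
            return agent_name
    return None  # nothing left to do


def orchestrate(task: str):
    state = {"task": task}
    trace = []
    while (name := next_agent(state)) is not None:
        state.update(AGENTS[name](state))   # delegate to the chosen agent
        trace.append(name)
    return state["report"], trace


report, trace = orchestrate("prime sieve")
print(trace)
print(report)
```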

Tool Calling / Function Calling

Modern LLMs support structured tool calling. The model emits a structured request such as the one below (the field names are illustrative and vary by provider):

```json
{
  "tool": "search",
  "parameters": {"query": "latest iPhone price"},
  "reasoning": "I need to find the current price"
}
```

The developer's code executes the tool and returns the result to the model.

APIs: OpenAI Function Calling, Anthropic Tool Use, Amazon Bedrock Tool Use
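On the developer side, a request in the illustrative shape above has to be parsed, validated, and dispatched before anything runs. The sketch below assumes that shape (`tool` plus `parameters`); it is not any provider's actual wire format, and `TOOL_SCHEMAS`, `TOOL_FUNCS`, and `handle_tool_call` are hypothetical names.

```python
import json

# Hypothetical schema: tool name -> required parameter names and types.
TOOL_SCHEMAS = {"search": {"query": str}}


def search(query: str) -> str:
    return f"results for '{query}'"   # stub; a real tool would call an API


TOOL_FUNCS = {"search": search}


def handle_tool_call(raw: str) -> dict:
    """Validate a model-emitted tool call and execute it if it is well formed."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON: {exc.msg}"}
    if not isinstance(call, dict):
        return {"ok": False, "error": "tool call must be a JSON object"}
    name = call.get("tool")
    if name not in TOOL_SCHEMAS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    params = call.get("parameters")
    if not isinstance(params, dict):
        return {"ok": False, "error": "parameters must be an object"}
    schema = TOOL_SCHEMAS[name]
    for key, expected_type in schema.items():
        if not isinstance(params.get(key), expected_type):
            return {"ok": False, "error": f"bad or missing parameter: {key}"}
    # Pass only the parameters named in the schema to the tool.
    args = {key: params[key] for key in schema}
    return {"ok": True, "result": TOOL_FUNCS[name](**args)}


raw = '{"tool": "search", "parameters": {"query": "latest iPhone price"}}'
print(handle_tool_call(raw))
```

Errors are returned as data rather than raised so they can be fed back to the model as an observation, giving it a chance to correct the call.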

Agent Execution Frameworks

| Framework | Description |
|-----------|-------------|
| LangChain Agents | ReAct and OpenAI Functions agents |
| LangGraph | Graph-based stateful multi-agent systems |
| AutoGen (Microsoft) | Multi-agent conversation framework |
| CrewAI | Role-based multi-agent teams |
| AutoGen Studio | Visual multi-agent builder |
| Claude Code | Developer agent with full tool access |
| OpenAI Swarm | Lightweight, experimental multi-agent handoff framework |

Agent vs. Workflow Summary

| Dimension | Agent | Workflow |
|-----------|-------|----------|
| Control flow | Model decides | Developer defines |
| Flexibility | High | Low |
| Predictability | Lower | High |
| Failure modes | Can loop, go off-track | Fail at defined points |
| Cost | Variable (unknown # calls) | Predictable |
| Best for | Open-ended tasks | Structured, repeatable tasks |

Challenges and Failure Modes

| Challenge | Description | Mitigation |
|-----------|-------------|------------|
| Infinite loops | Agent keeps taking actions without terminating | Max step limits |
| Tool misuse | Calls wrong tool or with wrong parameters | Strict tool schemas |
| Hallucinating tool results | Makes up what a tool returned | Require actual execution |
| Lost context | Forgets earlier observations in long tasks | Summarization, memory |
| Cost explosion | Too many LLM calls | Budget limits, step caps |
| Security risks | Agent executes dangerous code | Sandboxing, approval steps |

Agent Reliability Techniques

  • Verification steps: have the agent check its own work
  • Human-in-the-loop: require approval for irreversible actions
  • Sandboxed execution: run code in an isolated environment
  • Structured outputs: enforce JSON for tool calls to prevent parsing errors
  • Step limits: cap iterations (e.g. max_iterations = 20) to prevent runaway agents
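Several of these techniques reduce to a few checks wrapped around the agent loop. The sketch below combines a step limit, a cost budget, and an approval gate for irreversible actions. `Guardrails` and `BudgetExceeded` are hypothetical names, the cost figures are made up for illustration, and the default approval callback simply denies everything.

```python
class BudgetExceeded(Exception):
    """Raised when the agent exceeds its step or cost limit."""


class Guardrails:
    def __init__(self, max_iterations=20, max_cost_usd=1.00,
                 approve=lambda action: False):
        self.max_iterations = max_iterations
        self.max_cost_usd = max_cost_usd
        self.approve = approve          # human-in-the-loop callback
        self.iterations = 0
        self.cost_usd = 0.0

    def check_step(self, step_cost_usd: float) -> None:
        """Call once per loop iteration, before running the step."""
        self.iterations += 1
        self.cost_usd += step_cost_usd
        if self.iterations > self.max_iterations:
            raise BudgetExceeded("step limit reached")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded("cost budget exceeded")

    def allow(self, action: str, irreversible: bool) -> bool:
        # Reversible actions pass; irreversible ones need explicit approval.
        return self.approve(action) if irreversible else True


guard = Guardrails(max_iterations=2)
guard.check_step(0.10)
guard.check_step(0.10)
try:
    guard.check_step(0.10)
except BudgetExceeded as exc:
    print(f"agent halted: {exc}")
print(guard.allow("delete database", irreversible=True))
```

Denying by default means that a missing or misconfigured approval hook fails safe instead of silently letting destructive actions through.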
Related Concepts

  • Workflow, Tool Calling, ReAct, Multi-Agent, LangChain, LangGraph, Orchestration, Memory

Go Deeper With Live Instruction

This topic is covered in depth in our LLM engineering program (Session 7).