
Core Concepts

This guide explains the fundamental concepts behind AutoGen: what agents are, how they work together in teams, how they use tools and models, and how the layered architecture provides flexibility for different use cases.

What is an Agent?

An agent is a software entity that:
  • Communicates via messages
  • Maintains its own state
  • Performs actions in response to messages
  • Can modify its state and produce external effects
# A simple assistant agent
agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
    system_message="You are a helpful assistant."
)

# Run the agent with a task
result = await agent.run(task="Explain quantum computing")
Agents in AutoGen are built on the Actor model, where each agent:
  • Processes messages independently
  • Maintains isolated state
  • Communicates only through message passing
  • Can create new agents or send messages to others
Think of agents as autonomous workers with specialized roles. They don’t share memory directly but collaborate by exchanging messages.
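The Actor-model properties above can be pictured with a toy actor in plain Python. This is an illustrative sketch, not AutoGen's runtime; the `ToyActor` class and message format are invented for the example:

```python
from collections import deque

class ToyActor:
    """A minimal actor: private state, a mailbox, message-only communication."""
    def __init__(self, name):
        self.name = name
        self.state = []          # isolated state: no other actor can touch this
        self.mailbox = deque()

    def send(self, message):
        """Other actors communicate only by appending to the mailbox."""
        self.mailbox.append(message)

    def process(self):
        """Handle one queued message; state changes only in response to messages."""
        if not self.mailbox:
            return None
        message = self.mailbox.popleft()
        self.state.append(message)
        return f"{self.name} handled: {message}"

a = ToyActor("assistant")
a.send("Explain quantum computing")
reply = a.process()
```

Real AutoGen agents add async message handling, routing, and lifecycle management on top of this basic shape.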

Types of Agents

AutoGen provides several preset agent types in the AgentChat API:

AssistantAgent

LLM-powered agent. Uses a language model to process messages and can call tools. Supports reflection on tool use and streaming.
agent = AssistantAgent(
    name="coder",
    model_client=model_client,
    tools=[code_tool],
    system_message="You write code."
)

CodeExecutorAgent

Safe code execution. Executes Python code in isolated environments (Docker or local). Returns results or errors.
agent = CodeExecutorAgent(
    name="executor",
    code_executor=DockerCommandLineCodeExecutor(),
)

UserProxyAgent

Human-in-the-loop. Represents human users in multi-agent workflows. Can request input or operate autonomously.
agent = UserProxyAgent(
    name="user",
    input_func=input,  # prompt on the console whenever input is requested
)

Custom Agents

Build your own. Implement custom behavior by extending base classes or using the Core API.
class MyAgent(BaseChatAgent):
    async def on_messages(
        self, messages, cancellation_token
    ):
        # Custom logic
        return Response(...)

Agent Characteristics

All agents share these properties:
  • Name: Unique identifier within a team
  • State: Internal data maintained across messages
  • Message Handling: Logic for processing incoming messages
  • Actions: Operations performed in response to messages
Agents can:
  • Send messages to other agents
  • Call tools to interact with external systems
  • Generate responses using LLMs
  • Execute code or make API calls
  • Maintain conversation history

Multi-Agent Teams

A team is a group of agents working together toward a common goal. Teams implement multi-agent design patterns through coordinated message passing.

Why Teams?

Each agent handles a specific responsibility:
  • One agent writes code
  • Another reviews it
  • A third executes it
  • A fourth summarizes results
This is clearer than one agent doing everything.
Different agents can use:
  • Different models (GPT-4 for reasoning, GPT-4o-mini for simple tasks)
  • Different tools (one has web access, another has database access)
  • Different instructions specialized for their role
Agents can review each other’s work:
  • A writer agent creates content
  • A critic agent provides feedback
  • They iterate until quality is acceptable
This often produces better results than a single agent.
Multiple agents can work simultaneously:
  • Research agent gathers information
  • Analysis agent processes data
  • Visualization agent creates charts
Then results are combined by an orchestrator.

Team Types

AutoGen provides several preset team patterns:

RoundRobinGroupChat

Agents take turns in a fixed order. Simple and predictable.
from autogen_agentchat.teams import RoundRobinGroupChat
from autogen_agentchat.conditions import TextMentionTermination

team = RoundRobinGroupChat(
    participants=[writer, reviewer, editor],
    termination_condition=TextMentionTermination("DONE")
)

result = await team.run(task="Write a blog post about AI")
Use when: You want deterministic turn-taking, like writer → reviewer → editor cycles.
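The round-robin pattern itself is easy to see outside the framework. A toy version (illustrative only, not AutoGen internals) cycles through participants in fixed order until one of them mentions a stop word, mimicking `TextMentionTermination`:

```python
def round_robin(participants, task, stop_word="DONE", max_turns=10):
    """Cycle through participant callables in fixed order until one
    mentions stop_word (a toy TextMentionTermination)."""
    transcript = [task]
    for turn in range(max_turns):
        speaker = participants[turn % len(participants)]  # fixed order
        message = speaker(transcript)
        transcript.append(message)
        if stop_word in message:
            break
    return transcript

# Stand-in "agents": callables that read the transcript and reply
writer = lambda t: "draft v" + str(len(t))
reviewer = lambda t: "looks good, DONE" if len(t) >= 3 else "needs work"

log = round_robin([writer, reviewer], "Write a blog post about AI")
```

The real team additionally passes rich message objects, streams events, and supports composable termination conditions.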

Team Patterns

Common multi-agent design patterns:
  • Reflection: Primary agent generates, critic reviews, iterate until approved
  • Hierarchical: Orchestrator delegates to specialist agents
  • Sequential: Agents form a pipeline (research → write → edit → publish)
  • Parallel: Multiple agents work independently, results aggregated
  • Debate: Agents with different perspectives discuss to reach consensus
Start with a single agent and only move to teams when the task genuinely requires collaboration. Teams need more careful prompting and debugging.
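Of these patterns, reflection is the easiest to sketch as a loop. The code below is a toy illustration of the pattern, not AutoGen's API; `toy_writer` and `toy_critic` are invented stand-ins for LLM agents:

```python
def reflect(generate, critique, max_rounds=3):
    """Reflection pattern: generate a draft, request feedback,
    revise until the critic approves or rounds run out."""
    draft = generate(None)                 # initial generation, no feedback yet
    for _ in range(max_rounds):
        feedback = critique(draft)
        if feedback == "APPROVE":          # critic signals acceptance
            break
        draft = generate(feedback)         # revise using the feedback
    return draft

def toy_writer(feedback):
    return "draft" if feedback is None else "draft (revised: " + feedback + ")"

def toy_critic(draft):
    return "APPROVE" if "revised" in draft else "add an example"

final = reflect(toy_writer, toy_critic)
```

In a real team, the same control flow is expressed with two agents and a termination condition that fires on the critic's approval message.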

Tools

Tools allow agents to interact with the external world beyond text generation. An agent with tools can:
  • Call APIs
  • Query databases
  • Execute code
  • Search the web
  • Read/write files
  • Control browsers

Function Tools

The simplest way to add tools is by passing Python functions:
async def get_weather(city: str) -> str:
    """Get current weather for a city.
    
    Args:
        city: Name of the city
        
    Returns:
        Weather description
    """
    # Call weather API
    return f"Weather in {city}: 72°F, sunny"

async def calculate(expression: str) -> float:
    """Evaluate a math expression.
    
    Args:
        expression: Mathematical expression to evaluate
        
    Returns:
        Numerical result
    """
    # Note: eval() executes arbitrary code. Fine for a demo, but restrict
    # input or use a dedicated expression parser in production.
    return eval(expression)

# Agent can use both tools
agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
    tools=[get_weather, calculate],
)
AutoGen automatically generates JSON schemas from function signatures and docstrings. Use type hints and descriptive docstrings for best results.
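The schema-generation step can be approximated with the standard library. This is a rough sketch of the idea, not AutoGen's implementation; the real framework emits the full OpenAI function-calling schema, of which this produces a simplified subset:

```python
import inspect

def tool_schema(func):
    """Build a minimal JSON-schema-like dict from type hints and the docstring."""
    type_map = {str: "string", float: "number", int: "integer", bool: "boolean"}
    sig = inspect.signature(func)
    properties = {
        name: {"type": type_map.get(param.annotation, "string")}
        for name, param in sig.parameters.items()
    }
    return {
        "name": func.__name__,
        # First docstring line becomes the tool description the LLM sees
        "description": (inspect.getdoc(func) or "").split("\n")[0],
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": list(properties),
        },
    }

def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 72°F, sunny"

schema = tool_schema(get_weather)
```

This is why type hints and docstrings matter: they are the only information the model receives about what a tool does and what arguments it takes.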

How Tools Work

  1. Agent receives a task: “What’s the weather in Seattle?”
  2. LLM decides to use a tool: Returns a function call request
  3. Framework executes the tool: Calls get_weather("Seattle")
  4. Result returned to LLM: “Weather in Seattle: 72°F, sunny”
  5. LLM generates response: “The weather in Seattle is currently 72°F and sunny.”
This is called function calling or tool use.
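The five steps above amount to a loop between the model and the framework. A stripped-down sketch with a fake model (no real LLM involved; the message and tool-call formats are invented for illustration):

```python
def fake_model(messages, tools):
    """Stand-in for an LLM: requests a tool call, then summarizes the result."""
    last = messages[-1]
    if last["role"] == "user":
        return {"tool_call": {"name": "get_weather", "args": {"city": "Seattle"}}}
    return {"text": f"The weather report says: {last['content']}"}

def get_weather(city):
    return f"Weather in {city}: 72°F, sunny"

def run_with_tools(task, tools):
    messages = [{"role": "user", "content": task}]        # 1. agent receives task
    reply = fake_model(messages, tools)
    while "tool_call" in reply:                           # 2. model requests a tool
        call = reply["tool_call"]
        result = tools[call["name"]](**call["args"])      # 3. framework executes it
        messages.append({"role": "tool", "content": result})  # 4. result fed back
        reply = fake_model(messages, tools)
    return reply["text"]                                  # 5. final natural-language answer

answer = run_with_tools("What's the weather in Seattle?",
                        {"get_weather": get_weather})
```

The framework's job is exactly this dispatch loop, plus schema generation, error handling, and (optionally) reflection on the tool result.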

Advanced Tool Types

Model Context Protocol servers provide collections of tools:
from autogen_ext.tools.mcp import McpWorkbench, StdioServerParams

# Connect to Playwright MCP server for web browsing
server_params = StdioServerParams(
    command="npx",
    args=["@playwright/mcp@latest", "--headless"]
)

async with McpWorkbench(server_params) as mcp:
    agent = AssistantAgent(
        name="browser",
        model_client=model_client,
        workbench=mcp,  # Agent gets all MCP tools
    )
    
    result = await agent.run(
        task="Go to example.com and get the page title"
    )
Popular MCP servers:
  • @playwright/mcp: Browser automation
  • @modelcontextprotocol/server-filesystem: File operations
  • @modelcontextprotocol/server-postgres: Database queries
Security: Tools can execute arbitrary code or access sensitive systems. Only use trusted tools and validate inputs carefully. MCP servers should only be from trusted sources.

Models

Models are the LLMs that power agent reasoning and text generation. AutoGen uses a model client abstraction to support multiple providers.

Model Clients

All model clients implement the ChatCompletionClient interface:
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(
    model="gpt-4o",
    api_key="sk-...",  # Or use OPENAI_API_KEY env var
    temperature=0.7,
    max_tokens=2000,
)
Supported models:
  • gpt-4o - Latest multimodal model
  • gpt-4o-mini - Faster, cheaper variant
  • gpt-4-turbo - Previous generation
  • o1-preview - Advanced reasoning
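The value of a shared client interface is that agent code depends only on the interface, so providers are swappable. A toy sketch of that idea (the `ChatClient` protocol and both clients are invented for illustration, not the real `ChatCompletionClient` interface):

```python
from typing import Protocol

class ChatClient(Protocol):
    """Minimal stand-in for a model-client interface."""
    def create(self, messages: list[str]) -> str: ...

class EchoClient:
    """Fake provider: echoes the last message."""
    def create(self, messages):
        return "echo: " + messages[-1]

class UpperClient:
    """A second fake provider behind the same interface."""
    def create(self, messages):
        return messages[-1].upper()

def run_agent(client: ChatClient, task: str) -> str:
    # Agent logic never names a concrete provider.
    return client.create([task])

a = run_agent(EchoClient(), "hello")
b = run_agent(UpperClient(), "hello")
```

Swapping OpenAI for Anthropic or a local Ollama model works the same way: only the client construction changes, not the agent.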

Model Capabilities

Different models have different capabilities:
| Feature          | OpenAI GPT-4o | Claude 3.5 Sonnet | Gemini 2.0 Flash | Local (Llama 3.2) |
|------------------|---------------|-------------------|------------------|-------------------|
| Function Calling | ✅            | ✅                | ✅               | ⚠️ Limited        |
| Streaming        | ✅            | ✅                | ✅               | ✅                |
| Vision (Images)  | ✅            | ✅                | ✅               | ✅                |
| JSON Mode        | ✅            | ✅                | ✅               | ⚠️ Varies         |
| Max Context      | 128K          | 200K              | 1M               | Varies            |
Choose models based on your needs:
  • Complex reasoning: GPT-4o, Claude 3.5 Sonnet, o1
  • Speed/cost: GPT-4o-mini, Claude Haiku, Gemini Flash
  • Privacy: Local models (Ollama)
  • Long context: Claude (200K), Gemini (1M)

Layered Architecture

AutoGen’s three-layer design gives you flexibility to choose the right abstraction level:

Layer 1: Core API (Foundation)

Event-driven agent runtime. The foundation layer provides:
  • Message-passing infrastructure
  • Agent lifecycle management
  • Topic-based pub/sub
  • Standalone and distributed runtimes
  • Cross-language support (Python ↔ .NET)
from autogen_core import (
    MessageContext, RoutedAgent, SingleThreadedAgentRuntime,
    TopicId, message_handler,
)

class MyAgent(RoutedAgent):
    @message_handler
    async def handle_message(self, message: str, ctx: MessageContext) -> None:
        # Custom event-driven logic producing `response`
        await self.publish_message(response, topic_id=TopicId("results", "default"))

runtime = SingleThreadedAgentRuntime()
await MyAgent.register(runtime, "my_agent", lambda: MyAgent("My agent"))
runtime.start()
Use when:
  • You need event-driven architecture
  • You want distributed execution across machines
  • You need cross-language agents (Python + .NET)
  • You want full control over message routing
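The topic-based pub/sub mentioned above can be pictured as a routing table from topic names to subscriber callbacks. A toy dispatcher, not the Core runtime (which is async and routes typed messages to registered agents):

```python
class ToyRuntime:
    """Minimal topic-based pub/sub: subscribers register per topic,
    publish fans a message out to every subscriber of that topic."""
    def __init__(self):
        self.subscribers = {}

    def subscribe(self, topic, handler):
        self.subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        # Publishers don't know who is listening: loose coupling
        for handler in self.subscribers.get(topic, []):
            handler(message)

runtime = ToyRuntime()
received = []
runtime.subscribe("results", received.append)
runtime.subscribe("results", lambda m: received.append(m.upper()))
runtime.publish("results", "done")
runtime.publish("other", "ignored")   # no subscribers, silently dropped
```

This decoupling is what lets the same agents run in either the standalone or the distributed runtime.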

Layer 2: AgentChat API (High-Level)

Intuitive defaults for rapid development. Built on the Core API with:
  • Preset agent types (AssistantAgent, CodeExecutorAgent)
  • Team patterns (RoundRobin, Selector, Swarm)
  • Termination conditions
  • Streaming helpers
  • Human-in-the-loop support
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat

# High-level, opinionated API
agent = AssistantAgent(
    name="assistant",
    model_client=model_client,
    tools=[weather_tool],
)

result = await agent.run(task="What's the weather?")
Use when:
  • You’re prototyping quickly
  • You want sensible defaults
  • You’re new to AutoGen
  • Common patterns fit your use case

Layer 3: Extensions API (Ecosystem)

Pluggable components. Extensions provide:
  • Model clients (OpenAI, Anthropic, Azure, etc.)
  • Code executors (Docker, Jupyter)
  • Memory systems (ChromaDB, Redis)
  • Tool integrations (MCP, web browsing)
  • Custom implementations welcome
# Mix and match extensions
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.tools.mcp import McpWorkbench
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

# All extensions work together
Use when:
  • You need specific model providers
  • You want tool integrations
  • You’re building custom components
  • You need production features (memory, caching)

Choosing the Right Layer

Start with AgentChat

Recommended for beginners
  • Quickest path to working agents
  • Best documentation and examples
  • Covers 80% of use cases
  • Easy to drop to Core API later

Drop to Core When Needed

For advanced scenarios
  • Custom message protocols
  • Event-driven patterns
  • Distributed execution
  • Cross-language requirements
Layer Philosophy: Start high (AgentChat), go deep (Core) when needed, extend freely (Extensions). You’re never locked in.

Message Flow

Understanding how messages flow is key to understanding AutoGen:

Message Types

  • TextMessage: Plain text from an agent
  • FunctionCall: Request to execute a tool
  • FunctionExecutionResult: Tool execution result
  • ToolCallSummaryMessage: Summary of tool calls
  • HandoffMessage: Transfer control between agents
  • TaskMessage: Initial task from user
  • StopMessage: Signal termination

Runtime Environments

AutoGen supports two runtime modes: standalone (single-process) and distributed (multi-process, across machines).

Standalone Runtime

Single-process execution. All agents run in the same process. Simple and fast.
from autogen_core import SingleThreadedAgentRuntime

runtime = SingleThreadedAgentRuntime()
await Agent1.register(runtime, "agent1", lambda: Agent1("First agent"))
await Agent2.register(runtime, "agent2", lambda: Agent2("Second agent"))
runtime.start()
Use for:
  • Development and testing
  • Single-machine deployments
  • Simpler debugging
  • Lower latency
Agents work the same way in both runtimes: you can develop with the standalone runtime and deploy with the distributed one without changing agent code.

Next Steps

Now that you understand the concepts:

Build Multi-Agent Teams

Learn team patterns and collaboration

Work with Tools

Add function tools, MCP servers, and code execution

Configure Models

Set up different LLM providers

Explore Core API

Event-driven patterns and distributed runtime

Key Takeaways

  • Agents are autonomous entities that communicate via messages
  • Teams coordinate multiple agents using patterns like RoundRobin or Swarm
  • Tools let agents interact with external systems via function calling
  • Models provide LLM capabilities through a unified client interface
  • Layered architecture gives you flexibility: start simple, go deep when needed
  • Message passing is the foundation of all agent communication
  • Runtimes support both single-process and distributed execution
The best way to learn is by building. Try the Examples to see these concepts in action.